ConflictMetrics Documentation

Code Repositories

Clinical Trials Sponsor Dashboards and Explorer

Source data for the Clinical Trials Sponsorship Dashboards and Sponsorship Network Explorer have been drawn from the following repositories:

  • ClinicalTrials.gov is the US National Institutes of Health's public database of information on certain clinical trials that are required by law to participate in the registry. There are approximately 280,000 registered trials in the database.
  • Drugs@FDA is the US Food and Drug Administration's public-facing database of drug products. The database provides information on most prescription and non-prescription drugs approved by the FDA since 1939.
  • provides data on the most-used drug products drawn from the annual Medical Expenditure Panel Survey conducted by the US Agency for Healthcare Research and Quality.

The Clinical Trials Sponsorship Dashboards provide access to sponsorship data from ClinicalTrials.gov. The Drug Sponsorship Dashboard uses information from Drugs@FDA and the most-used drug products database to provide a sponsorship profile for most of the top 300 prescription drugs. The Disease/Condition Dashboard provides data on over 1,600 of the most researched conditions.

  • The drug list was derived from the database of the top 300 drugs. Most long-approved over-the-counter drugs and vitamins were removed from the dataset. Drugs@FDA was then used to identify all additional trade and generic names for each remaining product.
  • The disease and condition list was derived using a custom instance of the ClinicalTrials.gov database. An initial list of all 65,000+ evaluated conditions was drawn, but only conditions evaluated in more than 25 registered trials were kept. Conditions entered under multiple names (e.g., Type II Diabetes; Type 2 Diabetes; and Diabetes, Type II) were combined before analysis. However, subtypes (e.g., depression vs. major depression) were treated as separate conditions.
  • Drug and disease lists were submitted to the custom ClinicalTrials.gov database to locate all registered trials for each product or condition. For each identified trial, the sponsors and co-sponsors were identified and classified using ClinicalTrials.gov's agency type taxonomy: Industry, U.S. Federal Government or NIH, or Other.
  • Network visualizations were prepared using the igraph package for R. Network plots were created using force-directed layout algorithms: Fruchterman-Reingold was used for plots with fewer than 1,000 nodes, and DrL was used for those with more than 1,000 nodes. Users can experiment with additional layout algorithms using the Sponsorship Network Explorer.
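
The list-building steps above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline (which used a custom ClinicalTrials.gov database and R); the alias table, function names, and input shape are hypothetical and chosen only to show how variant names might be merged, how the 25-trial threshold might be applied, and how a layout might be picked by network size.

```python
from collections import defaultdict

# Hypothetical alias table: maps variant spellings to one canonical name.
# Subtypes (e.g., "major depression") are deliberately NOT merged.
ALIASES = {
    "type ii diabetes": "type 2 diabetes",
    "diabetes, type ii": "type 2 diabetes",
}

MIN_TRIALS = 25  # keep only conditions evaluated in more than 25 trials


def canonical_name(raw):
    """Lower-case, collapse whitespace, then apply the alias table."""
    key = " ".join(raw.lower().split())
    return ALIASES.get(key, key)


def build_condition_list(trial_conditions):
    """Group trials by canonical condition and apply the trial threshold.

    trial_conditions: iterable of (trial_id, condition_name) pairs.
    Returns {canonical condition: set of trial ids}.
    """
    trials_by_condition = defaultdict(set)
    for trial_id, condition in trial_conditions:
        trials_by_condition[canonical_name(condition)].add(trial_id)
    return {
        name: trials
        for name, trials in trials_by_condition.items()
        if len(trials) > MIN_TRIALS
    }


def choose_layout(node_count):
    """Pick a layout name by network size.

    The text does not say what happens at exactly 1,000 nodes;
    this sketch assumes DrL in that case.
    """
    return "Fruchterman-Reingold" if node_count < 1000 else "DrL"
```

In this sketch, trials registered under "Type II Diabetes" and "Diabetes, Type II" are counted together, while "depression" and "major depression" remain separate entries.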

Conflict Network Explorer

Source data for the Conflict Network Explorer come from:

  • PubMed is a free, publicly funded search engine covering over 29 million biomedical and life science research article abstracts, primarily indexed in MEDLINE. It is maintained by the National Center for Biotechnology Information (NCBI) at the NIH's National Library of Medicine (NLM). Starting in March 2017, PubMed began including conflict of interest (COI) statements below article abstracts on its search results pages whenever publishers provide them. This move answers calls from physicians and researchers for greater transparency in COI data. However, not all publishers provide this information, so COI statements are available for only a portion of the articles on PubMed.

The Conflict Network Explorer provides access to conflict of interest data from PubMed.

  • Conflict of interest disclosure statements gathered from PubMed were parsed using a hybrid machine learning/dictionary-based natural language processing approach. The conflict analyzer/classifier identifies individual conflicts in each disclosure statement and assigns a weight (low, medium, or high) to each financial relationship. Low-weight conflicts include travel fees, lectureship fees, honoraria, etc. Medium-weight conflicts include grants and contract support. High-weight conflicts include employment in the pharmaceutical industry and stock ownership. The provided C-Score is an aggregate measure of the conflicts for each entity. Note: The reliability of the conflict analyzer/classifier was evaluated against a human-coded random sample (n=1000). Reliability was generally good, at ICC=.782 for low-weight conflicts, .792 for medium-weight conflicts, and .769 for high-weight conflicts.
  • Identified conflicts were prepared for visualization using the visNetwork package for R. This package uses the vis.js JavaScript library and provides user interactivity controls for the display.
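
To make the weighting scheme above concrete, here is a minimal Python sketch of the dictionary-based half of such a classifier. It is an illustration only, not the project's analyzer: the keyword lists, the clause splitting, the numeric weights (1, 2, 3), and the summed score are all assumptions, since the actual C-Score formula and the machine learning component are not described here.

```python
# Hypothetical keyword dictionary. The three levels follow the text;
# the specific substrings and matching rules are assumptions.
KEYWORDS = {
    "low": ("travel", "lecture", "honorari"),
    "medium": ("grant", "contract"),
    "high": ("employee", "employment", "stock"),
}

# Hypothetical numeric weights; the real C-Score formula is not documented here.
WEIGHTS = {"low": 1, "medium": 2, "high": 3}


def classify_statement(statement):
    """Return one weight level per clause that matches a keyword.

    Clauses are split on periods and semicolons. Each clause is counted
    once, at the highest level whose keywords it contains.
    """
    levels = []
    for clause in statement.lower().replace(";", ".").split("."):
        for level in ("high", "medium", "low"):
            if any(word in clause for word in KEYWORDS[level]):
                levels.append(level)
                break
    return levels


def aggregate_score(statements):
    """Sum the assumed weights over all conflicts found for one entity."""
    return sum(
        WEIGHTS[level]
        for statement in statements
        for level in classify_statement(statement)
    )
```

A simple substring dictionary like this will miss paraphrases and negations, which is presumably one reason the actual analyzer pairs a dictionary with machine learning and was checked against a human-coded sample.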

Interpreting Dashboard Network Visualizations

Sponsor and conflict network diagrams are designed to give users a general sense of the funding networks that make up clinical research for a specific drug product or clinical condition. The diagrams detail the flow of money between funders and researchers.

  • Each edge (line) in the network diagram represents a line of funding.
  • Each node (dot) represents a funder or an entity receiving funding. Depending on the diagram, funders may include pharmaceutical companies or grant funding agencies. Recipients may include trials, authors, journals, or articles.
  • The larger the node, the more funding relationships it has.
  • The more central the node, the more important it is to the overall research network for that drug or condition.
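
The node-size rule above corresponds to counting each node's funding relationships (its degree). A minimal Python sketch, assuming the network is available as a list of (funder, recipient) pairs; the function name and example entities are hypothetical, and no particular centrality measure is implied, because the text does not specify one.

```python
from collections import Counter


def node_degrees(edges):
    """Count funding relationships per node from (funder, recipient) pairs."""
    degrees = Counter()
    for funder, recipient in edges:
        degrees[funder] += 1
        degrees[recipient] += 1
    return degrees


# Hypothetical example network.
edges = [
    ("Company X", "Trial 1"),
    ("Company X", "Trial 2"),
    ("Agency Y", "Trial 2"),
]
degrees = node_degrees(edges)
```

Here "Company X" and "Trial 2" each have two funding relationships and would be drawn larger than "Agency Y" and "Trial 1", which have one each.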