Systemic Bias in Clinical Research
Reproducibility is the foundation of science. To paraphrase the 17th-century Irish chemist Robert Boyle, it is reproducibility that produces facts: the ability to repeat the same experiment over and over again is what makes facts believable.
Numerous commentators, including several from leading academic institutions, the National Institutes of Health (NIH), and the Food and Drug Administration (FDA), have raised an alarm regarding the declining rate of reproducibility in modern clinical research and its implications for health and science.
Francis S. Collins, M.D., Ph.D., and Lawrence Tabak, D.D.S., Ph.D., the NIH director and principal deputy director, respectively, cited in their 2014 Nature commentary a 2011 study by the Office of Research Integrity of the U.S. Department of Health and Human Services that documented 12 cases of clinical study irreproducibility.
In 2015, psychology became the first scientific discipline to conduct and publish an open, registered empirical study of reproducibility, the Reproducibility Project, led by Brian Nosek, Ph.D., and the Center for Open Science. Two hundred seventy researchers from around the world collaborated to attempt to replicate 100 empirical studies from three top psychology journals.
Of the original studies, 97% reported significant results (p < .05), while only 36% of the replications did. Same studies, but conducted by independent labs!
Nosek noted: “In sum, a large portion of replications did not reproduce evidence supporting the original results despite using high powered designs and original materials when available. The open dataset provides a basis for hypothesis generation on the causes of irreproducibility.”
John Ioannidis, M.D., D.Sc., Professor of Medicine and of Health Research and Policy at Stanford University School of Medicine and a Professor of Statistics, wrote in his seminal paper “Why Most Published Research Findings Are False”:
“Several methodologists have pointed out that the high rate of nonreplication (lack of confirmation) of research discoveries is a consequence of the convenient, yet ill-founded strategy of claiming conclusive research findings solely on the basis of a single study assessed by formal statistical significance, typically for a p-value less than 0.05.”
“Research is not most appropriately represented and summarized by p-values, but, unfortunately, there is a widespread notion that medical research articles should be interpreted based only on p-values.”
On March 7, 2016, the American Statistical Association (ASA), the world’s largest community of statisticians and the oldest continuously operating professional science society in the United States, released the first statistical practice guidance document in its history. The paper, announced under the title “American Statistical Association Releases Statement on Statistical Significance and P-Values,” cautioned that scientific journal editors were becoming overly dependent on the p-value as a gatekeeper for whether research is publishable.
Said Jessica Utts, Ph.D., ASA president, of the over-reliance on p-values: “This apparent editorial bias leads to the ‘file-drawer effect,’ in which research with statistically significant outcomes are much more likely to get published, while other work that might well be just as important scientifically is never seen in print. It also leads to practices called by such names as ‘p-hacking’ and ‘data dredging’ that emphasize the search for small p-values over other statistical and scientific reasoning.”
The central hypothesis of this editorial is that the economic framework in which clinical research exists is poorly suited for scientific inquiry and plays a little-understood but powerful role in increasing bias and diminishing clinical study reproducibility.
The list of biases detected by numerous researchers includes, for example, selection bias, information bias, confirmation bias, publication bias, outcome bias, p-value < 0.05 bias, and more.
Ioannidis, in a June 11, 2014 lecture “Reproducible Research: True or False?” declared: “One might argue that maybe there is more pressure in the U.S. for an academic researcher, more pressure to deliver an extreme finding in order to get it published in a major journal and get funding.”
Based on the data, we have concluded that the economics of the current academic research system, in which peer review journal publishing is central, is probably the most powerful variable contributing to clinical study bias.
Too Many Studies, Too Little Infrastructure
There are currently (1/22/2017) 234,797 studies registered on the NIH’s clinicaltrials.gov, a record high. At the end of 2015 that number was 183,036, also a record. In fact, the number of clinical studies registered on clinicaltrials.gov has set 17 consecutive annual records.
Of those studies, 56% recruited solely outside the U.S., 39% recruited solely in the U.S., and the remaining 5% recruited both in and outside the U.S.
Since 2000, when 3,966 studies were registered at year end, the number has grown exponentially [Chart 1 above]. If this trend continues at the rate of the last ten years, the number of clinical studies registered with clinicaltrials.gov will exceed 400,000 by 2025, a mere eight years from now.
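The arithmetic behind this projection can be checked with a short back-of-the-envelope script. It uses only the registration counts cited above; the growth rates it derives are illustrative, not figures from clinicaltrials.gov itself:

```python
# Registration counts cited in the text (clinicaltrials.gov)
n_2000 = 3_966      # year-end 2000
n_2015 = 183_036    # year-end 2015
n_2017 = 234_797    # as of January 22, 2017

# Compound annual growth rate (CAGR) over the 17 years since 2000
cagr = (n_2017 / n_2000) ** (1 / 17) - 1
print(f"CAGR 2000-2017: {cagr:.1%}")   # ≈ 27% per year

# Annual growth needed to exceed 400,000 registrations by 2025
required = (400_000 / n_2017) ** (1 / 8) - 1
print(f"Required 2017-2025: {required:.1%}")   # ≈ 7% per year
```

In other words, the 400,000 projection is conservative: it requires only a fraction of the historical growth rate to hold over the next eight years.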
Although the rise in the number of clinical studies registered on clinicaltrials.gov is due, in part, to rising pharma and med device demand for clinical studies, the majority of the growth has resulted from the International Committee of Medical Journal Editors (ICMJE) policy and the U.S. Food and Drug Administration Amendments Act of 2007, which required that all drug, biologic, and device studies be registered with clinicaltrials.gov in order to be published in peer review journals.
As Chart 1 illustrates, the pace of clinical study registration on clinicaltrials.gov accelerated sharply in 2004 and again in 2007.
All the studies listed on clinicaltrials.gov must be peer reviewed in order to be published.
Peer review is labor-intensive. In 2015, per estimates from Kovanis et al., peer reviewers expended approximately 63.4 million hours on review, and 30% of those hours (18.9 million) were provided by the top 5% of contributors, largely volunteer reviewers.
Finally, Kovanis estimated that if the peer review effort were split equally among researchers, it would generate a demand for 1.4 to 4.2 reviews per researcher per year.
The Bottleneck Scalability Problem
The number of clinical studies for publication in peer review journals has increased faster than the number of peer review journals. Mark Ware, in his report on the state of scientific journals, wrote that the number of peer review journals has increased by 3% to 3.5% each year for decades. By contrast, the number of clinical studies registered with clinicaltrials.gov has increased by 28% per year since 2000.
Given the labor-intensive nature of the peer review process, the comparatively slower growth of peer review journals calls into question the ability of the current system to accommodate that volume without degrading quality or other performance criteria.
In short, is the peer review journal system as currently configured scalable?
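The scale of the mismatch can be made concrete. Assuming the two cited growth rates hold (about 28% per year for registered studies, about 3.5% per year for journals), a minimal sketch shows how quickly the load on each journal compounds:

```python
import math

# Growth rates cited above (assumptions for this sketch)
study_growth = 1.28     # registered studies: ~28% per year
journal_growth = 1.035  # peer review journals: ~3.5% per year

# Studies competing for each journal "slot" grow by the ratio of the rates
per_journal = study_growth / journal_growth
print(f"Load per journal grows {per_journal - 1:.1%} per year")  # ≈ 24%

# Years for the per-journal load to double, if both rates persist
doubling = math.log(2) / math.log(per_journal)
print(f"Per-journal load doubles roughly every {doubling:.1f} years")
```

Under these assumptions, the submission pressure on each journal roughly doubles every three to four years, which is the arithmetic core of the bottleneck.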
Journal Profit Margins Rise
It has been a dozen years since the ICMJE required that peer review studies be registered with clinicaltrials.gov. Within the current for-profit publishing model, that has had the (perhaps) intended consequence of raising publishing profit margins while also creating the (perhaps) unintended consequence of bottlenecks for clinical research.
As is illustrated in the table below, the profit margins for Elsevier and Wolters Kluwer, two of the largest publishers of peer review journals, have risen in the dozen years since the ICMJE proposal in 2004.
From a reported profit standpoint, more articles per journal and more peer review hours per volunteer have not hurt scientific publisher profit margins.
Apart from the effect on publishers, we must also ask how the scalability problem affects individual clinical investigators.
Fake Peer Reviews Into the Bottleneck
In 2015, BioMed Central staff members launched an investigation of 50 papers that appeared to have been the subject of fake peer reviews. Some of the reviews attached to those studies came from third-party companies selling their “review services.”
Fake peer reviews are, unfortunately, a growing problem for publishers. The Committee on Publication Ethics (COPE) has been approached by myriad publishers concerned about fake reviews. These publishers cite manipulations ranging from authors submitting their own fake reviews to authors purchasing “manuscript preparation services” complete with fake peer reviews. Nature reports that from 2012 to 2014, journals retracted over 110 papers due to faked peer reviews.
Given the sheer volume of clinical research trying to squeeze into a comparatively fixed number of peer review journal openings, it should not be surprising that attempts to circumvent the system like this emerge.
Review Quality vs Quantity
For the individual clinical investigator, another relevant performance measure is reproducibility.
If the proposition is correct that individual actors in the peer review system are creating, reviewing, and publishing significantly more research today than in prior years, then it raises the very legitimate question of quality vs. quantity.
‘Quality’ and ‘quantity’ are not independent variables.
We argue, in fact, that in the absence of new tools, the relationship between the quantity of clinical research and its quality is an inverse one.
Setting an Economic Framework for Improved Scientific Research Quality
As a thought experiment, imagine the following for creating and publishing scientific research:
- All title to clinical research remains with the researchers themselves and does not pass to the publishers.
- Academic researchers organize themselves into an organization of shared values, ethics, and standards, and operate that organization for the benefit of its members.
- All profits from the publication of scientific research return to researchers to fund further research.
- This organization of shared standards owns and operates the peer review scientific journals.
Contrast that with the current system:
- Title to research passes to a for-profit enterprise which operates solely for the benefit of its shareholders.
- The revenue generated by publishers from clinical study research is approximately $10 billion annually. None of this funds more research.
- Despite the well documented issues of irreproducibility and bias, there is no apparent financial incentive at the publisher level to change the system.
- Strategies for improving the current system emanate primarily from “good research practices” groups, not from the for-profit publishers of this research.
Finding an Optimal Business Structure for Academic Research – The Research CO-OP Model
A key question that runs through economics is the nature of the optimal form of business structure. Does the current academic research publication business structure suit the academician and the clinical researcher?
Consider a co-op business model for academic research.
Co-ops are autonomous associations of people united through common economic, social and academic needs and aspirations in a jointly owned and democratically controlled business.
One billion people in the world are members of at least one cooperative. The 300 largest cooperatives in the world generate about $2.2 trillion in sales. If these 300 co-ops were a country, they would have the seventh-largest economy.
For-profit enterprises (like the scientific journal publishers) seek to maximize shareholder returns through profit taking. By contrast, co-ops seek to maximize member cooperation and education, and to return profits to their members (Mooney et al., 1996).
While for-profit organizations seek to extract the maximum return from their suppliers and customers, co-ops aim to optimize the returns to both their members and their own operations (Bontems and Fulton, 2009).
For-profit organizations separate customers, suppliers, and shareholders. In the co-op model, the member is both a patron (customer/supplier) and an owner (shareholder). Are not academic researchers both suppliers and consumers of published research?
Cooperative businesses are typically more economically resilient than many other forms of enterprise: roughly twice as many cooperatives (80%) survive their first five years as businesses under other ownership models (41%).
Changing the business model changes the incentives. How would the following affect issues like bias and study reproducibility?
- Shared goals: people express mutual needs that translate into common goals.
- Shared values: people feel a sense of duty to participate as an expression of common values.
- Sense of community: people identify with and care about other people who either live in the same area or are like them in some respect.
We would argue that the best starting point for addressing the issues of bias and reproducibility is a framework of incentives that support that effort.
The current for-profit framework is, at best, neutral to the effort to reduce bias and increase reproducibility and, at worst, benefits financially from a higher-volume, lower-quality clinical research system.
Collins F, Tabak L. NIH plans to enhance reproducibility. Nature. 2014 Jan 30;505(7485):612–613.
 Open Science Collaboration (2015). "Estimating the reproducibility of Psychological Science". Science. 349 (6251): aac4716. doi:10.1126/science.aac4716. PMID 26315443.
 Ioannidis JPA (2005) Why most published research findings are false. PLoS Med 2(8): e124.
 Sterne JA, Davey Smith G. Sifting the evidence—What's wrong with significance tests. BMJ. 2001;322:226–231.
 Wacholder S, Chanock S, Garcia-Closas M, Elghormli L, Rothman N. Assessing the probability that a positive report is false: An approach for molecular epidemiology studies. J Natl Cancer Inst. 2004;96:434–442.
 Risch NJ. Searching for genetic determinants in the new millennium. Nature. 2000;405:847–856
 Clinicaltrials.gov January 22, 2017
 Kovanis M, Porcher R, Ravaud P, Trinquart L (2016) The Global Burden of Journal Peer Review in the Biomedical Literature: Strong Imbalance in the Collective Enterprise. PLoS ONE 11(11): e0166387. doi:10.1371/journal.pone.0166387
 The full list of 2015 retracted papers is available here.
 Ware. November 2012. The STM Report. An overview of scientific and scholarly journal publishing. Mark Ware Consulting and Outsell, Inc.