More than half of biomedical findings cannot be reproduced – we urgently need a way to ensure that discoveries are properly checked
REPRODUCIBILITY is the cornerstone of science. What we hold as definitive scientific fact has been tested over and over again. Even when a fact has been tested in this way, it may still be superseded by new knowledge. Newtonian mechanics became a special case of Einstein’s general relativity; molecular biology’s mantra “one gene, one protein” became a special case of DNA transcription and translation.
One goal of scientific publication is to share results in enough detail to allow other research teams to reproduce them and build on them. However, many recent reports have raised the alarm that a shocking amount of the published literature in fields ranging from cancer biology to psychology is not reproducible.
Pharmaceuticals company Bayer, for example, recently revealed that it fails to replicate about two-thirds of published studies identifying possible drug targets (Nature Reviews Drug Discovery, vol 10, p 712).
Bayer’s rival Amgen reported an even higher failure rate: over the past decade, its oncology and haematology researchers could not replicate 47 of the 53 highly promising results they examined (Nature, vol 483, p 531). Because drug companies scour the scientific literature for promising leads, their replication attempts are a good way to estimate how much biomedical research cannot be replicated. The answer: the majority.
The reasons for this are myriad. The natural world is complex, and experimental methods do not always capture all possible variables. Funding is limited and the need to publish quickly is increasing.
There are human factors, too: the pressure to cut corners, the temptation to see what one wants and believes to be true, the urge to extract a positive outcome from months or years of hard work, and the impossibility of being expert in every experimental technique a high-impact paper requires.
The cost of this failure is high. As I have experienced at first hand as a researcher, attempts to reproduce others’ published findings can be expensive and frustrating. Drug companies have spent vast amounts of time and money trying and failing to reproduce potential drug targets reported in the scientific literature – resources that should have contributed towards curing diseases.
Failed replications also quite often go unpublished, thereby leading others to repeat the same failed efforts. In the modern fast-paced world, the normal self-correcting process of science is too slow and too inefficient to continue unaided.
Many have wrung their hands and proposed various penalties for investigators whose studies cannot be reproduced. But instead of punishing people, what if there were a way of rewarding them for pursuing independent replication of their most significant scientific results – the ones they want to see cited and built on – before or shortly after publication? I believe this could be a substantial boon to science and society, which is why I started the Reproducibility Initiative.
I am co-founder and CEO of Science Exchange, which is part of the initiative. Science Exchange is an online marketplace that connects people who need scientific services, such as DNA sequencing, with those who provide them. The exchange lists more than 1000 experts in techniques including sequencing, electron microscopy and mass spectrometry. Most provide services to their own institutes, but are open to outside work on a fee-paying basis.
Thinking about the reproducibility problem, I realised that Science Exchange could help by providing investigators with the means and incentives to obtain independent validation of their results.
Here’s how it works. Scientists submit studies to us that they would like to see replicated. Our independent scientific advisory board – whose members are all leaders in their fields as well as advocates for tackling the reproducibility problem – selects studies for replication. Service providers are then chosen at random to conduct the experiments, and the results are returned to the original investigators, who can publish them in a special issue of the open-access journal PLoS ONE. We will issue a “certificate of reproducibility” for studies that are successfully replicated.
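For concreteness, the stages of this pipeline can be summarised as a simple state model. The Python sketch below is purely illustrative – the names Study and Stage are invented for this example rather than taken from any real Science Exchange system – but it mirrors the steps just described:

```python
from dataclasses import dataclass
from enum import Enum, auto

# Hypothetical sketch only: Study, Stage and their fields are invented
# names for illustration, not part of any real submission system.

class Stage(Enum):
    SUBMITTED = auto()        # investigator submits a study
    SELECTED = auto()         # advisory board picks it for replication
    IN_REPLICATION = auto()   # randomly chosen provider runs the experiments
    REPLICATED = auto()       # success: eligible for a certificate
    NOT_REPLICATED = auto()   # failure: publication encouraged, not required

@dataclass
class Study:
    title: str
    investigator: str
    stage: Stage = Stage.SUBMITTED
```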
In our pilot phase, we expect to attempt to replicate 40 to 50 studies. We also plan to publish an analysis of the overall success of what is essentially an experiment in reproducibility.
Initially, investigators must bear the cost of replications, which we estimate will be approximately one-tenth the cost of the original study. If we are successful, we believe funders will eventually see the value of supporting these replication studies. In fact, we are in discussions with numerous public and private funders who believe our mechanism may meet their own acknowledged need for independent validation.
We hypothesise that the success rate for replications will be quite high, mainly because investigators will submit studies that they are confident can be replicated. And that is one of the points we want to make – we want to identify the most robust, important findings and mark them in a highly visible way.
What we are not doing – a point that many have misunderstood – is trying to police the entire scientific literature. Nor are we calling for a doubling of the budgets required to repeat every experiment, every time. We also won’t demand the publication of reproducibility failures – although, for obvious reasons, we and PLoS encourage investigators to publish all outcomes.
Our goal is to provide a much-needed imprimatur of robustness that will ultimately increase the efficiency of research and development and bring us one step closer to perfecting the scientific method, for the benefit of all.
Elizabeth Iorns is co-founder and CEO of Science Exchange, based in Palo Alto, California. For more information, visit reproducibilityinitiative.org
Have your say
Mon Sep 17 11:06:29 BST 2012 by Owen
I don’t understand what the motivation is for investigators to have their work verified – if their work has been published and turned into drugs, why would they bother getting it checked? They may have an ‘acknowledged need for independent validation’, but if it is going to cost them money they will want to see the benefit, and I couldn’t quite make out what that was from this article.
Also: ‘a shocking amount of the published literature … is not reproducible.’ That’s really not shocking – the level of corruption in the pharmaceutical and healthcare industries is fairly well known.
Mon Sep 17 12:35:40 BST 2012 by Duncan McKenzie
Perhaps I’m naive, but I doubt most drug companies are so corrupt that they would skew primary research in order to produce ineffective drugs. (There is a stronger incentive to hide ineffectiveness after a fortune has been spent on a drug’s development.) Statistics alone are enough to produce a slew of “highly promising” drugs that do nothing, provided that null results remain unpublished. Suppose 2000 drug candidates are, in fact, totally ineffective. Given a significance threshold of p = 0.05, and a culture where negative or null results are not published, the literature will produce about 100 promising drugs. When those studies are reproduced, five of them will be “confirmed” as effective, even though they are not. To account for this kind of selection bias, researchers would need to carry out more rigorous testing than is suggested by the p-values alone.
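A short simulation makes this arithmetic concrete. The following Python sketch simply replays the hypothetical numbers in the comment above (2000 ineffective candidates, a p = 0.05 threshold, null results never published); it uses no real trial data:

```python
import random

# Sketch of the selection-bias arithmetic: 2000 truly ineffective drug
# candidates, a 5% false-positive rate, and a culture in which null
# results go unpublished. All numbers are hypothetical.
random.seed(1)

N_CANDIDATES = 2000  # drugs that in reality do nothing
ALPHA = 0.05         # chance an ineffective drug still "passes" a trial

# Round 1: only the false positives get published as "promising".
published = [d for d in range(N_CANDIDATES) if random.random() < ALPHA]

# Round 2: independent replication of each published result.
confirmed = [d for d in published if random.random() < ALPHA]

print(f"Published as promising: {len(published)}")      # roughly 100
print(f"'Confirmed' on replication: {len(confirmed)}")  # roughly 5
```

Even the handful that survive replication are pure chance, which is why p-values alone understate the rigour required.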
A second problem is that the litmus test in medicine is often “does it make the patient feel better than a placebo?”. A p-value indicates how unlikely a result is to arise by chance, but it does not distinguish between medical effectiveness and bias or human error. You only have to look at the astonishing effectiveness of placebos to see how much human factors influence results. A researcher’s sincere belief that a drug will work may be enough to influence the patient, or the researcher’s interpretation of the patient’s responses. But sincerity is not science. There needs to be more of Feynman’s “bending over backwards to prove oneself wrong”. One improvement might be to ensure that medical tests are conducted to the same standards as the scientific community demands from parapsychology tests, and are thus immune from the same mundane influences. Drug tests should be double blind, with efforts taken to ensure that drugs and placebos appear identical and are tested in identical circumstances. The same could be applied to lab research (are researchers unconsciously influencing the survival of certain white mice by their actions?). Of course, this happens very rarely, if ever.