Weak Statistical Standards Implicated in Scientific Irreproducibility
By Erika Check Hayden, Nature | 11.11.2013
The plague of non-reproducibility in science may be mostly due to scientists’ use of weak statistical tests, as shown by an innovative method developed by statistician Valen Johnson, at Texas A&M University in College Station.
Johnson compared the strength of two types of tests: frequentist tests, which measure how unlikely a finding is to occur by chance, and Bayesian tests, which measure the likelihood that a particular hypothesis is correct given data collected in the study. The strength of the results given by these two types of tests had not been compared before, because they ask slightly different types of questions.
So Johnson developed a method that makes the results given by the tests — the P value in the frequentist paradigm, and the Bayes factor in the Bayesian paradigm — directly comparable. Unlike frequentist tests, which use objective calculations to reject a null hypothesis, Bayesian tests require the tester to define an alternative hypothesis to be tested — a subjective process. But Johnson developed a 'uniformly most powerful' Bayesian test that defines the alternative hypothesis in a standard way...
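To make the contrast between a P value and a Bayes factor concrete, here is a minimal Python sketch for a one-sided z-test. It is an illustration under stated assumptions, not a reproduction of Johnson's method: the function names are hypothetical, the data are assumed to reduce to a single standard-normal test statistic `z`, and the alternative hypothesis is assumed to be a single point whose standardized effect size is `delta`. Under those assumptions the Bayes factor is simply the likelihood ratio of the alternative to the null, and it can never exceed exp(z²/2), the value reached when `delta` equals `z`.

```python
import math


def one_sided_p_value(z):
    """Probability of a standard-normal statistic at least as large as z under the null."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def point_bayes_factor(z, delta):
    """Likelihood ratio of a point alternative (mean delta) to the null (mean 0).

    With z ~ N(delta, 1) under the alternative and z ~ N(0, 1) under the null,
    the ratio of the two normal densities simplifies to exp(z*delta - delta^2/2).
    """
    return math.exp(z * delta - 0.5 * delta ** 2)


def max_point_bayes_factor(z):
    """Largest Bayes factor any non-negative point alternative can give for this z."""
    best_delta = max(z, 0.0)  # the ratio above is maximized at delta = z
    return point_bayes_factor(z, best_delta)


if __name__ == "__main__":
    # Illustrative z values only; they are not taken from the article.
    for z in (1.645, 2.326, 3.090):
        p = one_sided_p_value(z)
        bf = max_point_bayes_factor(z)
        print(f"z = {z:.3f}  one-sided P = {p:.4f}  max Bayes factor = {bf:.1f}")
```

Running the script prints the two measures side by side for the same statistic, which shows how much evidence a given P value can correspond to at best under this simplified setup. Any other choice of `delta` gives a smaller Bayes factor for the same `z`.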