Reducing Significance Threshold: Debate Over Scientific Impact


In a controversial and divisive article posted July 22, 2017, on the preprint server PsyArXiv, 72 well-established researchers from as many institutions across the United States, Europe, Canada, and Australia, working in departments as diverse as psychology, statistics, social sciences, and economics, propose to improve "statistical standards of evidence" by lowering the P-value threshold for significance from P <.05 to P <.005 in the biomedical and social sciences.1 The group was led by Daniel J. Benjamin, PhD, of the Center for Economic and Social Research and Department of Economics, University of Southern California, Los Angeles. The article was published in September 2017 as a comment in Nature Human Behaviour.2

Statistical significance set at P <.05 results in high rates of false positives, the authors note, "even in the absence of other experimental, procedural and reporting problems," and may underlie the commonly encountered lack of reproducibility.
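
To see how that arithmetic can play out, the sketch below is a minimal simulation, not drawn from the article itself, of many two-sample experiments in which only a small fraction of tested hypotheses reflect real effects. The prior probability of a true effect, the effect size, and the sample size are illustrative assumptions; the code simply counts how often a "significant" result is a false positive under the P <.05 and P <.005 thresholds.

```python
# Minimal sketch (illustrative assumptions, not figures from the Benjamin et al. comment)
# of how a P < .05 threshold can yield a high false-positive rate when true effects are rare.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_tests = 20_000        # number of simulated two-sample experiments
prior_true = 0.10       # assumed fraction of hypotheses with a real effect
effect_size = 0.4       # assumed standardized mean difference for true effects
n_per_group = 50        # assumed sample size per group

is_true_effect = rng.random(n_tests) < prior_true
p_values = np.empty(n_tests)

for i in range(n_tests):
    shift = effect_size if is_true_effect[i] else 0.0
    a = rng.normal(0.0, 1.0, n_per_group)
    b = rng.normal(shift, 1.0, n_per_group)
    p_values[i] = stats.ttest_ind(a, b).pvalue

for alpha in (0.05, 0.005):
    significant = p_values < alpha
    false_discoveries = significant & ~is_true_effect
    fdr = false_discoveries.sum() / max(significant.sum(), 1)
    print(f"alpha={alpha}: {significant.sum()} 'significant' results, "
          f"false-discovery proportion ~ {fdr:.2f}")
```

Under these assumed numbers, a substantial share of the results crossing P <.05 come from experiments with no real effect, while the stricter P <.005 cutoff sharply reduces that share, which is the intuition behind the proposal.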

In an open science collaboration published in Science in August 2015, 270 psychologists seeking to assess reproducibility in their field attempted to replicate 100 studies that had been published in 3 high-impact psychology journals during 2008.3

"Reproducibility is a defining feature of science," remarked the investigators in the Science article's introduction. Reproducibility was assessed using 5 parameters: "significance and P values, effect sizes, subjective assessments of replication teams, and meta-analysis of effect sizes."

Surprisingly, the researchers found that "replication effects were half the magnitude of original effects, representing a substantial decline." Replications yielded significant results in just 36% of studies, and "47% of original effect sizes were in the 95% confidence interval of the replication effect size." The investigators concluded that "variation in the strength of initial evidence (such as original P value) was more predictive of replication success than variation in the characteristics of the teams conducting the research."
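
The confidence-interval indicator quoted above can be illustrated with a small, hypothetical example. The sketch below (not data from the Science study) expresses effect sizes as correlation coefficients, builds a 95% confidence interval for a replication effect via the standard Fisher z-transformation, and checks whether the original effect falls inside it; the correlation values and sample size are made up for illustration.

```python
# Hypothetical illustration of one reproducibility indicator: does the original
# effect size lie within the replication's 95% confidence interval?
import numpy as np

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z-transformation."""
    z = np.arctanh(r)
    se = 1.0 / np.sqrt(n - 3)
    return np.tanh(z - z_crit * se), np.tanh(z + z_crit * se)

original_r = 0.45      # hypothetical original effect size (correlation)
replication_r = 0.20   # hypothetical replication effect size
replication_n = 120    # hypothetical replication sample size

lo, hi = fisher_ci(replication_r, replication_n)
print(f"Replication 95% CI: ({lo:.2f}, {hi:.2f})")
print("Original effect inside replication CI:", lo <= original_r <= hi)
```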

However, in a comment on this large-scale replication study, published several months later, also in Science, Daniel T. Gilbert, PhD, professor of psychology at Harvard University in Cambridge, Massachusetts, and colleagues argue that the article "contains 3 statistical errors, and provides no support for [the low rate of reproducibility in psychology studies]."4 Because the results of the replication study were not corrected for error, power, or bias, the comment's authors contend, "the data are consistent with the opposite conclusion, namely, that the reproducibility of psychological science is quite high."
