Polanin, J. R., Tanner-Smith, E. E., & Hennessy, E. A. (2016). Estimating the difference between published and unpublished effect sizes: A meta-review. Review of Educational Research, 86, 207–236.
Publication bias refers to a phenomenon in which the results of an experiment affect the probability that it will be published in a scholarly journal. Researchers, peer reviewers, and journal editors demonstrate bias toward “interesting” findings, often in the form of statistically significant results. Such biases are understandable. As researchers, we are more enthusiastic when results support our hypotheses (We were right!). Additionally, when a study includes a statistically significant result, it is easy to interpret the experiment or study as having “worked.” For journal editors and reviewers, a study is more likely to draw readers if it includes a surprising result. Inconclusive or null results (i.e., not statistically significant results) leave many open questions, among them the following:
Such questions are inherently difficult to answer. The uncertainty that comes with inconclusive results can discourage publication and contribute to publication bias.
Science is a cumulative activity; no single study is definitive. Instead, the results of multiple studies should be interpreted as converging toward the truth as more data are collected and more studies are completed. Meta-analysis represents an important type of study in this cumulative process. Meta-analyses are studies that combine the results from multiple individual studies investigating the same or similar phenomena or treatments. By combining results across multiple individual studies with different samples and measures and carefully weighting those results to account for important factors such as sample size, meta-analysis is a powerful way to increase the generalizability of scientific findings.
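To make the idea of weighting concrete, here is a minimal sketch of one common approach, a fixed-effect (inverse-variance) pooled estimate, in which more precise studies (typically those with larger samples) count for more. This is an illustration only, not the method used in any study discussed here; the function names and the example numbers are hypothetical, and the variance formula is the usual large-sample approximation for a standardized mean difference.

```python
import math


def d_variance(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference d
    from two groups of size n1 and n2 (large-sample approximation)."""
    return (n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2))


def pooled_effect(effects, variances):
    """Fixed-effect pooled estimate: each study is weighted by 1 / variance,
    so larger, more precise studies contribute more to the average."""
    if not effects or len(effects) != len(variances):
        raise ValueError("effects and variances must be non-empty and equal in length")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


if __name__ == "__main__":
    # Hypothetical studies: (effect size, treatment n, control n).
    studies = [(0.35, 20, 20), (0.10, 150, 150), (0.25, 60, 60)]
    effects = [d for d, _, _ in studies]
    variances = [d_variance(d, n1, n2) for d, n1, n2 in studies]
    estimate, se = pooled_effect(effects, variances)
    print(f"pooled d = {estimate:.3f} (SE = {se:.3f})")
```

In practice, many meta-analyses use random-effects models rather than this simpler fixed-effect version, but the core logic of precision-based weighting is the same.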
However, publication bias represents an important challenge to the validity of meta-analysis. If null or inconclusive results are not published routinely, a meta-analysis that does not include unpublished results will produce a biased estimate; that is, it will likely overstate the strength of a relation or the effectiveness of a treatment. This affects the conclusions drawn by informed consumers of research, including practitioners, policymakers, and scientists who (rightly) place great value on the findings of meta-analyses.
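The mechanism can be illustrated with a small simulation. The sketch below is hypothetical throughout: it assumes a true effect of d = 0.2, 30 participants per group, and a publication rule under which every statistically significant study is published but only 20% of non-significant studies are. None of these values come from a real literature; they are chosen only to show how selective publication inflates the average published effect.

```python
import math
import random


def simulate(n_studies=2000, true_d=0.2, n_per_group=30,
             p_publish_nonsig=0.2, seed=0):
    """Simulate many small studies of the same true effect and compare the
    mean effect across all studies with the mean across 'published' ones."""
    rng = random.Random(seed)
    # Approximate standard error of d for two equal-sized groups.
    se = math.sqrt(2.0 / n_per_group + true_d ** 2 / (4.0 * n_per_group))
    all_d, published_d = [], []
    for _ in range(n_studies):
        d_obs = rng.gauss(true_d, se)           # observed effect with sampling error
        significant = abs(d_obs / se) > 1.96    # two-sided test at roughly p < .05
        all_d.append(d_obs)
        # Assumed publication rule: significant results are always published;
        # non-significant results are published only some of the time.
        if significant or rng.random() < p_publish_nonsig:
            published_d.append(d_obs)
    # All studies have the same sample size, so an unweighted mean is
    # adequate for this illustration.
    mean_all = sum(all_d) / len(all_d)
    mean_published = sum(published_d) / len(published_d)
    return mean_all, mean_published, len(published_d)


if __name__ == "__main__":
    mean_all, mean_published, n_published = simulate()
    print(f"mean d, all studies:       {mean_all:.2f}")
    print(f"mean d, published studies: {mean_published:.2f} (n = {n_published})")
```

Under these assumptions, a meta-analysis restricted to the published studies would recover a noticeably larger average effect than one that also located the unpublished studies, even though every study estimated the same true effect.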
Polanin, Tanner-Smith, and Hennessy (2016) reviewed 81 meta-analyses published in two prominent journals in education and psychology in an attempt to estimate the magnitude of bias in published research. Importantly, each of the 81 meta-analyses included published studies and unpublished studies (often called gray literature). Their goal was to empirically determine whether there was evidence for bias in the estimates emerging from published literature by comparing treatment effect estimates for published and unpublished studies.
In this meta-analysis of meta-analyses, the authors calculated effect sizes for 81 meta-analyses, encompassing more than 6,000 studies. The results of their analyses provide strong evidence for publication bias. On average, published studies reported much larger effect sizes than unpublished studies (d = +0.18). To understand the magnitude of this bias, consider that the mean meta-analytic effect for standardized measures in a recent meta-analysis of reading interventions for students in grades 4–12 was 0.21. A difference of 0.18 is nearly as large as that entire average effect, so it is substantial and may meaningfully alter the conclusions one might draw from experimental research.
The evidence for publication bias documented in Polanin et al. (2016) is both compelling and disconcerting. Clearly, researchers in the fields of education and psychology and publication gatekeepers need to recommit to efforts to combat publication bias. Well-designed studies that produce null or inconclusive results are of interest and contribute to the cumulative activity of building scientific knowledge. Journal editors, in particular, must guard against dismissing studies because the results were inconclusive or null. If researchers wish to summarize literature via meta-analysis or systematic literature reviews, the findings of Polanin and colleagues highlight the critical need to search for and include gray literature. Similarly, scientists who choose not to publish the results of specific experiments should make those results available through other means, such as online repositories.
For the general public, critical findings about publication bias, or the much-publicized replication crisis in psychology, can shake confidence in research findings. Although it may surprise many with less connection to the scientific community, most scientists do not discourage skepticism. Research results, particularly those of a single study, should be viewed with healthy skepticism. However, skepticism should not lead to wholesale rejection of the value of scientific inquiry. Rigorous experimental research has contributed to a robust and growing body of knowledge about how children learn and how we as educators can teach them. Knowledgeable consumers of research should seek out rigorous meta-analyses and literature reviews (which frequently include comprehensive searches of gray literature), as these summaries include multiple studies and represent a closer approximation of the truth.
In education, there have been concerted efforts to aggregate and summarize research findings across studies, most notably within the What Works Clearinghouse (WWC), an initiative by the Institute of Education Sciences. The WWC is not without flaws; my colleague Dr. Jack Fletcher published an important critique of WWC methodology in a previous column on this website. However, summaries emerging from the WWC and similar initiatives should give research consumers greater confidence in their conclusions than single studies because these efforts combine results from diverse studies. In particular, the findings of WWC Practice Guides represent an excellent resource for those seeking to align educational practices with what is known from research. These guides routinely include unpublished gray literature, a systematic and rigorous study review, and explicit ratings of confidence in the findings.