Say you are a public health official concerned about antibiotics being overprescribed. Or you work for an NGO aiming to improve sanitation in developing countries. Or you are a member of a government behavioral insight team working to increase collection of real estate taxes. Or maybe you are like me: a social scientist wanting to contribute to our knowledge base by replicating and extending previous research. Whatever your issue, whether you are evaluating past findings with an eye to academic publishing or to applied work, you want to know that the evidence is solid.
Unfortunately, at the moment there is reason to believe that the number of false positives—claims that some effect exists when in fact it does not—in the literature is a lot higher than expected. For example, a large-scale replication project suggested that as many as 64 percent of studies from psychology may fail to replicate. Inflated false positive rates plague other fields as well. In other words, the science is not as solid as it seems. This should probably worry you, as it has worried me.
Registered reports are among the proposed solutions. A registered report (RR) is a publication format in which a paper is submitted to a journal for peer review before the data have been collected. The editors and peer reviewers evaluate the theoretical background, the study design, and the statistical analysis plan, and then accept or reject the paper on this basis. By making the review process and decisions about publication independent of the study’s results, and “locking down” the study’s rationale and plan before the data are collected, RRs address two big causes of false-positive inflation: publication bias and HARKing, or “hypothesizing after the results are known.”
Publication bias is the tendency for academic journals to preferentially publish positive results (rather than negative, or null, results). This means many researchers leave negative results in their metaphorical file drawers, rather than attempt to publish them. With only a subset of findings published, our understanding of an issue can be skewed.
HARKing happens when researchers, consciously or unconsciously, present an unexpected positive result as if it had been predicted from the start. Academic journals’ bias toward positive results, and the link between publishing and career advancement, combine to incentivize HARKing.
I first tried the registered report format thanks to my colleague, social psychologist Mark Brandt. Last year, we were planning a study to replicate earlier research on the relationship between morality and possibility. We were also extending this research to a new context and had some fairly reasonable (we thought!) hypotheses about how it would all pan out. Unrelatedly, around this time Mark read a blog post of mine about my vague sense of trepidation about RRs, explaining why I hadn’t dared try them. He then suggested that our planned study would be a good candidate for the RR format, and, despite my slight apprehension, I agreed.
We pilot tested our materials, designed the study, and wrote up our analysis code; then we wrote a manuscript outlining our reasoning, methods, and plans for analysis. So far so good! We submitted the manuscript to the Journal of Experimental Social Psychology, where, after a round of reviews and revisions, our study received an “in principle acceptance.” In other words, we were able to convince the reviewers and the editor that the question we were asking was important enough, and our methods strong enough, that the results would be worth publishing regardless of whether or not they turned out as we predicted. So here’s the first benefit of RRs: by shifting the focus from results to research questions and methods, RRs reduce publication bias. All RRs end up in the literature in some form. Even if an author decides to withdraw the manuscript after data collection, there will be a record of what they had planned to do.
After the in-principle acceptance, Mark and I collected and analyzed the data as planned. Lo and behold, the results were not quite what we expected! We replicated the original finding, but the context we added—war, compared to peace—did not influence the pattern of results as we thought it would (read more about the findings here). In the past, if this had been a traditional manuscript, not yet submitted to a journal, the temptation (and perhaps even recommendation) would have been to reconsider our reasoning and update our explanation accordingly—weaving a convincing narrative so that the manuscript read coherently, as if the results had confirmed our expectations. In other words, to HARK.
Instead, we did our best to explain why we thought the results had come out the way they did (despite our predictions to the contrary), and what this might mean for the theory and findings we were building on. In our case, this also meant considering whether our result might represent a false negative—although we had a large sample, the design was complex and our power analyses limited, so we could have failed to detect a real effect. Because the study was an RR, all of this could be laid out, as clearly as possible, for the reader to evaluate.
So here’s the second benefit of RRs: researchers are not allowed to rewrite history to be more convincing or sound more prescient. For readers of RRs, the downside is that if the results don’t come out the way the researchers initially predicted, you might have to work a little bit harder to digest a less coherent narrative or manuscript. This is because you will be dealing with the same uncertainty that the authors did—but you’ll also be better able to evaluate the evidence provided by the study. In other words, you will have a true picture of uncertainty rather than a potentially false picture of certainty.
Over time, as RRs become more popular, they will increase the reliability of the scientific literature. By reducing publication bias, they will most likely lead to more null results being published (current estimates suggest nulls make up only 5-20 percent of the published literature), and readers will be able to access the reasoning and literature behind—and learn from—these nulls as well. And when you, as an applied researcher, encounter a predicted, positive result published as an RR, you can feel much more confident that it has not been distorted through HARKing.
Finally, as an applied researcher, your own experiments may benefit from a registered report style process. Even if you’re not publishing in an academic journal, it is beneficial to consider how you can design and evaluate your studies with similar guards against changing the story once the results come in. For academics, the career incentives to publish novel and positive findings are high, which has no doubt contributed to the rates of false positives. I can imagine that applied researchers face similar pressures to get the “right” results, whether from managers in business or from politicians if working in government. Using a registered report style process in your company or policy office allows you to be judged on the process, not the results. Depending on the field in which you work, there are many tools available to get you started: the Registry for International Development Impact Evaluations (RIDIE) and the American Economic Association’s registry for Randomized Controlled Trials (the AEA RCT Registry) are two examples; and the Open Science Framework provides resources to build your own.