Skip to 0 minutes and 0 secondsResults are going missing somehow. There’s a set of results that get published. And there’s a set of results that go missing. And it’s very difficult to get a handle on this. How do we aggregate all the information across the thousands of active researchers in any given area? Everything they thought of, everything they tested on their computer. I’m not the NSA. I can’t tap into everybody’s computer and see everything they’ve done. But you’d almost have to do that to know everything they’ve run. And that’s impossible. The Franco et al. study is interesting precisely because it gives us a glimpse that we usually don’t have into publication bias. So this is what they do.
Skip to 0 minutes and 37 secondsThey’re able to take advantage of what are called the TESS experiments. Time Sharing Experiments for the Social Sciences. So the TESS experiments are funded by NSF. It’s a competitive program that anybody can apply for. And I’ll show you the statistics soon. There’s hundreds of studies that have been done here. And what the TESS experiments do is they obtain as representative samples as they can of Americans who use computers. It’s basically an Internet-based sample. But they get a very large representative sample of Americans and they allow you to run survey experiments. In all the TESS proposals, you have to include statistical power calculations. So, you know, some people do those better than others.
Skip to 1 minute and 21 secondsBut at least those are all adequately powered studies. So again, we have high quality studies that go through this refereeing process, using experimental designs, nationally representative samples that are adequately powered, on questions that reviewers thought were of scholarly interest. These are like – I wanna do a TESS study. What have I been waiting for my whole career? I’m gonna go do one of these. These sound amazing. And not surprisingly, it’s competitive to get this funding. So there’s a funding process where people put in proposals, and the best proposals get funded to add questions. What’s unique here is we know everything that got funded. So we know all of the 200-something studies that were funded.
Skip to 2 minutes and 5 secondsWe know what the title of their study was. We know who the authors are. So what Franco et al. do is they figure out what happened to all these studies. Were they published? Were they not published? What were the results? And they’re very interested in how the nature of the results affected publication. A little bit more detail. The biggest fields are political science and psychology, out of the 200-something studies. There’s some economics. Quite a bit of sociology. But you know, it’s a nice sort of social science-wide set of experiments. And their main question is gonna be, “How does the strength of a result affect its publication?” So you know, the big concern is are null results suppressed?
Skip to 2 minutes and 48 secondsAnd strong significant results published. And they’re gonna test this. So how do they determine the publication status? They searched online, they went to authors’ websites. Usually it was pretty straightforward to figure out what happened to particular projects. There might have been a published paper, there might have been a working paper. There were cases where they couldn’t find the paper. And in those cases, they sent a very respectful email to the author, just asking if they could get more information on the study. What were the results, etc. And they got really good response rates, they said. In some subset of cases they could get information from the TESS site, because they had some updates on the studies.
Skip to 3 minutes and 26 secondsSo I think there were only something like 12 or 17 studies that were just missing. Something like that, out of the 240. So for the bulk of the studies, they know what happened to them, but maybe 5 percent of the studies went missing. There’s this really important distinction they come up with. There’s obviously published studies. They know where those are. Among the unpublished, there’s a lot that are working papers and there’s a lot that are never written up. And that distinction is gonna turn out to be pretty important. In general, I think we all know, working papers are pretty visible to the profession. There’s very high profile working paper series in economics.
Skip to 4 minutes and 2 secondsIn other fields there may be less of a norm of working papers, but even in other fields, people often post working papers on their websites. But a paper that’s never written up, that doesn’t exist. Those results are basically invisible to the profession. They also code the strength of the results. This is a difficult exercise. They don’t get all the data and reanalyze it. They rely on descriptions of the strength of the results given by the authors. Basically the authors’ perception of the results. And they argue, well, that’s probably what matters. If I think the results are strong, then I’m gonna try to maybe do something with them relative to if they’re mixed or null. So kind of no effect here.
Skip to 4 minutes and 47 secondsMixed results, strong results, and then there’s this small number of missing. In a subset of cases they double check the author’s coding by reading the paper themselves, and 90 percent of the time, they agree in the assessment. Like, this is actually a significant, strong result. Or a mixed result, or a null result. So what’s the main finding? Null results are actually much less likely to be published. And this is among well-powered studies of interesting questions with experimental designs. Imagine what it is among lower quality studies. So this is what they find. This is the full 249. Among the null findings, 31 are unwritten.
Skip to 5 minutes and 32 secondsSo they got in touch with the author and they said, “Yeah, there was no effect. Never wrote it up.” Only 10 are published and then a bunch are kind of in the middle, with working papers. Among the mixed findings, 10 are unwritten, 40 are published. Among the strong findings almost all of them are written up. Only a few are unwritten. And most of them are already published. And then there’s some that are unpublished, they’re working papers or whatnot. But you know, the ratio for the nulls is basically 3 to 1. And the ratio for the strong is less than 1 to 10 in the other direction.
Skip to 6 minutes and 16 secondsJust a huge difference in the likelihood of being written up for these two sets of studies. And the mixed are kind of in the middle. So it makes sense. And again, there are these missing studies which basically are unpublished. That’s why they’re missing. So, they throw out the missings and they throw out – like there were a couple of book chapters. And they said, “Okay, these are all sort of comparable articles.” Like, these are supposed to be journal articles, and you get the ratios that you want here. The majority of strong studies by far, are published. The majority of null studies are unwritten. This is kind of troubling.
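The ratios mentioned in the lecture can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted aloud in the transcript (31 unwritten and 10 published null studies; 10 unwritten and 40 published mixed studies). The strong-result counts are not stated exactly in the lecture, so they are left out rather than guessed. The dictionary layout and the function name are illustrative assumptions, not anything taken from the paper.

```python
# Counts as quoted in the lecture transcript (not taken from the paper's tables).
# Strong-result counts are omitted because the lecture only says "a few" are unwritten.
counts = {
    "null": {"unwritten": 31, "published": 10},
    "mixed": {"unwritten": 10, "published": 40},
}


def unwritten_to_published(group):
    """Ratio of never-written-up studies to published studies (hypothetical helper)."""
    return group["unwritten"] / group["published"]


ratios = {name: unwritten_to_published(group) for name, group in counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} unwritten studies per published study")
```

With these numbers the null ratio comes out at about 3 to 1, matching the figure given in the lecture, while the mixed ratio is 1 to 4 in the other direction.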
Skip to 6 minutes and 55 secondsThey push in their last paragraph or two for preregistration and pre-analysis plans. If studies were registered, prospective studies at least were registered, at least we’d know they existed. If you know they exist, you can at least do what Franco et al. did and reach out to people and ask what the results are. They had a pretty good success rate at filling in the gaps. And if I’m only interested in a particular literature, where maybe there really are only 10 or 15 studies, it would be pretty straightforward. I’d reach out to others in my research community and I would know that those papers existed or might have existed. They also support, at the end, what they call two-stage review.
Skip to 7 minutes and 29 secondsThe idea here is there could be some sort of conditional acceptance of papers even before results are revealed. And this is controversial, this sort of makes people scratch their heads a little bit, because it’s such a different process than what we’re used to in terms of publishing results. Plus so many things can sort of go wrong when you’re in the field. But conceptually it’s very appealing. The idea is to say, here’s my research design. Here’s my research question. Here’s the data I’m going to collect. And again, in the TESS studies, you kind of have to do that in your proposal anyway. And here’s how I’m gonna analyze the data. And I don’t know what the results are.
Skip to 8 minutes and 7 secondsBut you know, if this is a good research hypothesis, of interest to the scholarly community, they’ll care either way. So that’s the idea behind registered reports. But this is kind of pushing a little farther even than preregistration of studies. Because here anybody can preregister, you don’t need to go through the refereeing process and get any acceptance. You still go through the normal journal process. This is like – this changes the journal process quite a bit. And the sort of key concept here is what people are calling this In Principle Acceptance – IPA.
Skip to 8 minutes and 39 secondsThis notion that even before you collect your data, if you send this registered report to a journal, they would say, “You know what, when that comes out, we’re gonna publish it.”
"Publication Bias in the Social Sciences"
If publication bias and data mining are so common, what might this mean for studies that don’t produce positive results? A 2014 Science article written by Annie Franco, Neil Malhotra, and Gabor Simonovits tried to answer this question by reviewing over 200 high-quality studies whose publication status could be easily determined. What they found further confirmed the prevalence of publication bias in the social sciences. In order to combat this bias, Franco, Malhotra, and Simonovits pushed for journals to require authors to submit their questions and methods for review before data collection. We will learn about how social science journals have begun to formalize this process next week.
In this article, authors Annie Franco, Neil Malhotra, and Gabor Simonovits leverage Time-sharing Experiments for the Social Sciences (TESS) to find evidence of publication bias and identify when it occurs during research.
The authors state that publication bias occurs when “publication of study results is based on the direction or significance of the findings.” In general, there is a greater chance of a study being published if it has statistically significant results. This practice of selective reporting produces what is known as the “file drawer” problem where there is a tendency to store away statistically non-significant results in file drawers rather than publish them. Franco, Malhotra, and Simonovits write that “failure to publish appears to be most strongly related to the authors’ perceptions that negative or null results are uninteresting and not worthy of further analysis or publication.”
Researchers have tried to address publication bias in the past by “replicat[ing] a meta-analysis with and without unpublished literature,” and “solely examin[ing] published literature and rely[ing] on assumptions about the distribution of unpublished research.” Each of these methods has its limits, so the authors chose instead to “examine the publication outcomes of a cohort of studies.” In this case, they examined the outcomes of TESS, a research program in which researchers propose survey-based experiments and which “submits proposals to peer review and distributes grants on a competitive basis.”
Franco, Malhotra, and Simonovits compared the statistical results of TESS experiments that were published to those that were not. This strategy has several advantages:
- a known population of studies,
- full accounting of what is published or not,
- rigorous peer review for proposals with a quality threshold that must be met, and
- the same high-quality survey research firm conducting all experiments.
(Note: A concern with TESS is that it may not be completely representative of social science research.)
The analysis distinguished between two types of unpublished experiments: (1) those that were prepared for submission to a journal, and (2) those that were never written up in the first place.
The authors also considered “whether the results of each experiment are described as statistically significant by their authors,” as it can be difficult to know the exact intentions of each author. This was important because each author’s perceptions influence how they present their data to readers.
Studies were classified in three ways: (a) strong: all or most hypotheses were supported; (b) null: all or most hypotheses were not supported; or (c) mixed: the remainder.
They found that null studies were far less likely to be published. This can be problematic for two reasons:
- Researchers may be wasting effort and resources conducting studies that have already been executed, but in which the treatments didn’t produce the desired result.
- If future studies obtain statistically significant results that are published, it could falsely suggest stronger effects.
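The second concern, that a published record filtered on significance overstates effects, can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' analysis: the true effect size, sample size per arm, number of studies, and the rule "publish only if |z| > 1.96" are all hypothetical choices made for illustration, and the simulated studies are deliberately underpowered, unlike the TESS experiments.

```python
import random
import statistics


def simulate_publication_filter(true_effect=0.1, n_per_arm=50,
                                n_studies=2000, seed=0):
    """Simulate two-arm studies and 'publish' only the significant ones.

    All parameters are illustrative assumptions. Returns the mean estimate
    across all studies, the mean across published studies, and the number
    of published studies.
    """
    rng = random.Random(seed)
    all_estimates = []
    published_estimates = []
    for _ in range(n_studies):
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_arm)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        # Difference in means and its standard error.
        diff = statistics.fmean(treated) - statistics.fmean(control)
        se = (statistics.variance(treated) / n_per_arm
              + statistics.variance(control) / n_per_arm) ** 0.5
        all_estimates.append(diff)
        # Assumed selection rule: only two-sided significant results appear.
        if abs(diff / se) > 1.96:
            published_estimates.append(diff)
    return (statistics.fmean(all_estimates),
            statistics.fmean(published_estimates),
            len(published_estimates))


mean_all, mean_published, n_published = simulate_publication_filter()
print(f"mean estimate, all studies:       {mean_all:.3f}")
print(f"mean estimate, published studies: {mean_published:.3f}")
print(f"published {n_published} of 2000 simulated studies")
```

Under these assumptions the average across all simulated studies sits close to the true effect, while the average across the "published" subset is noticeably larger, which is the pattern a reader of only the published literature would see.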
To promote transparency, Franco, Malhotra, and Simonovits suggest a better understanding of the motivations of researchers who choose to pursue projects based on the expected results. They also propose the use of a “two-stage review (the first stage for the design and the second for the results), pre-analysis plans, and requirements to pre-register studies that should be complemented by incentives not to bury statistically non-significant results in file drawers. Creating high-status publication outlets for these studies could provide such incentives. And the movement toward open-access journals may provide space for such articles. Further, the pre-analysis plans and registries themselves should increase researcher access to null results. Alternatively, funding agencies could impose costs on investigators who do not write up the results of funded studies. Last, resources should be deployed for replications of published studies if they are unrepresentative of conducted studies and more likely to report large effects.”
What do you think? Which of these proposed actions or incentives would be easiest to implement or be the most effective?
Franco, Annie, Neil Malhotra, and Gabor Simonovits. 2014. “Publication Bias in the Social Sciences: Unlocking the File Drawer.” Science 345 (6203): 1502–5. doi:10.1126/science.1255484.
© Center for Effective Global Action