We’re going to talk about meta-analysis. The motivation behind meta-analysis, and behind this general field of synthesis, is that it may be desirable to come up with some kind of punch line or bottom line from the literature. If you have a growing number of studies on a related topic, and, say, a policymaker needs to decide whether to invest in that sort of intervention or understand its impact, you may find it useful to combine information across studies. So, there have been famous social programs rolled out over the last 20 years, cash transfer programs. Taken together, what do these conditional cash transfer programs do? What sort of outcomes are affected?
What if we looked across four or five main sets of outcomes, education, health, whatever? That’s something policymakers really want to understand if they’re considering rolling out a similar program. These questions are also of great interest to scholars. We want to have a sense, across a number of different studies, of what a particular parameter looks like, what a particular effect looks like, so we can understand the world. Those sorts of estimates are going to serve as our priors going into the next study that we design. And we want to make sure we have the most up-to-date information in thinking about and understanding the world before we design our next research study.
And in many ways establishing a prior, maybe a non-zero prior, is useful too. Maybe it could be useful to establish that in your next research project you can actually reject the hypothesis that your effect is the same as the synthesis across previous studies. Maybe there’s some reason why your population differs. Maybe your program was designed a little differently, whatever. So there are a lot of uses, both for researchers and in policy, and meta-analysis tools are really used in this context. That’s why they were developed. Even though their use is growing in the social sciences, these tools are used a lot less in the social sciences than they are in many other fields. In medical research they’re used more often.
In some branches of education, health and other sectors they’re used more than in, say, economics or political science. Part of the reason may be that there’s so much desire among researchers to differentiate what they’re doing from other researchers who came before them, and to talk up all the differences between their studies, that people then become hesitant to combine them or think of them in terms of a single estimate or single mean effect, which is what you’re going to get out of a meta-analysis. So our salesmanship potentially hinders the accumulation of knowledge, if we’re always highlighting all the nuances by which we differ from other studies.
Maybe partially for that reason these approaches haven’t been used as much as they could be. But there is growing interest, and as one considers using them there are a number of issues to keep in mind. So, the most basic point, one that comes up again and again and where there’s endless debate, if you read debates about meta-analysis, is what studies should be combined. How similar do studies need to be to be combined in a meta-analysis? Are the data similar enough? Are the outcome variables similar enough? Are the explanatory variables similar enough? Are the settings similar enough?
Maybe a conditional cash transfer program in Latin America is just really different from one in Asia for some reason. Maybe we don’t want to combine those, or maybe we do. And that’s the sort of thing where I think our intellectual perspective, our deeper notion of institutions, or history may help us decide what is similar and what isn’t. But this is endlessly debated. People will go back and forth with any meta-analysis about inclusion criteria. How did you decide to include certain studies and not other studies?
And this is an area where, in the social sciences, these issues are examined way too little relative to, say, biomedical research, where there are standard protocols for how you search, which databases you search, which search terms you use and which studies you choose. It’s much more organized there than it is in the social sciences.
That said, once you start thinking about it, there are kind of standard ways forward. You could look at published studies. You can look at standard working paper series. I mean there are ways forward, but because we haven’t done it much, we haven’t worked out all the rules. Beyond the inherent conceptual similarity of the outcomes and the explanatory variables, there’s an issue about whether the data can be standardized in a way that’s comparable. If you’re going to actually come up with some mean effect estimate, you need to be able to measure things in comparable terms, or else it may be impossible to combine estimates. So this is going to be a second question that’s important.
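To make the standardization point concrete, here is a minimal sketch of one common way to do it. It is not something the lecture prescribes: it assumes each study reports treatment and control means, standard deviations and sample sizes, converts each result to a standardized mean difference (Cohen's d), and then pools with fixed-effect inverse-variance weights. The study numbers and function names are hypothetical, made up for illustration.

```python
import math

def standardized_mean_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Return (d, approximate variance of d) for one study.

    d is the treatment-control difference in units of the pooled
    standard deviation, so outcomes measured on different scales
    become comparable.
    """
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    # Standard large-sample approximation to the variance of d.
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return d, var_d

def pool_fixed_effect(effects):
    """Inverse-variance weighted mean of (estimate, variance) pairs.

    Returns (pooled estimate, standard error). Assumes all studies
    estimate one common effect, which is itself a judgment call.
    """
    weights = [1.0 / var for _, var in effects]
    pooled = sum(w * est for w, (est, _) in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

# Hypothetical studies with outcomes on different scales:
# (mean_t, mean_c, sd_t, sd_c, n_t, n_c)
studies = [
    (72.0, 68.0, 10.0, 10.0, 200, 200),  # test score out of 100
    (5.6, 5.1, 1.2, 1.3, 150, 150),      # years of schooling
    (18.5, 17.0, 6.0, 5.5, 400, 380),    # school days attended per month
]

effects = [standardized_mean_difference(*s) for s in studies]
pooled, se = pool_fixed_effect(effects)
print(f"pooled d = {pooled:.3f}, standard error = {se:.3f}")
```

More precise studies get more weight here. A random-effects model would be the usual alternative if you think the true effect differs across settings.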
A third point really relates to the methods that are used. We may be concerned that a meta-analysis is combining effects estimated using different research designs. And we may think some of those research designs are more credible than others. How do we weight these different estimates when they’re generated by different research designs? Which ones are more credible? Do we come up with a really strict standard, so that we only accept estimates from the most rigorous research designs? There’s kind of a continuum there, right? No study is perfect. Even a beautiful RCT may have had some attrition. There may be differential attrition across the treatment and control groups. So it’s no longer a perfect study.
Do we throw it out because there was 2% attrition? Oh, no, that’s pretty good. We’ll keep that one. And what about 4% attrition? What about 20% attrition? At what point do you fall below the bar for being sufficiently high quality for inclusion? These are difficult questions. We’ve come back a million times to this labor literature on the impact of the minimum wage, partially because this is an area where there have been meta-analyses, and there’s been discussion of publication bias and all that stuff. There, some of the early studies were cross-sectional. Others used panel data. You can imagine people using policy experiments. You can imagine different sources of variation.
How do we determine which of those studies to include, and which not to include? Now, maybe there are sensible standards, but we have to come up with them when we do a meta-analysis. And we have to be clear to the reader why we’re choosing those studies. The fear arises once you leave those choices to the researchers doing it. Meta-analysis is supposed to give this systematic answer, where you pool the data and give an answer that reflects the combined wisdom of many scholars in the field.
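One way to be clear with the reader about an inclusion standard is to show how the pooled answer moves as the standard changes. The sketch below is only an illustration under stated assumptions: the studies, their attrition rates and the cutoffs are all hypothetical, and the pooling rule is a simple inverse-variance weighted mean, which the lecture does not specify.

```python
# Hypothetical studies: (effect estimate, standard error, attrition rate)
STUDIES = [
    (0.30, 0.10, 0.02),
    (0.25, 0.08, 0.04),
    (0.10, 0.05, 0.20),
    (0.05, 0.06, 0.35),
]

def pooled_estimate(studies, max_attrition):
    """Pool only the studies whose attrition is at or below the cutoff.

    Returns (pooled estimate, number of studies kept), or None when
    no study passes the cutoff.
    """
    kept = [(est, se) for est, se, attrition in studies
            if attrition <= max_attrition]
    if not kept:
        return None
    weights = [1.0 / se**2 for _, se in kept]
    pooled = sum(w * est for w, (est, _) in zip(weights, kept)) / sum(weights)
    return pooled, len(kept)

# Report the estimate under several cutoffs instead of picking one quietly.
for cutoff in (0.02, 0.04, 0.20, 1.00):
    pooled, n_kept = pooled_estimate(STUDIES, cutoff)
    print(f"attrition <= {cutoff:.0%}: {n_kept} studies, pooled effect {pooled:.3f}")
```

If the estimate swings a lot across reasonable cutoffs, that is worth reporting in its own right, since it tells the reader how much the conclusion depends on the inclusion rule.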
But if I can be a little bit tricky with the inclusion criteria, I can start weeding out studies I don’t like, the ones that don’t give the answer I like, and maybe get a biased meta-analysis estimate. And this is what a lot of the debate in discussions about meta-analysis revolves around.
Issue four: is publication bias a concern? I’m looking at a body of literature. We’ve talked a lot about publication bias. If the published literature, or the available studies in that body of literature, is riddled with publication bias, is the meta-analysis even meaningful? If every single p-value is 0.049, and we fail the p-curve test and all these other things, should we even be looking at this literature? Or is it just so flawed that a meta-analysis is just going to give some false stamp of approval and credibility to a collection of studies that are all just data mined?
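The worry about p-values bunching just under 0.05 can be checked in a rough way. The sketch below is a simplified check in the spirit of a p-curve analysis, not the full procedure, and the p-values are hypothetical. It takes the statistically significant results and asks how many fall in the upper half of the significant range (between 0.025 and 0.05). With no true effect and no selective reporting, about half would; with real effects, fewer than half should. A large excess near 0.05 is a warning sign.

```python
import math

def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )

def share_just_below_threshold(p_values, alpha=0.05):
    """Count significant p-values in the upper half of (0, alpha).

    Returns (count in upper half, number significant, probability of a
    count at least that large under a 50/50 split), or None when no
    p-value is significant.
    """
    significant = [p for p in p_values if p < alpha]
    if not significant:
        return None
    high = sum(1 for p in significant if p > alpha / 2)
    n = len(significant)
    return high, n, binomial_upper_tail(high, n)

# Hypothetical reported p-values from a literature.
reported = [0.049, 0.048, 0.044, 0.041, 0.047, 0.012, 0.039, 0.046, 0.21]
high, n, prob = share_just_below_threshold(reported)
print(f"{high} of {n} significant p-values lie above 0.025 (tail prob {prob:.3f})")
```

A small tail probability here does not prove data mining, but it suggests that pooling these studies without further scrutiny could lend them more credibility than they deserve.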