Nonprofits & NGOs

Why the Social Sector Needs the Scientific Method

A flawed study on deworming children—and new studies that expose its errors—reveal why activists and philanthropists alike need safeguards.

The book Zen and the Art of Motorcycle Maintenance, of all things, offers a critically important message for people who work in development and philanthropy. To wit: “The real purpose of the scientific method is to make sure nature hasn’t misled you into thinking you know something you actually don’t know.” 

Three new papers published today confirm this by illustrating just how easily we can be misled by what we think we know, and just how much the scientific method can safeguard us from continuing to be misled (and potentially investing significant time and effort in the wrong priorities). That’s because the three papers raise important questions about the practice of treating children for intestinal worms, which, in recent years, has become a darling of international development.

Deworming Programs Have Been “In”

Here’s the back-story. Worms infect people through contact with infected feces. They live in people’s bodies (they can be a meter long!), eat their food, deprive them of nutrients, and make them lethargic and ill. And in 1999, two US economists, conducting a study in Western Kenya, found that “deworming” a number of school children improved their nutritional intake, reduced their incidence of anemia, and—by making them less ill and lethargic—increased their attendance at school and hence improved their exam results. The economists also claimed that attendance at schools where children did not receive treatment also increased—by 7.5 percent—because treating the children around them reduced the worm eggs in the feces and soil near their homes, making the untreated children less likely to be infected. (There are two main types of worms: soil-transmitted worms, and water-transmitted worms known as schistosomiasis or bilharzia. The Kenyan study was mainly of soil-transmitted worms but did pick up some schistosomiasis.)

Consequently, the Copenhagen Consensus made deworming one of its top recommendations. GiveWell named two organizations that focus on deworming in the top four on its list. And development economist Michael Kremer, a co-author of the 1999 Kenyan study, started an initiative called Deworm the World, which has treated 37 million children in four countries to date. 

The Scientific Method

Now, the scientific method involves several safeguards against being misled. One is isolating variables to reveal which one(s) matter. Maybe the speed with which a dropped object hits the ground depends on the height from which it’s dropped and on the gender of the person who drops it. So we experiment by having people of both genders drop identical objects from the same height, thus “isolating” gender as a variable; when the objects hit the ground at the same time, we’ve shown that gender doesn’t matter.

Another safeguard addresses bias by replicating an experiment elsewhere, and comparing and combining the answers. If we open our back-to-work program only to motivated people, then we don’t know whether their success getting jobs is due to the program (a “treatment effect”) or the unusual characteristics of the people we chose (a “selection effect”). The latter would create a selection bias. If we interview only the people in the program who stick it out to the end, we don’t hear from the people who quit because it was so arduous, so our user-experience data may suffer from survivor bias. These and other biases mislead us into thinking we know things we actually don’t know. Single studies may also be biased because they may unwittingly involve particularly unusual people or take place under unusual circumstances. They may also simply get freak results by chance. 
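
To make the selection-effect point concrete, here is a minimal simulation in Python. The back-to-work program, the 30 percent baseline job rate, and the “motivation” scores are all invented for illustration; the point is only that a program with zero true effect can look effective when enrolment is self-selected.

```python
import random

random.seed(0)

# Hypothetical back-to-work program with ZERO true effect. Each person has a
# "motivation" score that raises their job chances whether or not they enroll.
def job_outcome(motivation):
    return random.random() < 0.3 + 0.4 * motivation   # 30% baseline + motivation

population = [random.random() for _ in range(100_000)]   # motivation in [0, 1]

# Selection effect: only the most motivated people enroll in the program.
enrolled = [m for m in population if m > 0.7]
others = [m for m in population if m <= 0.7]

def job_rate(group):
    return sum(job_outcome(m) for m in group) / len(group)

print(f"Job rate among participants:     {job_rate(enrolled):.2f}")   # roughly 0.64
print(f"Job rate among non-participants: {job_rate(others):.2f}")     # roughly 0.44
# The gap looks like a treatment effect, but the program does nothing here;
# the difference is entirely selection bias.
```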

A third safeguard in the scientific method is repeating the analysis. In other words, checking the math.

The three papers, now available, used the scientific method to great effect. The Cochrane Collaboration is a global network of medical researchers who do “systematic reviews” and “meta-analyses” (it may well have saved your life at some point). In 2012, the Cochrane Collaboration wrote: “It is probably misleading to justify contemporary deworming programmes based on evidence of consistent benefit on nutrition, haemoglobin, school attendance or school performance.” Recent correspondence with the authors suggests that they’ve not changed their minds. And today, the Cochrane Collaboration publishes its fourth systematic review of deworming. The group looked at all 45 studies within its scope and concluded: “There is now substantial evidence that this [mass deworming treatment] does not improve nutritional status, haemoglobin, cognition, or school performance.”
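
For readers curious about the mechanics, the pooling step at the heart of a meta-analysis can be sketched in a few lines. The five study estimates below are invented for illustration; they are not the Cochrane data, and real reviews add risk-of-bias assessment, heterogeneity checks, and random-effects models on top.

```python
import math

# Hypothetical effect estimates (say, change in school attendance, in
# percentage points) with standard errors from five imaginary studies.
studies = [
    (6.0, 2.5),   # one striking result with a wide confidence interval
    (0.5, 0.8),
    (-0.3, 0.7),
    (1.0, 1.2),
    (0.2, 0.6),
]

# Fixed-effect (inverse-variance) pooling: more precise studies count for more.
weights = [1 / se**2 for _, se in studies]
pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled effect: {pooled:.2f} +/- {1.96 * pooled_se:.2f} (95% CI)")
# One eye-catching study can suggest a large effect; combining all the
# evidence, weighted by precision, often tells a much more modest story.
```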

In two additional studies published today, researchers at the London School of Hygiene and Tropical Medicine (LSHTM) simply re-analyzed the Kenyan data. They found, if you’ll excuse the pun, a can of worms: errors, missing data, misinterpretation of probabilities, and a high risk of various biases. The consequences are huge: The claimed effect on school attendance among untreated children seems entirely due to “calculation errors” and effectively disappeared on re-analysis; the claimed effect on anemia similarly lost statistical significance.

We shouldn’t be surprised: That people make mistakes is hardly news. What’s impressive is that somebody took the important step of re-analyzing the data, caught the errors, and prevented us from being misled by them. As Yale’s Dean Karlan and I noted when the 2012 Cochrane worm study was published, this is exactly how science is supposed to work.

The re-analysis papers raise three more subtle issues. First, the choice of analytical method matters (even if the data are complete and accurate). When looking at changes in school attendance, the economists used a method common in economics; the epidemiologists used a different method common in epidemiology and found that “the strength of evidence supporting the improvement was dependent on the analysis approach used”. There can only be one “correct” answer, and it’s not yet clear which method is misleading. 
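
To see how an analysis choice alone can change the apparent strength of evidence, here is a small, hypothetical simulation. The data are invented and the two approaches shown are not the specific methods the economists or epidemiologists used; the sketch only illustrates one common way this happens, namely ignoring versus respecting the fact that treatment is assigned by school.

```python
import random, statistics, math

random.seed(1)

# Invented data: attendance scores for children clustered in schools, with
# treatment assigned at the school level and schools differing from each other.
n_schools, n_children = 50, 40
schools = []
for s in range(n_schools):
    treated = s < n_schools // 2
    school_effect = random.gauss(0, 3)
    kids = [70 + (2 if treated else 0) + school_effect + random.gauss(0, 8)
            for _ in range(n_children)]
    schools.append((treated, kids))

treated_kids = [x for t, kids in schools if t for x in kids]
control_kids = [x for t, kids in schools if not t for x in kids]
diff = statistics.mean(treated_kids) - statistics.mean(control_kids)

# Method 1: treat every child as an independent observation (ignores clustering).
se_naive = math.sqrt(statistics.variance(treated_kids) / len(treated_kids)
                     + statistics.variance(control_kids) / len(control_kids))

# Method 2: analyze school-level means (a crude way to respect clustering).
t_means = [statistics.mean(kids) for t, kids in schools if t]
c_means = [statistics.mean(kids) for t, kids in schools if not t]
se_cluster = math.sqrt(statistics.variance(t_means) / len(t_means)
                       + statistics.variance(c_means) / len(c_means))

print(f"Estimated effect: {diff:.1f} attendance points")
print(f"Naive z ~ {diff / se_naive:.1f}; cluster-aware z ~ {diff / se_cluster:.1f}")
# Same data, same estimate, but the apparent strength of evidence depends on
# the analysis approach.
```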

Second is how rare re-analyses are. Open data to enable post-publication review is sexy and funded and increasingly common. But actually doing post-publication review is hard. It’s hard to fund—so hats off to 3ie, which funded this one; it’s hard to do—the original authors sacrificed masses of time digging up old files for LSHTM to use; and it’s hard to get the results published—pre-publication peer review of LSHTM’s papers took about five months.

Third is just how different this is from most impact research in the social sector. That research is often unreported, or reported unclearly or incompletely, and only rarely are the raw data made available to enable inspection. I’ve argued before that most charities shouldn’t do impact evaluations (as has Dean, separately)—eradicating misleading biases is just too hard for non-specialists. But when they do, they should publish the full details and data. The scientific method requires it. And the social sector needs the scientific method.

To clarify, I’d like to make two follow-up points about this article (as of August 3, 2015): 

First, this article is about the scientific process, not about deworming itself. It is not intended to take a position on deworming, nor to represent Dean Karlan’s view on the 2012 Cochrane study.

Second, it was written on the basis of the two LSHTM studies and the Cochrane paper, plus some (limited) correspondence with their authors, and published when they published. It therefore doesn’t respond to the subsequent discussion, which has included a response from Kremer and Miguel here, to which the LSHTM authors responded here, and contributions from others. That broader conversation has included some discussion of whether “errors”—in both the original Kremer and Miguel paper and the LSHTM papers—are in fact errors (see here). Readers interested in the full debate might start with Vox’s piece or this post on Storify.

Perhaps this all shows just how difficult science can be.

COMMENTS

  • BY Chester Davis

    ON July 24, 2015 09:50 AM

    Great article! I’ve long felt that nonprofit executives and social entrepreneurs could benefit from more scientific thinking about social policies and programs that they design, run or promote. There is probably a ton of under-utilized knowledge about what works or doesn’t work in what circumstances and lots of social innovations that aren’t being evaluated.

  • BY Caroline Fiennes

    ON July 28, 2015 01:45 PM

Caroline Fiennes here, author of the article above. To be clear, this article was written on the basis of the two LSHTM studies and the Cochrane paper. Kremer and Miguel have since published a response to the studies, a summary (by them) of which is here: http://emiguel.econ.berkeley.edu/assets/miguel_research/63/Deworming-summary_Kremer-Miguel_2015-07-24-CLEAN.pdf

  • BY Liana Downey

    ON July 31, 2015 12:07 PM

    Thanks Caroline for raising such an important issue and bringing attention to the importance of really doing one’s research before you roll out large scale initiatives. Your article raises a number of incredibly important points. I wish the example you shared were the only one, but of course there are many examples of well-intentioned groups ‘jumping on the bandwagon’ of one approach or another, at enormous cost and investment, to only find, years later, that the outcome they are seeking to drive is not shifting.

Having said that—I am very nervous about the impact of your message on many organizations. I have heard time and time again from nonprofit leaders that measuring is ‘all too hard’, and despite the best efforts of funders, and the shift that is starting to happen, the reality is that there are hundreds of thousands of non-profits in America alone that do not measure their impact in any meaningful way.

My experience is that these organizations are doing so NOT because they are dismissive of the scientific method. In fact, it is precisely because they are so convinced that the ONLY meaningful research is a peer-reviewed controlled study that they do nothing, as Megan Golden and I wrote about in http://www.ssireview.org/blog/entry/just_do_it. Of course you are right, such research is the gold standard, and well it should be. But suggesting that nonprofits wait around for someone else to do the research and challenge themselves only to monitor implementation (as per your previous articles) has me breaking out in sweats.

    I’ve witnessed the enormous power of encouraging organizations to take on responsibility for asking the question — is what we are doing working for our clients?

When organizations start to monitor outcomes (not process), they start to ask much smarter questions. They start to ask not just “how many attendees did you have at your after-school program”, but “what was the change in (obesity), (vocabulary), (school attendance) across different program areas?” When they do that, they notice differences. These observed differences then lead to meaningful conversations — which drives critical innovation and improvement. While of course in the day-to-day running of a program it is difficult to account for a range of control variables with a great degree of rigor, powerful insights are still generated. It also shifts the dynamic of staff on the ground to actually drive innovation and accountability.

    Another example is how this can play out in the medical world. Of course hospitals implement only clinically proven and safe interventions. They don’t feel obliged to run control studies at every turn. But even so, if they do not hold themselves accountable for monitoring outcomes, they put people’s lives at risk.

    While at McKinsey, I worked with hospital management teams to collate and review comparative data on outcomes across a range of standardized operations. What we saw were meaningful variations in life/death outcomes. This information of course drove much deeper investigation, uncovering both positive (and negative) innovations by surgeons and nurses. By encouraging management to track and understand these differences, all kinds of innovations and improvements can be made which literally save people’s lives. Sometimes it is implementation (hand washing), sometimes it is innovation (checklists a la Atul Gawande). But if an organization does as you suggest and holds itself accountable only for monitoring implementation, there is little incentive to innovate and improve.

    You are right on so many fronts. Funders should think about getting behind testing interventions in a rigorous way. Information should be made open and shared. But I believe organizations should absolutely be encouraged and supported to check whether their work is making a difference in people’s lives.

  • BY Caroline Fiennes

    ON August 2, 2015 10:09 AM

    Thanks.
The key is in your concluding sentence: most operating charities should CHECK that their work is making a difference, i.e., check that the changes in outcomes before and after their interventions are in line with rigorous evidence. That is precisely what happens in hospitals, the good example that you cite.

    I really think that most operating charities should not ‘measure their impact’ because they have neither the skills nor the money to deal with confounding variables, i.e., to ascertain whether changes they are seeing are due to them, or something else, or random chance.

As for what charities can do if nobody has yet provided rigorous evidence: that certainly is a problem, but two things they can do are:
    - use relevant evidence: almost nothing is completely innovative and new
    - ensure that they’re giving ‘beneficiaries’ what they want.
    I’ve written more about this at http://www.giving-evidence.com/m&e
    It’s a longer topic: Dean Karlan is currently co-authoring a book about it: see here: http://www.ssireview.org/blog/entry/measuring_impact_isnt_for_everyone
