Measurement & Evaluation

Measuring Impact Isn’t for Everyone

Collecting data to demonstrate your organization’s impact is valuable when the conditions are right, and wasteful when they are not.

The Value of Strategic Planning & Evaluation: In this ongoing series of essays, practitioners, consultants, and academics explore the value of strategy and evaluation, as well as the limits and downsides of these practices.

Many organizations claim to implement programs and policies that benefit the world’s poor without evidence of impact beyond anecdotes. As an example, microcredit organizations touted their programs as a solution to global poverty for years; with credit, people could start new lives and new businesses, increase their income, and send their kids off to school. Yet recent randomized evaluations of microcredit programs from around the world (including Mexico, Morocco, India, the Philippines, Bosnia and Herzegovina, Mongolia, and Ethiopia) suggest that the picture is more mixed. For example, an evaluation in Mexico led by Innovations for Poverty Action found that women who received loans were generally happier and weathered hard times better, but their overall financial standing in terms of income or investment in new businesses did not change.

Ultimately, impact evaluations should be comparative. The question isn’t just whether something works, but how to do the most good with scarce resources. Take school attendance, for example. Anecdotal claims suggest that providing basic necessities, such as uniforms and scholarships, makes it more likely that children will attend school. But for the money, it turns out that giving kids deworming pills is 28 times more effective than school uniforms and 56 times more effective than scholarships in increasing school attendance. However, getting credible information on impact is not easy—many organizations struggle to measure the results of their work, and often use methods and data that paint unreliable pictures of program success. We applaud the focus on impact, when feasible. But sometimes impact simply is not measurable in a credible way, and yet people (organizations, or perhaps their donors) push to measure it anyhow.

Even though randomized impact evaluations deliver invaluable evidence about which programs to implement, these evaluations are not possible for all programs. There are two main cases when organizations should not seek evidence about impact:

  1. When that piece of evidence already exists
  2. When generating evidence on impact is simply impossible to do well

Stated more broadly: We should conduct an impact evaluation only when the evaluation plan will narrow a knowledge gap.

For example, the first case applies (luckily) to vaccines. Is there a knowledge gap on the efficacy of the measles vaccine? Perhaps we’re ignorant of the medical literature, but we believe the answer is no. Thus, an NGO vaccinating children need not run a randomized trial to measure the impact of the measles vaccine; doing so would violate the principle of “equipoise,” which holds that researchers should run experiments only when there is real uncertainty about impacts. In fact, it would be an unethical expenditure, as the money could go to pay for more vaccines!

The second case arises when rigorous evidence simply isn’t feasible or appropriate to collect. Sometimes this is a result of the question under examination (macroeconomic policies such as trade agreements, for instance); sometimes it is about the particular setting, size, stage, or scope of the activity. Yet this constraint is not as restrictive as many think. Often we find that settings deemed implausible are in fact plausible with a bit of creativity, and many advances over the past 10 years have come from learning new approaches to conducting randomized trials on social science questions. But in many cases it still is not viable to answer the impact question well.

Unfortunately, many organizations still collect data on impact even when it is not feasible to do so credibly. An insistent focus on measuring impact in these cases can be costly, both in money spent collecting that data (which could have better uses) and in time (management’s focus on bad data rather than on running their programs).

Instead of this wasteful data collection, organizations should work to build appropriately sized data-collection strategies and systems that demonstrate accountability to funders and provide decision makers with timely and actionable operational data.

For a forthcoming book, called The Goldilocks Problem, we developed a set of principles that all organizations—regardless of their ability to assess impact—can use to build strong systems of data collection. We call these principles the CART—credible, actionable, responsible, and transportable data collection.

  • Credible: Collect only data that accurately reflect what they are intended to measure. At a larger scale, credibility means accurately measuring the impact of a program through rigorous evaluation. At a smaller scale, credible data collection also refers to appropriateness and accuracy of chosen indicators.
  • Actionable: Collect only the data that your organization is going to use. To make data actionable, ask if you can use the information to change the course of action at your organization—if not, do not collect it. Put simply: If all possible findings lead to the same decision, it is a waste of time and money to collect that information.
  • Responsible: Match data collection with the systems and resources your organization has to collect it. Think about the resources you have. It is tempting to collect as much information as possible, but if overreaching will compromise the quality of data you collect and your ability to analyze it, the data will not help anyone.
  • Transportable: Apply what you learn to other programs and contexts—either your own program in future years or in other locations, or those of other organizations working on similar problems. For transportability, you need to know something about why a program works, and be open and transparent about sharing learning with others.

Researchers and organizations are working hard to widen the set of programs that we can evaluate rigorously. Even though learning about impact may become more feasible over time, in every case, organizations will better serve their mission by focusing on cost-effective, decision-driven data collection, rather than a rigid focus on impact.



  • BY Barbara Robinson

    ON April 24, 2014 12:05 PM

    But how do you deal with funding agencies (public or private) that absolutely insist on showing measurable impacts?

  • BY Tesssema Berihun

    ON April 27, 2014 11:59 AM

    Well, I am glad that I read your article, and honestly I enjoyed it. But I have some reservations about its claim. Let me explain myself by going into the details of what I mean.

    From the general narrative of this article, my conclusion is that we need to be careful in selecting impact assessment methods, rather than concluding that “impact measurement is not for everyone.” You mentioned the cases of microcredit, school attendance, and the measles vaccine, and you stated that the first two kinds of programs claimed to benefit their beneficiaries when in reality they did not, while the measles impact assessment was a waste of time. I will talk about measles in a bit, but let me first say more about the first two. When a program’s impact assessment claims a positive result that does not reflect reality, that is a design problem. So we need to assess why those tools failed. It doesn’t imply that impact assessment, as a program management component, is unimportant for things like these.

    Whereas the measles vaccine issue is a little bit different. First of all, let me make my stand clear. I believe that programs should do impact assessment for preventive programs, including the measles vaccine. Why? Because each program works in a community with a different profile, and each profile affects the impact level of preventive programs. For example, measles immunization coverage of 80% has a different impact in two communities when one has a childhood malnutrition problem and the other does not. This is one example; the list can go on and on.

    You also mentioned that impact assessment is sometimes impossible. Well, my opinion is that one can always do impact assessment. Why? Because impact is a continuous process, and selecting which level of impact to measure is a design issue; it doesn’t imply that impact assessment is impossible.

    In summary, impact assessment is an essential program element on which we as program implementers are lagging behind, either by not doing it or by doing it the wrong way, but there is no way we should ignore it at any cost. Under the economic model of program implementation, each program should have an impact on the system in which it is working; otherwise that program should not exist.

  • BY Charles Lor

    ON May 14, 2014 07:45 PM

    As Gugerty and Karlan open up with evidence on microfinance from RCTs, I am reminded that a new World Bank Policy paper on microfinance in Bangladesh used panel data to show effects after 20 years.

    CART is the right message. And very few projects have CART-compliant RCT potential. Fortunately, econometricians, anthropologists, and epidemiologists have a broader toolbox. Rather than setting impact aside, isn’t it time that we re-open the toolbox and tell practitioners what is good enough instead of relentlessly promoting a putative gold standard?

  • BY Terence Beney

    ON September 10, 2015 03:55 AM

    There is a further consideration that supports the assertion that impact evaluation isn’t for all programs. A lot of poor impact evaluation work is produced and gets included in a growing ‘pool of evidence’ whose quality we have no mechanisms to control. Scientific publication has the peer review system, which, for all its faults, offers a fairly effective way of sifting the good from the bad. Evaluation is grey literature with limited quality control. It needs some order, and questionable ‘impact’ results won’t help us.

    Addressing the knowledge gap is a good rule of thumb to constrain the enthusiastic but misguided overproduction of ‘impact’ studies. We probably need a few more such rules.
