Measurement & Evaluation

Seven Deadly Sins of Impact Evaluation

Seven obstacles to making good decisions about impact evaluations and how to avoid them.

Impact evaluations—typically, third-party studies that seek to prove a program model’s effectiveness—seem to be all the rage in social sector circles these days. Maybe in part that’s because the process seems so straightforward: Just commission one when the time is right, and, when all goes well, proudly show off your “stamp of approval.” You’ll soon receive the resources you need to grow your organization and to influence all the other nonprofits in your field.

The problem is that it’s rarely that simple in practice. Consider one youth-serving organization we know, which undertook an impact evaluation—at great expense and with high visibility to its funders—only to have the process cut short when the evaluators discovered that the organization’s numerous sites were implementing its program model in wildly different ways. Did that nonprofit have growth potential? Yes. But had its leaders been conducting regular internal measurement, they probably would have realized that their organization was not yet mature enough for the rigors of an impact evaluation.

Pitfalls like this one crop up again and again in our conversations with organizations. In an effort to equip nonprofit leaders with the knowledge they need to make good decisions about impact evaluations, here is our list of the “seven deadly sins” we see nonprofits commit most often:

1. Immaturity. Per the anecdote above, don’t pursue impact evaluation until you are crystal clear about your organization’s target population, approach, and outcomes, and have internal data that shows you are consistently reaching that population, delivering intended services, and achieving intended outcomes. If you’re not sure that’s happening and want some help, you’re a good candidate for a formative evaluation where, for much less time and money, third-party evaluators will take a “peek under the hood” and suggest how you can improve your model to get it ready for impact evaluation.

2. Deference. Some nonprofit leaders assume that the evaluator should dictate what the evaluation entails—either because the evaluator is the expert, or out of concern that they not be seen as influencing the study. But the truth is, unless you articulate up front what decisions you hope to make coming out of the evaluation (or, put another way, what questions you would like to answer), the evaluation will probably not be very useful. No work of any kind should begin until there is clear agreement on what the study will and will not address.

3. Narrowness. Impact evaluations are often designed to answer one question: Do beneficiaries achieve greater outcomes than similar individuals not receiving services? But far too few studies are adequately designed to answer the critical follow-on question: Why or why not? So the nonprofit is left with little to no guidance about what to replicate (if the evaluation is positive) or what to improve (if it isn't). If you pursue an impact evaluation, make sure evaluators gather data on the inputs (context, staff, beneficiaries, etc.) and outputs (services accessed) of your program, and that they explore qualitative methods (focus groups, in-depth interviews, etc.) that can help interpret the quantitative data they collect.

4. Isolation. Most nonprofits assume an impact evaluation has only two parties: themselves and the evaluators. But creating an evaluation advisory committee in advance of an impact evaluation is a good idea. Often these are volunteer committees composed of prestigious experts in the nonprofit's field (other evaluators, academics, practitioners, policymakers, etc.) who can advise on the types of thorny issues this post describes. They might meet just three times—to review the evaluation's design, interim results, and conclusions—but their advice can be critical to ensuring a useful evaluation.

5. Myopia. Those new to impact evaluation assume they will receive a “pass” or “fail” mark at the end. In truth, nearly all evaluations result in something in between. If your organization doesn’t get an A+, don’t assume that you’ve failed. Instead, before getting started, ensure you develop a shared understanding—among staff, and with funders—of why you are undertaking the evaluation and what the possible outcomes might be. Funders in particular need to recognize the bravery it takes to submit one’s organization to outside scrutiny, and not automatically walk away from organizations that receive a B or C, so long as they have a serious plan in place to improve.

6. Finality. Many nonprofit leaders seem to think that an impact evaluation is a one-time exercise. In truth, the most successful nonprofits see measurement—including impact evaluations—as an ongoing exercise in trying to get better, not a “one and done” deal. They constantly measure because they are constantly testing their models in new sites, new contexts, and with adaptations to improve quality or lower cost.

7. Self-exclusion. Some nonprofit leaders equate impact evaluations with randomized control trials and assume that if a comparison group doesn't naturally exist for their work, then impact evaluation is not for them. In truth, there has been a significant amount of innovation in measuring the impact of complex interventions such as advocacy, neighborhood revitalization, and capacity building. While impact often cannot be “proven” in the specific, statistical way it can with randomized control trials, evaluations in such environments can nonetheless result in significant insights about how well an organization’s programs are working and how they can be improved. If the organization is ready for an impact evaluation on all other fronts, it’s worth exploring the possibility.

Which of these “seven sins” have you personally experienced or seen? How have you gotten around these obstacles?

Read more stories by Matthew Forti.



  • BY Ermias Beyene Mehari

    ON February 23, 2012 05:19 AM

    These are really great points to consider before launching an impact evaluation. However, when it comes to humanitarian organisations' evaluation exercises, most of the time the points in focus are efficiency, effectiveness, coordination, relevance, timeliness, and sustainability.

    This approach doesn't consider the direct impact of the projects on the lives and livelihoods of individual households or of the vulnerable targeted population.

    On the other hand, organisations mostly advertise for or headhunt an evaluator who was most probably an ex-INGO staff member with lots of networking. In many cases the evaluation reports expose only a small portion of the organisation's weaknesses, which means the evaluator will secure a good working relationship with the agency for future evaluations and other contracts. I suggest that if an organisation wants to do an evaluation, it should advertise and select on merit, or the donors should be the ones to select the evaluator, rather than letting the organisation self-identify an evaluation consultant.

    Because the evaluation report can determine future funding opportunities, agency staff panic whenever a donor-identified evaluator is sent to them.

    Most evaluation reports don't say much about the weaknesses of the organisation, and when they do, it is always written in very polite language with little description or identification of where and why the project failed.

    In general, INGOs lack accountability to donors and fundraisers, as well as to the community and the government. If they embraced honest and open evaluations, involving government, beneficiaries, civil society, and academics, a lot of change could happen through lessons learnt and disseminated to all actors.



  • This is a nice article, but the first link (“formative evaluation”) goes to David Hunter’s stuff, which is tuh-herrible.

    I worked for an organization that tried—twice—to work with a disciple of David Hunter, but the process was a miserable, useless failure, both times.  The guy just didn’t know what he was doing.  At all.  Then I went to David Hunter’s site, browsed around for a while, and the reasons became apparent—David Hunter’s “system” (really systems) is an inconsistent, convoluted, and outdated mess.

    His whole ship really needs to tighten up—you could even say it needs a “formative evaluation.”

  • BY Caroline Fiennes

    ON February 26, 2012 10:53 AM

    Two things strike me from this article.
    First, that it’s so sad that non-profits often think the purpose of evaluation (and monitoring and all that) is to impress funders. That’s like children learning for an exam in order to get an exam result rather than to learn anything. The primary purpose of evaluation and monitoring (and a whole bunch of other things in non-profits) should be to LEARN how to better serve beneficiaries - irrespective of whoever happens to be looking on.
    In fact, it’s kind of dopey to do M&E (or most other things) just to impress funders because in general the money doesn’t follow great results anyway: the funding market isn’t that rational. My experience as a charity CEO was that evaluation data made zero difference to attracting funding - literally zero: what made the difference was knowing people & getting invited to the right dinner parties.

    Second, for the learning process, this question should be up in lights: “Do beneficiaries achieve greater outcomes than similar individuals not receiving services?” - that is, does the product/service provided by the non-profit actually add anything, or are any observed changes just a cohort effect? (‘The children in our reading programme were taller at the end than they were at the beginning’: well, yes, because children grow, irrespective of reading programmes.) I’m surprised the article doesn’t go into that in more detail. There’s masses of research/thinking in medicine & the social sciences about dealing with that question (i.e., what would have happened anyway: the counterfactual) - ‘controlling’ for other influences. Non-profits often claim that ‘we did XYZ and then something-or-other happened’ but provide zero evidence that the something-or-other was caused by the XYZ.
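Caroline's cohort-effect point can be made concrete with a toy simulation (hypothetical numbers, not drawn from the article or any real programme): children "gain" about 5 cm over a year whether or not they attend a reading programme, so a naive before/after comparison reports a large effect, while a difference-in-differences against a comparison group correctly reports roughly zero.

```python
import random
import statistics

random.seed(0)

# Hypothetical simulation: every child grows ~5 cm per year regardless
# of the programme (a cohort effect). The programme's true effect on
# height is zero.
N = 500
NATURAL_GROWTH = 5.0  # cm per year, for every child

def simulate(programme_effect):
    """Return (pre, post) heights in cm for N children."""
    pre = [random.gauss(130, 6) for _ in range(N)]
    post = [h + NATURAL_GROWTH + programme_effect + random.gauss(0, 1)
            for h in pre]
    return pre, post

treated_pre, treated_post = simulate(programme_effect=0.0)
control_pre, control_post = simulate(programme_effect=0.0)

# Naive before/after comparison: credits the programme with natural growth.
naive_gain = statistics.mean(treated_post) - statistics.mean(treated_pre)

# Difference-in-differences against a comparison group: near zero,
# because the control group grew just as much.
did = naive_gain - (statistics.mean(control_post) -
                    statistics.mean(control_pre))

print(f"naive pre/post gain:       {naive_gain:.1f} cm")
print(f"difference-in-differences: {did:.1f} cm")
```

Difference-in-differences is only one of the simpler ways to approximate the counterfactual; it stands in here for the broader family of methods Caroline alludes to.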

  • BY Matthew Forti

    ON February 29, 2012 07:02 AM

    Ermias raises an important point about ‘evaluator independence’.  When those funding impact evaluations want the ‘whole truth’, ensuring the evaluator is free to publish all of the findings (not just cherry-picking the good ones) is key.  Beyond the funder, the social sector in general benefits when all of a study’s results are published, peer-reviewed, and the underlying data made available to others.

    Caroline also reinforces the important point that measurement is first and foremost about learning, and secondly that the number one purpose of an impact evaluation is to determine whether (and I would add, why) beneficiaries achieve greater outcomes as a result of the intervention.  The methods for ‘controlling for other influences’ vary from randomized control trial (the most rigorous, but also only relevant for those programs that can be evaluated in this way) to qualitative methods such as elimination of alternative causes (the least rigorous, but sometimes the most that can be done given the intervention at hand).

  • BY Andrea Schneider

    ON March 13, 2012 12:02 PM

    The evaluation paradigm is not congruent with new and innovative methods, such as design thinking in the public sector.  It’s a disconnect. I agree about evaluation being perceived as a “grading” system: “will I get an A or F?”  We haven’t made evaluation friendly or integrated it into our program or service design.
    I think we have to re-design evaluation to be useful to the stakeholders. I have had great success in co-designing and producing evaluation with the people involved. They help us identify the nuances of wins not normally identified, and help us make evaluation relevant.
    Telling those stories goes a long way toward helping people see what they’ve accomplished and the challenges along the way.  We can re-frame evaluation as a positive tool.  These days we are surrounded by so much data that it’s hard to sort out, and by the time we do, it’s over.
    I will say that expectations of immediate change are ridiculous.  When looking for more serious and sustainable change, it has proven quite useful to test 6 and 12 months after implementation.
    We all know change takes time.
    Caroline makes many good points.  I will also add that the “happy sheets” we so often use to “evaluate” an event or workshop are rarely of great use.  Invariably, participants will tell us they liked the time between workshops to talk to others, that the room was too hot or cold, and occasionally they will give us some new ideas.  We rely on them way too much.
    I think the more we can help people tell their own story in legitimate ways, the more ownership we will achieve.  I find that when I integrate evaluation into program/service planning with the stakeholders, I’m more likely to be helpful to their successes.

  • BY Rex S. Green, Ph.D.

    ON April 12, 2012 11:48 AM

    This article covers in a comprehensive way the pitfalls of performing research-like evaluations of human service programs.  What concerns me is that no one seems to be aware of the research literature offering alternatives to the “standard” approaches.  After warning folks of the pitfalls, would it not be appropriate to at least mention some of the articles about directly measuring the productivity of services?  In an effort to move this discussion forward, here is a list of my contributions to this literature.  I can only hope more people will give some other evaluation approaches a try.
    Green, R. S. (2001).  Improving service quality by linking processes to outcomes.  In M. Hernandez & S. Hodges (Eds.), Developing outcome strategies in children’s mental health (pp. 221-238).  Baltimore, MD: Brookes Publishing Co.

    Green, R. S. (2003).  Assessing the productivity of human service programs.  Evaluation and Program Planning, 26(1), 21-27.

    Green, R. S., Ellis, P. T., & Lee, S. S.  (2005).  A city initiative to improve the quality of life for urban youth: How evaluation contributed to effective social programming.  Evaluation and Program Planning, 28(1), 83-94.

    Green, R. S. (2005).  Assessment of service productivity in applied settings: Comparisons with pre- and post-status assessments of client outcome.  Evaluation and Program Planning, 28(2), 139-150.

    Green, R. S. (2005).  Closing the gap in evaluation technology for outcomes monitoring.  Psychiatric Services, 56(5), 611-612.

    Green, R. S., & Ellis, P. T. (2007).  California Group Home Foster Care Performance: Linking Structure and Process to Outcome.  Evaluation and Program Planning, 30(3), 307-317.
