Rigorous performance measurement has become the Holy Grail of the social sector: It’s a lofty goal that inspires a noble yet meandering journey—a journey that features many false paths, and very few signs that might tell seekers whether they are making any progress. And the seekers face conflicting incentives that make the journey still more difficult. (Grantees and funders, in other words, don’t even agree on what the object of their quest looks like.) There are notable exceptions, of course. But our overall assessment of the situation as it stands today is a variation on an old joke: Nonprofits pretend to measure impact; funders pretend to believe them.

In conducting due diligence for the selection of the Henry R. Kravis Prize in Leadership, we have observed how rare it is for organizations to obtain substantive data on whether their interventions actually work. More than 75 percent of the 800-plus nonprofits that we have researched over the past nine years do not have impact data that one could deem reliable. In our view, too many nonprofits still fail to develop rigorous performance measures, and too many funders not only fail to demand clear measurement but also decline to pay for it.


When nonprofits do attempt to measure performance, they tend to become preoccupied with metrics that demonstrate how busy their staff members are (the number of activities conducted, the number of people reached, and so on), and they give short shrift to more relevant metrics that indicate whether their programs are actually improving people’s lives. In general, there is a tendency to ignore the wisdom of Einstein’s dictum: “Not everything that counts can be counted, and not everything that can be counted counts.”

In a social sector that lacks a market-like mechanism to separate the wheat of effective intervention from the chaff of mere good intentions, performance measurement is essential. Below are three broad principles that nonprofit leaders should follow in gauging whether and how their organization is making an impact.

Be Brave

Nonprofit leaders commonly claim that they don’t have the money to invest in impact evaluations—that they must devote their scarce resources to programs that help beneficiaries. In our view, if a nonprofit can’t afford to conduct an impact evaluation, then it’s not ready to scale up in a significant way: If you can’t demonstrate that your logic model works, why should anyone fund you? We find complaints about an excessive focus on evaluation to be largely a smoke screen.

Several Kravis Prize winners provide notable exceptions to the general pattern of measurement avoidance. These organizations stand out for their willingness to accept the costs—and to embrace the risks—that impact evaluation often entails. Since 1975 (just three years after its founding), BRAC has been a trailblazer in conducting rigorous evaluations. The organization “invested very early in creating internal evaluation capacity, and that has continued to be a priority over the past 30 years,” says Kravis Prize recipient Sir Fazle Hasan Abed, founder and chairman of BRAC. The organization maintains a research website that features more than 1,000 publications related to the evaluation of BRAC programs, and Abed notes that BRAC’s research and evaluation department has become “a particularly strong place to prepare for senior leadership.” (The current vice chair of BRAC, for instance, came from that department.)

Abed and his team also understand that nonprofits should use performance measurement not as a one-off exercise to appease funders, but as an essential management tool. “Many at BRAC feel that there is too much evaluation work being imposed by funders, and not enough originating with nonprofits that want to improve. Everyone should be asking: Impact evaluation for what end? Funders and their grantees need to shun gimmicks,” says Susan Davis, president and CEO of BRAC USA.

BRAC has a distinctively “failure-focused” approach to using evaluation as a management tool. By regularly identifying points of difficulty, the organization is able to adjust its programs continuously. In 1979, for instance, BRAC launched an oral rehydration program to treat diarrhea, a leading cause of death in children under the age of five. During its initial phase, the program was not meeting its goals, and an evaluation identified a host of challenges. Health workers, for instance, weren’t using the program methods at home with their own children—a clear sign of a more systemic problem. BRAC brought in an anthropologist who discovered that there was an underlying gender issue: BRAC hadn’t persuaded men in the program’s target households to use the treatment. The evaluation process also led BRAC to develop an incentive payment structure for health workers who promoted the oral rehydration therapy. “The program became enormously successful mainly due to continuous monitoring and evaluation of program effectiveness,” says Abed.

Be Rigorous

Scholars, practitioners, and others have made enormous progress in analyzing the conceptual underpinnings of rigorous performance measurement. (For a good summary of that development, see “Advancing Evaluation Practices in Philanthropy,” a series of articles published as a special supplement to Stanford Social Innovation Review.) The key concepts are as straightforward as they are powerful: Start with a mission-focused theory of change. Outline a logic model that shows a clear connection between your intervention and your desired outcome. Recognize that an analysis of costs and benefits lies at the core of any viable measurement methodology. And apply the lesson that development economists took decades to learn about measuring approaches to poverty alleviation: Prioritize micro over macro.
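
To make the cost-benefit point concrete, consider a deliberately simplified sketch. All of the figures below are invented for illustration, and “additional readers” stands in for whatever outcome a real evaluation would estimate:

```python
# Hypothetical cost-per-outcome arithmetic. All figures are invented;
# a real analysis would use audited costs and an impact estimate
# drawn from a rigorous evaluation.
program_cost = 500_000     # total annual spending, in dollars
children_served = 10_000   # children reached by the program
effect_rate = 0.04         # share of served children who learn to read
                           # *because of* the program, net of the
                           # counterfactual (what would have happened anyway)

additional_readers = children_served * effect_rate   # 400 children
cost_per_outcome = program_cost / additional_readers

print(f"Cost per child served:      ${program_cost / children_served:,.0f}")
print(f"Cost per additional reader: ${cost_per_outcome:,.0f}")
# Output: $50 per child served, but $1,250 per additional reader --
# the 25x gap between an activity metric and an impact metric.
```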

Randomized controlled trials (RCTs) are the gold standard of evaluation methodologies. They not only incorporate all of those key attributes but also address the counterfactual—by demonstrating what happens in the absence of an intervention. Yet many nonprofit leaders are reluctant to embrace randomized evaluations. The RCT process, they say, is expensive and time-consuming; it can compromise a nonprofit’s ability to control its own programs; and it can turn an unwelcome spotlight on instances of failure. (We acknowledge, by the way, that RCTs aren’t the appropriate evaluation tool for every kind of program. Our main point here is that use of RCTs is much less common than it should be.)
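
The counterfactual logic is easy to see in miniature. The sketch below is purely illustrative: the data are simulated, and the sample size and effect size are invented. Random assignment makes the control group a stand-in for what would have happened without the intervention, so the estimated impact is simply the difference in mean outcomes between the two groups:

```python
import math
import random
import statistics

random.seed(42)

# Hypothetical RCT: 200 students are randomly assigned either to a
# tutoring program (treatment) or to no program (control).
students = list(range(200))
random.shuffle(students)
treatment_ids = set(students[:100])

def simulated_test_score(student_id):
    # Stand-in for real outcome data; we assume the program raises
    # scores by about 5 points on average.
    return random.gauss(60, 10) + (5 if student_id in treatment_ids else 0)

scores = {sid: simulated_test_score(sid) for sid in students}
treated = [scores[sid] for sid in students if sid in treatment_ids]
control = [scores[sid] for sid in students if sid not in treatment_ids]

# Because assignment was random, the control mean approximates the
# counterfactual, and the difference in means estimates the impact.
effect = statistics.mean(treated) - statistics.mean(control)
se = math.sqrt(statistics.variance(treated) / len(treated)
               + statistics.variance(control) / len(control))

print(f"Estimated effect: {effect:.1f} points "
      f"(95% CI: {effect - 1.96*se:.1f} to {effect + 1.96*se:.1f})")
```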

Pratham, another Kravis Prize recipient, has overcome those challenges, and today it uses randomized evaluations to transform its operations for the better. Over the past 12 years, the organization has completed 11 RCTs. “Randomized controlled trials have been tremendously helpful in allowing us to zoom in on the strategy that works, and to change the model when it didn’t work,” says Madhav Chavan, founder and CEO of Pratham. By way of example, he cites a Pratham program in India that uses volunteers to help teach children to read. “We had previously thought that volunteers by themselves tutoring kids after school would make a difference. But when you look at the change in learning profiles for the kids, fully relying on volunteers does not work,” Chavan explains. “We had suspected this was true, but once there was data, we acted. We changed our model completely based on these studies.”

In addition to enabling improvements in program strategy, RCTs can bring great benefits across a nonprofit by instilling a “measurement mindset,” as Pratham leaders call it. “The RCT process is expensive, but the value is enormous because it builds internal capacity,” Chavan argues. “After we started doing the RCTs, our entire organization started understanding data much better, and we acquired down the line a better understanding of how to think of impact.”

Chavan and his team have successfully navigated the control-related challenges that come with the RCT process. He recalls working with evaluators from the Abdul Latif Jameel Poverty Action Lab (J-PAL) at MIT: “When we were first approached by J-PAL, an evaluator looked me sternly in the eye and said, ‘Madhav, if you say yes, remember that I will take control of your organization!’ When we broke for lunch, one of my colleagues said to me, ‘Do we really want to get into this? Who is this guy? The university people, what do they know?’ But we said, ‘Let’s take the risk.’ You have to trust them. You have to let the evaluators take over to prevent data contamination. If you start by doubting your evaluators, you are finished before you start.”

Whether a nonprofit undertakes an RCT or another type of evaluation, it should consider having an independent third party conduct the evaluation. Kravis Prize recipient Vicky Colbert, founder and executive director of Escuela Nueva, notes that her organization has benefited significantly from 12 external evaluations that it has undergone over the past two decades. “As Escuela Nueva has grown over the years, data and analysis from external sources have been extremely helpful in paving the way for us to scale,” Colbert observes.

To cite one example: A research initiative led by Patrick McEwan, a faculty member at the Stanford University School of Education, evaluated the effectiveness of an Escuela Nueva (EN) program in Colombia. McEwan found that the program had a positive and statistically significant effect on Spanish and mathematics achievement among third-grade students and on Spanish achievement among fifth-grade students. Yet his research also uncovered troubling variations in program implementation from one school to another: Fewer than half of the schools in the program were using official EN textbooks, and one-third of those schools did not have libraries. Colbert and her team used that information to adapt the EN program. “Stakeholders, both internal and external, are less likely to dispute [program] changes when they are a response to real data,” she says. “The improvements that we made, coupled with the many evaluations that demonstrated that our approach really worked, gave us momentum to scale.”

Be Strategic

Effective evaluation enables an organization to exert influence across the nonprofit sector and to generate momentum for a particular type of intervention. The 2014 Kravis Prize recipient, Helen Keller International (HKI), treats the evaluation process as a way not only to refine its own programs, but also to inform and shape the work of others. In just the past year, CEO Kathy Spahn notes, HKI programs have been the subject of 32 evaluations (many of them conducted by external evaluators). “Whether the data demonstrates success or highlights challenges, we share lessons learned directly with our partners,” Spahn explains. “We also publish our results in peer-reviewed journals and other publications.” Andrew Fisher, executive director of the Lavelle Fund for the Blind, a longtime funder of HKI, praises the organization’s sector-wide impact: “We have made some of the largest grants in our history to HKI in part because HKI is a leader in its field, known not only for effective implementation but also for the rigorous evaluations that it disseminates throughout the eye-care field.” HKI, for instance, was a pioneer in using post-event coverage (PEC) surveys to verify the reach of its vitamin A supplementation program and to determine how best to target hard-to-reach populations. The organization then shared its findings with partner groups such as UNICEF and the Micronutrient Initiative.

In another example, HKI’s Homestead Food Production program in Bangladesh underwent both an internal evaluation and an external evaluation conducted by the International Food Policy Research Institute. On the basis of those evaluations, the program became a featured best practice in a high-profile review called “Millions Fed: Proven Successes in Agricultural Development.” This review, funded by the Bill & Melinda Gates Foundation and developed in response to the 2008 food crisis, greatly heightened interest in Homestead Food Production and led others to adopt similar approaches.

Leaders at BRAC have made a similar commitment to sharing lessons that emerge from their evaluation efforts. After an impact evaluation revealed that a BRAC microfinance program was not reaching the poorest people in its target population, for example, BRAC developed a new program called Targeting the Ultra-Poor. From the start, BRAC approached evaluation of this new program with the goal of sharing lessons with the nonprofit sector as a whole. To showcase program results, the organization helped form a broad community of practice that includes the Consultative Group to Assist the Poor, the Ford Foundation, the London School of Economics, and Innovations for Poverty Action. “BRAC has compelling evidence that not only guides our own work but also influences others to invest in what works to eradicate extreme poverty,” Abed says.
