Ensuring That “Scaling What Works” Actually Works

For “scaling what works” to actually work, we need a new and improved version that addresses two fundamental constraints.

If you follow the nonprofit press closely, I imagine that over the past year you’ve been treated to a healthy dose of the “scaling what works” rhetoric. Perhaps it’s gone something like this: “We know what works, and if we could just steer funds to those organizations with superior results, society would be much better off.” The message seems to be that our heads are in the sand, and if we could pull out long enough to see the light, all our problems would be solved.

To be clear, nonprofits that invest in measuring, improving, and ultimately proving their impact should be rewarded, and they should grow. But in order for “scaling what works” to actually work, we need a new and improved version that addresses two fundamental constraints:

“Past performance is no guarantee of future success”
The SEC mandates that mutual fund companies affix this label to their products; nonprofits ought to do the same, for two reasons. First, proving that a model works at one site, in one context, and at one point in time does not guarantee that it will work again. Consider that one recent, high-profile effort to replicate the positive results of 67 high-quality pharmaceutical studies achieved full success only 20 percent of the time. Given the many-fold higher complexity of replicating social interventions in real-life settings, it is hard to imagine a much better success rate in our line of work.

Second, even if nonprofits demonstrate success in multiple sites or settings, few are truly clear about whether their models are replicable. Most evaluation studies devote little, if any, attention to underlying organizational factors (such as culture and leader characteristics) and contextual factors (such as regulatory climate and the presence of high-capacity partners) that play a role in the model’s success. In the absence of understanding the conditions under which a model worked, organizations or funders often require replicators to follow the original model with full fidelity, potentially precluding important adaptations and improvements that could increase the odds of success.

What are we to do? Evaluators and evaluated organizations should devote much more effort to studying and reporting on these underlying factors. As a field, this means recognizing and accepting the value of qualitative assessments in teasing out the essence of a model’s success. It also means tackling, head-on, complex questions around replication, including what characteristics best position outside organizations to successfully replicate a model, what adaptations these organizations should be able to make, and whether there is merit in replicating practices to existing providers, instead of supplanting them.

We may learn, for example, that a youth mentoring organization in Alabama is succeeding primarily because its well-connected and charismatic executive director has secured high-quality corporate mentors, state government grants that enable more programming, and a devoted staff that works double time for the program’s youth. The best path to scale may therefore involve internal replication to new counties in Alabama where the executive director’s relationships are strong, or external replication to organizations in other states with similarly well-connected and charismatic leaders. (For more on replication, see Bridgespan’s “Getting Replication Right”).

Missing the forest for the trees
The “what works” in “scaling what works” is increasingly defined as interventions that achieve statistical proof of their impact, often through well-designed experimental evaluations known as randomized controlled trials (RCTs). But RCTs are impractical in many contexts, so requiring them to earn a “what works” imprimatur leaves out many interventions, such as advocacy and neighborhood revitalization. And even in fields like human services, the primacy of RCT evaluation favors organizations that seek to move short-term indicators (such as improved attendance or getting a job) and penalizes those that aim for longer-term change with clients exposed to highly dynamic environments.

A better approach recognizes and accepts the value of a wider range of evaluative methods. If a skilled and independent evaluator concludes, after deep quantitative and qualitative research, that a certain approach to neighborhood revitalization is having an impact and is replicable, funders should be much more eager than they are today to scale this kind of “what works.” Why? Because complex interventions that achieve longer-term, root-cause change will ultimately have more impact in our society than more straightforward interventions that achieve shorter-term, fragmented change.

It will take a full-team effort to ensure that “scaling what works” actually works. Funders must insist on replicability assessments within the evaluations they fund and consider a wider range of ways to “prove” impact. Organizations with evidence-based models, and those looking to adopt them, must learn to welcome a thorough assessment of those underlying factors that greatly affect the success of replication.

What is your experience in “scaling what works”? Which funders and organizations are taking a pragmatic approach and achieving great results?

And since this is a blog post about improving performance, I hope you’ll consider helping me improve mine: what are the topics within performance measurement that you would be most interested in reading about over the next six months? Please leave a comment!

Read more stories by Matthew Forti.



  • BY Heather Peeler

    ON January 27, 2012 07:56 AM


    I read your post with great interest and identified many parallels to what we are learning and sharing through Scaling What Works, a multiyear learning initiative of Grantmakers for Effective Organizations.

    It’s true that as societal challenges grow and the resources available to address them shrink, the field of philanthropy is focusing on making the most of every dollar by investing in what we know gets better results. Yet, as you aptly pointed out, it is not enough to just know what works; we also need to know how and for whom it works; where, why, and under what conditions it works; and, if effective, how it can be sustained and scaled in order to have stronger impact.

    Grantmakers and the nonprofits they support certainly can benefit from using a wider range of evaluative methods and data to understand what works and grow the impact of effective programs. To that end, a growing number of grantmakers are using evaluation as a learning tool. What are they learning about how to support grantees to engage in meaningful evaluation and pursue multiple ways to reach scale?

    1. Don’t assume one size fits all. The right questions and methods depend on what you seek to learn. Start by understanding what questions need to be answered, then decide what information and methodologies are appropriate and feasible to implement. As a result, “right-size” evaluation efforts will match the size and reach of the grant.
    2. Focus on more than just proof. Grants and other forms of philanthropy often target complex problems or systems that do not lend themselves to easy answers, and most grants are too small or focused to address these issues comprehensively. Sometimes it can be more helpful to use evaluation to measure progress, not just to prove whether something succeeded or failed.
    3. Embrace failure. As we strive to identify what works, it is just as important to understand what doesn’t work and to distinguish the “essential” ingredients from those that are open to adaptation. A grantmaker can put a failed project to good use by capturing lessons about what happened, why the project fell short of expectations, and how to achieve better results in the future. Evaluating as you go also makes it possible to change course along the way if something is not working.
    4. Provide the right support. Investing in the capacity of grantees to use evaluation for learning means that nonprofits will have much-needed resources to evaluate their work and put what is learned into action. Unrestricted, flexible and reliable funding as well as capacity building support are critical.

    You can find some great examples of how grantmakers are approaching evaluation to bolster nonprofits’ efforts to learn, improve and, ultimately, scale their impact at

  • BY Matthew Forti

    ON January 30, 2012 06:21 PM

    Heather - these are great comments, and I would urge readers to follow the powerful ‘Scaling What Works’ series at GEO. Grantmakers have an important role to play in ensuring that scaling what works actually works.

  • Before any organization can scale anything that works, two questions must be answered:

    (1) The Whether? Question: Does the policy, program, system, or strategy work?

    (2) The How? Question: How does the policy, program, system, or strategy work?

    Unfortunately, answering the Whether? Question (which can be done using a variety of evaluative approaches) does not answer the How? question.

    Moreover, the “How” may involve a lot of tacit (as opposed to explicit) knowledge. And tacit knowledge — particularly the tacit knowledge of leadership, which may have been very important to the original “working” — is much, much more difficult to transfer or scale up.

    See “On the responsibility of helping others to learn the Tacit Knowledge of Leadership,” at (May 2007).

  • BY Saras Chung

    ON February 22, 2012 12:43 PM

    I think you have hit upon a very interesting topic that few people have addressed: maintaining fidelity, but at what cost? With the increased emphasis on replicating evidence-based practices, it would be interesting to see what the fidelity thresholds are for each program. Evaluation measures are therefore so much more important from start to finish; there are just too many variables that could confound a program.
