In 2009, the New Teacher Project (TNTP)—a nonprofit organization that promotes effective teaching—released a report titled “The Widget Effect.” The report advocated reforms that would differentiate teacher effectiveness across multiple dimensions and then use that information in teacher development and staffing decisions. The Walton Family Foundation, which we represent, helped to fund the report. Soon after the release of “The Widget Effect,” in June 2009, US Secretary of Education Arne Duncan cited it in a speech. “These policies were created over the past century to protect the rights of teachers, but they have produced an industrial factory model of education that treats all teachers like interchangeable widgets. A recent report from the New Teacher Project found that almost all teachers are rated the same. Who in their right mind really believes that? We need to work together to change this,” Duncan said. That public citation of the report by a major policymaker gave us clear and objective evidence that TNTP was having an impact.
Our work with TNTP and other groups has shown us that it’s possible to measure the effectiveness of advocacy grantees in a reliable way. A formal evaluation strategy that incorporates a logic model and an array of performance measures, we have learned, can be just as relevant to advocacy projects as it is to direct service programs.
In recent years, many philanthropic organizations have broadened the focus of their giving to encompass public policy advocacy as well as traditional service delivery projects. Alongside this trend toward greater investment in advocacy, a debate has emerged over whether, when, and how to evaluate the performance of advocacy organizations. Some contributors to the debate argue that advocacy occurs in such a fluid and complex environment that it isn’t susceptible to evaluation that relies primarily on objective sources of evidence. Steve Teles and Mark Schmitt, for example, argue that because of the “peculiar features of politics, few if any best practices can be identified through the sophisticated methods that have been developed to evaluate the delivery of services. Advocacy evaluation should be seen, therefore, as a form of trained judgment—a craft requiring judgment and tacit knowledge—rather than as a scientific method.” (See “The Elusive Craft of Evaluating Advocacy,” in the summer 2011 issue of SSIR.) Advocacy evaluation, from this point of view, is more an art than it is a science.
Our experience has led us to a different conclusion. We believe that rigorous, evidence-based evaluation of advocacy efforts is feasible and that tools are available to help organizations conduct evaluation of that kind. Far from being a “soft” activity, the work of gauging whether and how advocacy groups have achieved their goals is—or at least can be—a “hard” discipline that draws on independently verifiable data.
An Education in Advocacy
Since 2007, the Walton Family Foundation has made three large grants to TNTP to help place high-quality teachers in high-need settings and to advocate for changes in educational human resources policy. It was under the first of those grants that TNTP produced “The Widget Effect.” At the time, we had not yet developed a formal framework for evaluating the impact of advocacy efforts by grantees. But when we reviewed TNTP’s reporting at the end of the grant, we saw a familiar pattern emerge. We recognized that it would have been possible, before funding the grant, to work with TNTP to map out a logic model that would connect its advocacy outputs (the activities that it conducted) to its outcomes (the anticipated changes in the world). Similarly, it would have been possible to establish performance metrics that rely on objective sources of evidence.
When the foundation made a second grant to TNTP in 2009, we developed a clear set of expectations for the activities that TNTP would conduct and the results that its leaders expected to achieve. In collaboration with those leaders, we established specific performance measures that communicated both a vision of success and a model for measuring that success. By the time the foundation made a third grant in 2011, TNTP was able to articulate a chain of cause and effect that tied its advocacy outputs to its expected outcomes. With each of these grants, we learned progressively more about how to establish performance measures for advocacy work.
The logic model used for the 2011 grant was fairly straightforward: Within a certain timeframe, TNTP would create and release a nonpartisan research report on a teacher quality challenge. (The title of this second report was “The Irreplaceables.”) In the report, TNTP would present evidence about the state of a particular policy area and then draw conclusions on the basis of that evidence. In the short term, we expected the report to raise awareness of certain policy issues among the broader public, as measured by media coverage of the study and its findings (Outcome 1). Then, over time, we expected to see evidence that people were actively engaging with the report, as measured by the number of times that visitors downloaded the document from the TNTP website (Outcome 2). As before, TNTP issued its report and held briefings with top officials in the Obama administration. And, as before, Duncan publicly cited the report. Those activities helped to demonstrate that TNTP had achieved the outcomes we sought.
The story of our work with TNTP shows that it’s both feasible and beneficial for a grantee to establish a theory of change that details how specific activities (creating and disseminating a policy report) will create an intended impact (influencing policymaker opinion and, ultimately, influencing policy choices). It also shows that it’s possible to base evaluation of advocacy organizations on independently verifiable evidence.
Our work with advocacy grantees has also yielded an important lesson about the need for flexibility. When people at TNTP started work on the Widget Effect project, for example, they hoped to answer a basic question: “Why are poorly performing teachers so rarely replaced?” As they followed the data, members of the TNTP team found that asking that question revealed a more systemic problem with traditional forms of teacher evaluation. That discovery led them to change the focus of the project, the intended audience for the report, and ultimately the policy changes that they would recommend. The ability to change a plan is critical to ensuring that grantees can be responsive to shifts in their environment. For that reason, a performance measurement system should include easy and unobtrusive ways for grantees to revise their plans mid-course. In most instances, however, the core logic model will remain the same; only the details of a plan will undergo change.
A Model for Evaluation
At the Walton Family Foundation, we have evaluated the performance of scores of advocacy grantees. Our approach reflects an interest in balancing a desire to collect the best available evidence with the need to limit disruption of our grantees’ work. In nearly all cases, we have seen that traditional social-scientific evaluation techniques—development of a logic model, measurement of outputs and outcomes, use of objective data sources—are well suited to the evaluation of investments in advocacy.
We start by requiring each advocacy project to begin with an explicit logic model that aligns what a grantee intends to do with the results that the grantee intends to achieve. Setting benchmarks of this kind creates a basis for rigorous evaluation at the end of the grant. Every project, we believe—whether it’s a traditional service delivery program or an advocacy campaign—must start with a plan. The plan may evolve, but every project must include a clear statement both of how it will effect change (outputs) and of what that change will look like (outcomes).
Next, grantees translate that statement of outputs and outcomes into a set of performance measures, each of which must answer five essential questions:
- Who will achieve a given change or accomplish a given task?
- What will change or be accomplished through that effort?
- When will the change or accomplishment occur?
- How much change will occur, or what will the level of accomplishment be?
- How do we know that the change or the accomplishment has occurred?
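For readers who track measures in a spreadsheet or database, the five questions above can be read as the fields of a single record. The following is only an illustrative sketch: the class name, the field names, and every value in the example are hypothetical placeholders, not the foundation's actual template or any real grant target.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PerformanceMeasure:
    """One measure, with one field per question (all names are hypothetical)."""
    who: str         # who will achieve the change or accomplish the task
    what: str        # what will change or be accomplished
    when: date       # date by which the change should occur
    how_much: float  # target level of change or accomplishment
    evidence: str    # objective source showing the change occurred

    def is_complete(self) -> bool:
        # A measure is usable only if every question has a non-empty answer.
        texts = (self.who, self.what, self.evidence)
        return all(t.strip() for t in texts) and self.how_much > 0

    def is_met(self, observed: float, observed_on: date) -> bool:
        # Met when the observed level reaches the target by the deadline.
        return observed_on <= self.when and observed >= self.how_much


# Placeholder values for illustration only; not drawn from any actual grant.
measure = PerformanceMeasure(
    who="Grantee research team",
    what="Downloads of the policy report from the grantee website",
    when=date(2012, 12, 31),
    how_much=10_000,
    evidence="Website download logs",
)
```

Keeping the evidence source as a required field reflects the emphasis on independently verifiable data: a measure with no stated source fails the completeness check before any results are reported.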
Through a formal metric amendment process, our model builds in enough flexibility to allow grantees to alter their goals mid-course. In addition, we include a narrative component in our reporting requirements so that grantees can capture qualitative information about contextual factors. But we prioritize the data that we collect through the performance measurement process, because that information is the most reliable. The narrative material helps us interpret findings, but it does not directly influence those findings.
At the end of a grant period, an advocacy grantee must collect and report on the array of evidence that will allow us to evaluate its performance against its logic model. Grantee reports, in our view, are essential both to the formal evaluation process and to a grantee’s internal performance assessment. These reports contribute to the grantee’s ongoing process of measuring how well it is implementing a given theory of change. In that way, evaluation not only supports grantee accountability and helps funders to increase the return on their investment, but also enables organizational learning.