Measurement & Evaluation

The Problem With The Impact Genome Project

Would the Impact Genome Project have predicted the impact of Martin Luther King?

The Impact Genome Project plans to use "big data and genomic analysis to measure, predict, and improve the outcomes of social programs." It's a worthy goal: All of us involved in social change would love the power to predict outcomes, particularly when we're on the hook to funders for those results.

What may get lost, however, are the programs that matter most in the long run.

Sure, we can use data from prior bed-net distribution campaigns to predict whether a new bed-net campaign will successfully increase the number of bed net users (and thus decrease the incidence of malaria). We may even be able to use data from bed-net distribution campaigns to predict, albeit with less certainty, the outcome of a condom distribution campaign. Given how much government and philanthropic funding goes toward aid programs, that's important information.

But such programs represent only a small subset of the changes our world needs. Sometimes, yes, we need more people to use bed nets. Other times, we need laws and courts that protect journalists, appropriate police investigation and prosecution of gender-based crimes, an adequate system of contract enforcement so entrepreneurs can flourish, or an end to the internment of refugees in camps—a massive waste of human capital and a blatant violation of refugees' rights to work.

Can the Impact Genome Project predict the success of initiatives designed to achieve these ends? Would the Impact Genome Project have predicted Dr. Martin Luther King Jr.'s success or Harvey Milk's influence on marriage equality?

Even with bed nets, we probably don't see distribution by aid agencies as a sustainable model, long-term. Instead, we want policies that make bed nets a local imperative: government incentives for local production of bed nets, public health advertising encouraging bed net use, and a strong science curriculum in schools so that students—the next generation of parents, citizens, and leaders—understand why bed nets matter.

Even for bed nets, we ultimately want a new policy framework. Without it, social change is unsustainable.

If the Impact Genome Project can accurately predict programs that yield policy change, anywhere in the world, I'll be thrilled. I'll also be surprised. The factors that influence success are highly varied and often have little or nothing to do with the policy in question.

My organization’s Thai affiliate, Asylum Access Thailand, for example, began work in 2010 aimed at getting a historic temporary asylum law on the books in Thailand. If passed, the law—technically an amendment to Thailand’s existing Immigration Act—would mark the first time Thailand granted any legal status to refugees. Could the Impact Genome Project have predicted that, just as the draft law was ready for introduction to Parliament in 2014, one party would boycott elections and leave the country without a functioning government? Or could it have predicted that Asylum Access Tanzania's 2012 letter to the UN Special Rapporteur on Migrants, which initially soured our relationship with the Tanzanian government, would ultimately catalyze improvements in the government's approach to refugees when a new Minister of Home Affairs was appointed?

More likely (in my admittedly non-data-driven view), the Impact Genome Project will reinforce many funders' preferences for programs that yield immediate and easily measurable returns.

The Impact Genome Project's creators, Jason Saul and Nolan Gasser, liken their efforts to those used in the financial world. They're right—and that's the problem. Lenders prefer those whose credit-worthiness is easily measurable—even those with a history of nonpayment—over foreigners, young people, and others whose credit-worthiness may be excellent but whose credit history is unknown. Financial markets prefer companies with recurring quarterly profits over those that incur losses as they build a foundation for greater gains long-term. The Impact Genome Project similarly incentivizes a focus on short-term results.

At the Skoll World Forum last month, Saul said he hopes that the Impact Genome Project will lead to a "social capital market," where outcomes are bought and sold. The problem is, the most sustainable and effective social change won't sell well.

If we're serious about effective social change, we need more incentives for funders to support efforts to shift policy frameworks. By developing and promoting the Impact Genome Project, Saul and Gasser create a countervailing incentive, pushing funders toward direct aid models. And by pushing for a social capital market, they replicate, rather than correct, the inefficiencies of existing capital markets.


  • BY George McCully

    ON May 29, 2014 01:31 PM

    Excellent piece, Emily. Historians know that predictions of the future are rarely, and then only accidentally, correct, because history and historical causation are far more complex than the predictors’ minds and methods. The probability of error increases with the length of the time frame; the width of the geographic, economic, political, and other references; and the depth of the cultural, psychological, and personal ones. Thus only the simplest predictions can be made reliably, on the basis of the simplest data, which limits the utility of impact prediction to relatively trivial subjects.

  • BY nikolaos moropoulos

    ON May 30, 2014 03:32 AM

    It appears that the Impact Genome Project is using a paradigm that is restricted in its reach and application. The fascination with big data and the appeal of “universality” are unquestionable. However, social policies, initiatives, and enterprises deal with people and are put together by people. It is a complex world out there, and the fact that we can use big data to model retail buyers’ behavior does not mean that we can do the same with social policies, especially the ones that break new ground. When a successful policy actually innovates and improves the social fabric and the lives of the people in it, big data cannot help. If it could, why would “innovation,” as a process and an outcome, remain such a big unknown today?

  • BY Andrew Means

    ON May 30, 2014 06:48 AM

    I appreciate your concern regarding the Impact Genome Project. I think you’re right, it won’t predict the kind of policy change that you are talking about. But the vast majority of policy implementation is done through funding interventions. We currently don’t do a very good job of that. If the job of philanthropy is to solve problems, as I believe it is, we have a very poor feedback system because we don’t seem to know which interventions are working at solving the problems we face.

    Would the Impact Genome Project have predicted MLK? Of course not. But could data-driven strategies have helped MLK improve the way he rallied people to his cause, served the poor, and addressed the problems of his community? Yes.

    We shouldn’t make data a fetish but we also shouldn’t push it away just because it might become one for a while. We need to use data for what it is good at and understand its limitations.

    Our world would be a better place if we got even 25% better at putting money into effective interventions. Data can help us do that. It’s not the magic bullet and it can’t solve everything, but it is a powerful tool we must not run away from. For its flaws, the Impact Genome Project is at least moving us in that direction.

  • BY George McCully

    ON May 30, 2014 06:57 AM

    Good and useful comment, Andrew. Is it currently instituted that the IGP’s methodology also be applied to evaluating the impacts of its own initiatives—in the accuracy of its own predictions, the influences of those predictions on program and project designs, and the impacts of those resulting programs and projects in achieving their goals?

  • BY Andrew Means

    ON May 30, 2014 07:10 AM

    Very good question, George. My understanding of the IGP (I’m familiar with it, but not working on developing it) is that it doesn’t actually predict outcomes per se; it benchmarks them. Basically, it says that programs like yours tend to have these outcomes. The times I have heard the team talk about the project, they are always talking about benchmarking. But again, I’m a third party watching this.

    So it seems, methodologically speaking, that what they are doing is dissecting types of interventions, taking the research that does exist on those interventions, and extrapolating. Think of it like Amazon’s “People Like You Also Bought” feature, but as “Programs Like Yours Have These Outcomes.”

    So it’s similar to prediction, but I’d actually place it in a different category.

    Philosophically though, I completely agree that firms interested in partnering with funders and nonprofits in this kind of way should absolutely have their feet held to the fire. If I’m a hedge fund manager and my predictions prove to be wrong year in and year out, hopefully I don’t get any more money (in a perfect world!). The same should be true here. If an organization claims to build accurate models but can’t deliver, we should really question the value it brings to the market.
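    The “Programs Like Yours Have These Outcomes” benchmarking described above can be sketched as a simple nearest-neighbor lookup. This is only an illustration of the idea, not the IGP’s actual method; the program names, feature vectors, and outcome rates below are all invented:

```python
# Toy benchmark: report the average historical outcome rate of the k
# programs most similar to yours. All data here is hypothetical.
import math

# (program, features: (dosage hrs/week, duration weeks, class size), outcome rate)
PROGRAMS = [
    ("A", (2.0, 12, 15), 0.40),
    ("B", (2.5, 10, 20), 0.42),
    ("C", (8.0, 30, 10), 0.71),
    ("D", (7.5, 28, 12), 0.68),
    ("E", (4.0, 20, 25), 0.55),
]

def benchmark(features, k=2):
    """Average outcome rate of the k nearest programs (Euclidean distance)."""
    nearest = sorted(PROGRAMS, key=lambda p: math.dist(features, p[1]))[:k]
    return sum(rate for _, _, rate in nearest) / k

# A new program resembling C and D benchmarks near their rates.
print(round(benchmark((7.8, 29, 11)), 3))  # → 0.695
```

    The transparency question raised earlier in the thread maps directly onto this sketch: the benchmark is only as meaningful as the choice of which programs land in the “nearest” bucket and the quality of their recorded outcome rates.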

  • BY George McCully

    ON May 30, 2014 07:23 AM

    Thanks, Andrew—very illuminating and helpful.

  • BY Robert H.

    ON May 30, 2014 11:55 AM

    This sounds like another good idea that has been met with little investigation. It’s questionable how people outside of the field could have found a solution that brilliant minds dedicated to this work have been unable to find. This is not a unique idea; the problem is that it has never been executed properly.

  • BY Clint Dempsey

    ON May 30, 2014 12:15 PM

    Excellent, well constructed piece. Metal Gear Solid’s extensive Genome data is highly recommended reading and/or viewing. Fascinating stuff.

  • BY Emily E. Arnold-Fernandez

    ON May 30, 2014 04:37 PM

    Thanks for all the comments! 

    Andrew, I think you’re absolutely right when you say, “We need to use data for what it is good at and understand its limitations.” IGP certainly has some important uses; what concerns me most is the idea that IGP should be the foundation of a social capital market, because this implies that all social change interventions can be accurately benchmarked and compared. 

    On a more micro level, I don’t have a problem with a tool that says “Programs Like Yours Have These Outcomes,” but I think it’s critical there be transparency about what programs are included in the “Like Yours” bucket.  Among other things, such transparency creates space for useful dialogue between funders and NGOs about how a particular type of intervention could multiply its effectiveness with only a small adjustment.  Think of the lightbulb: The IGP would’ve told funders it was an ineffective means of lighting a room until Edison coiled the filament and eliminated oxygen to make it last 100x longer.

    I haven’t seen such transparency for IGP yet.  (My understanding is that in the private sector, the contents of the “Like Yours” bucket typically aren’t revealed for fear someone else will reverse-engineer the algorithm.)  If IGP will allow access to details about the programs included in benchmarking categories, that may be its most valuable contribution to the sector: It’s hard for an NGO to move from doing Intervention A to doing a totally different Intervention B, but if Intervention A is 10x more effective in, say, grades 5-7 than in grades 6-8, that’s an easy and important change.

  • BY Jason Saul

    ON June 2, 2014 08:27 AM

    A robust discussion, and a good one.  As the creator of the Impact Genome Project and the CEO of Mission Measurement, I thought I could clarify a few points that might advance the conversation:
    • The purpose of the Impact Genome Project (IGP) is to systematically measure program efficacy.  Efficacy is defined as the likelihood that a participant will obtain a positive outcome through the program.  The IGP is not intended to predict future events or prescribe strategies for social change.  Rather it is intended to codify the research-base so that we can understand which factors and program design elements are likely to produce higher efficacy rates.  Instead of reinventing measurement for every program in the world, we can now systematically learn, improve and predict outcomes with greater confidence. 
    • The IGP measures program efficacy based on two sources of data: 1) the individual program’s historical track-record of performance; and 2) a genomic analysis of key programmatic factors that are proven to correlate to efficacy. 
    • It is true, as Emily and others point out, that measuring policy change or other more systemic and ethereal outcomes is very difficult.  And it is also true that programs that aim at achieving these goals are worthy and important.  However, in my experience evaluating social impact programs for governments, foundations and donors, the majority of programs aim at more tangible, direct-service outcomes that can be normalized and should be measured. 
    • Even those programs that seek to influence policy, change laws or create systemic incentives, typically have reasonable outcomes that can be measured along the way: mobilizing stakeholders, raising visibility of an issue, raising funds to support an issue, etc.  We may not have been able to predict MLK day, but we certainly could have predicted the reach and impact of many civil rights programs. 
    • Finally, I would add that the importance of social science research is only growing.  We can’t keep guessing at what works – spraying money around and then hoping that we get some outcomes.  Evidence-based initiatives like the IGP thoughtfully and rigorously seek to document what we’ve learned and use data to help funders and government make more informed decisions.  Our best bet is not to resist measurement or accountability, but rather to find smart ways to make measurement more valuable and accessible to practitioners. 

  • BY George McCully

    ON June 2, 2014 09:00 AM

    Thanks, Jason, for this very helpful comment. Of course, efforts to find systematic ways to measure, and on that basis to project, future impacts of human-services interventions have been hardy perennials in philanthropy and public programs for decades, without notable success, owing to the variety, complexity, and elusiveness of the subjects and the relative simplicity of the minds and methods addressing them. What we have learned, generally, is that programs have many unique features that often influence outcomes, rendering systematic generalizations so far impractical.

    I’m sure we all wish your initiative all the best. It would help us to know that your project is fully appreciative of the historic and intrinsic difficulties you are facing, and what precisely is so different in this attempt that holds significantly greater promise of success.

    Two phrases in your comment give me pause: 1) that “program efficacy…defined as the likelihood that a participant will obtain a positive outcome”—efficacy is not a likelihood of success; it is the success itself; and 2) “that the importance of social science research is only growing.” “Importance” is very different from “practical success”; the efficacy we all seek is the latter, not the former, and the latter, as informed by social science, has not generally succeeded, which is why your innovations might be extremely valuable.

    But as I say, thanks for your help and we wish you all the best.

  • BY Emily E. Arnold-Fernandez

    ON June 2, 2014 09:41 AM

    Jason, thanks for weighing in!  I agree with what I hear you and Andrew saying: that we should use data where and when we can to make better use of our philanthropic resources.

    My concern is that philanthropists will be led to invest more in direct-service outcomes precisely because those are most easily measured and normalized, and correspondingly to neglect the types of systemic change that could reduce the need for direct-service interventions. The more we engage the philanthropic community in the discussion about where and when data is useful, the better we’ll all understand its limitations (although we may not always agree on what those limitations are!). I look forward to continuing to follow the development of the IGP and to seeing where that discussion takes us.

    I’m curious, though, Jason (per my earlier comment): Does IGP plan to be transparent about the details of programs that are included in the genomic analysis, and/or about what IGP selected as “key programmatic factors” and why?  Or will that be proprietary?

  • BY Jason Saul

    ON June 5, 2014 03:12 PM

    George, I agree that measuring social impact is difficult.  But it is not impossible, and it can be systematized.  I have spent the past 20 years focused on just that problem.  What we have learned, and what the research does tell us, is the following: 1) There are common outcomes across all of the millions of social programs that exist – only 1,100 different types of social programs, based on NPC codes, and only 132 common types of outcomes, based on our research – so we know we can standardize outcomes.  2) We can also standardize, and codify, the factors that produce those outcomes.  The IGP codifies programmatic factors – precisely the “unique features” that you allude to – and correlates those factors to efficacy based on empirical data. Most measurement efforts have tried to use strategies to predict efficacy, and as you point out, these strategies vary widely and can rarely be reproduced with fidelity.  Our approach is different: we don’t prescribe strategies.  We use factor modeling – quantifying a vast number of factors that statistically correlate to outcomes and then testing the congruence between certain program strategies and that fact base.  It’s sophisticated regression modeling.  So while yes, systematic generalizations of programs have proven impractical, systematic generalizations of the underlying explanatory variables for certain outcomes have proven quite practical, and surprisingly consistent over time.  For example, in the field of positive youth development, we know that certain factors – related to dosage, frequency, duration, school integration, curricular content, parental involvement, and so on – absolutely correlate to higher efficacy rates in programs.  We don’t need to reinvent that formula every time we come up with a new PYD initiative.  We can codify the research to date and use that data as directionally informative for future programming and for predictive analytics.

    And Emily, while there may be tendencies among some to measure the most tangible outcomes and ignore the intangible ones, I do not believe that is a reason to hold ourselves back from advancing the field of measurement.  These are two different issues: one is about creating better data and the other is about how people use that data. I also believe that many philanthropists and legislators- certainly the ones I know – have great thoughtfulness and appreciation for the full range of social outcomes, and I don’t expect that making it easy to measure certain outcomes will deter them from focusing on the ones that they believe are most important to advancing social change. 

    Finally, one of the things I think will be most beneficial, is that the IGP will be fully transparent about the factors that drive efficacy, and when we run “genomic imprints” or scorecards of different programs currently, we disclose the factors on which the analysis is based, and the underlying research that supports the inclusion of those factors. 

    Thanks again for the discussion. We’re still developing the project, and I’d be happy for you to learn about our progress by visiting Mission Measurement’s website, following us on Twitter (@missionmeasure), or subscribing to our email newsletter.

  • These academic discussions about what efficacy is (or isn’t) and what MM’s “algorithms” can (or cannot) predict are all well and good.  But my read is that the REAL problem with the Impact Genome Project is (a) much more basic and technical in nature, and (b) that the consequences of being wrong, as a result of its technical limitations, are severe. I am in agreement with the comments above that raise issues of transparency and accountability, and with funders’ need to “hold the feet” of firms like MM “to the fire.”

    My specific concern is that MM may very well lack credible outcome data on which its so-called predictive modeling is based. In one article, Saul states that MM has “carefully documented more than 78,000 outcome data points (78,369 to be exact) from more than 5,800 social programs.”  Impressive numbers, but how were these defined and determined, and by whom?  In other words, where did they come from, and why do you—and therefore, should we—believe in them? Have they been independently verified and supported? Can they ever be? Until the credibility and integrity of MM data—in particular, what they call “outcome data”—are independently and publicly demonstrated, we have to assume that this is all built upon a house of cards. In other words, to the extent that IGP outcome data (i.e., data indicating that a program or initiative is “successful” or “not successful”) are based on flawed or low-quality studies conducted by various and sundry (and unknown) researchers and evaluators, they ought not be included in an aggregate used to tell us something as important as how to leverage and optimize the impact of social change.

    I’m therefore not at all confident that a model that predicts something as inconsequential and frivolous as the next song a listener may enjoy serves credibly—and ethically—given the severe consequences of being wrong. If Pandora chooses, for example, a Bob Dylan tune for me because its algorithm predicts that I will like it (I don’t) because I am now listening to Johnny Cash (related because they both play guitar?), I may get annoyed, but there was no real risk of or consequences for “being wrong.”  So I hit the “thumbs down” button and go on to the next “personalized” selection. No big deal. If, on the other hand, a model purports to assess the efficacy of a social change initiative that requires substantial, precious fiscal and human resources to affect the lives and welfare of say, children, then it better damn well be based on good and independently verifiable data.

    The MM pitch so far has been able to skirt questions of data integrity, or say, “believability.”  And let’s be honest:  MM is – or, eventually will be—selling something. Caveat emptor.
