Randomistas: How radical researchers are changing our world

Andrew Leigh

288 pages, Yale University Press, 2018

Andrew Leigh is a distinguished economist, member of the Australian Parliament, and—as evident in his latest book, Randomistas: How radical researchers are changing our world—a grand storyteller. He begins with the gripping tale of how ship surgeon James Lind, in 1747, discovered that citrus fruit could prevent thousands of seafaring men from dying of scurvy. Leigh engages the reader with his lively descriptions of an impressive series of triumphs accomplished by a single research methodology: randomized controlled trials (RCTs).

In what one British reviewer has called “a jolly history of the experimental in economics and social science,” Leigh lauds the contribution that randomized trials have made to the understanding of social programs, “one coin toss at a time.” He seeks to rescue the “randomistas,” the nickname given to the radical researchers who use experiments and data to overturn conventional wisdom, from the opprobrium directed at their research by critics such as the economist Sir Angus Deaton, who has criticized the randomistas’ monotheism as myopic and a product of magical thinking.

Leigh and his fellow randomistas promote this one method as superior to all other means of obtaining reliable knowledge; he calls the alternatives to RCTs “opinions, prejudices, anecdotes and weak data,” and other ways of “privileging hunches over facts.” He believes RCTs will keep us from funding and spreading what doesn’t work—such as Scared Straight, a now-discredited program that attempted to reduce youth crime by using frightening obedience tactics. RCTs also correct for the “contamination” that comes with advocacy. Judith Gueron, who served for thirty years as president of MDRC, the eminent social policy research organization, has testified that RCTs “[kept] us honest and help[ed] us avoid the pitfalls of advocacy research.”

But it is not only the objectivity of randomized experiments that attracts supporters. Randomistas also believe that the nation’s lack of social progress can be attributed to the failure to understand social problems through the RCT method. In a Brookings Institution panel launching Leigh’s book just prior to its publication, the widely recognized randomista Jon Baron of the Arnold Foundation contended that everything from stagnant educational achievement to unchanging poverty rates is the consequence of interventions not grounded in RCTs.

The notion that RCTs are the only route to better outcomes is overly simplistic. A more realistic view of the world holds that there are many more reasons that these social indicators remain flat. But one major contributor is that the only evidence trusted by key decision makers comes from RCTs, which are unsuited to assessing the complex interventions most likely to work. Jeff Liebman, of Harvard’s John F. Kennedy School of Government and a former deputy director of the Office of Management and Budget, suggests that because problems are hard, and human beings and their social environments are complex, “our current mechanisms for funding and evaluating social programs do not produce a culture of continuous learning and improvement, nor do they generate opportunities for fundamental reengineering of systems to produce better results.” There is not enough focus, he explained, “on systems or outcomes across an entire population in a community.”

This explanation suggests an insight not found in Leigh’s book: There is a steep price to pay when we are seduced by the simplicity of RCTs and the sense of certainty that they bring.

When the RCT is the primary tool, we learn about only a small fraction of what we need to understand to significantly improve outcomes. Yet some of the most promising efforts to achieve better outcomes in everything from health and education to crime control and economic development are not candidates for randomized evaluations. They are not simple programs with clearly identified participants, short time horizons, and costs and benefits that do not spill over to non-participants.

In the beginning, RCTs were not used in such limited ways. The pioneers enlisting RCTs for social policy guidance in the 1960s and 1970s dealt with large questions, evaluating the effects of a Negative Income Tax and mounting several major health insurance experiments. Leigh outlines the journey as evaluations became narrower. Because many of the most famous RCTs were expensive and took many years, “today’s researchers are making it a priority to show that randomized experiments can be done quickly, simply, and cheaply.”

Leigh is a fan of the modern-day randomized trials that are increasingly focused on circumscribed efforts most readily illuminated by RCTs. He tells the story of the German government sending a “cheerful blue brochure” to over 10,000 people who had recently lost their jobs, with suggestions about ways they might look for a job. Leigh reports that “Each leaflet cost less than €1 to print and post, but boosted earnings among the target group by an average of €450.” Another of Leigh’s favorite experiments found that Google users were more likely to click on a toolbar that was greenish blue rather than plain blue. Subsequently, Google randomly split users into forty groups, each of which would see the toolbar in a different shade of blue. “A Google executive estimated that finding the perfect color for the toolbar added $200 million to the company’s bottom line.”

Leigh also confesses that he used a randomized experiment to select his book title, as well as one to test whether his students’ ratings of his lectures were affected by whether he wore a tie.

Regardless of the size of the application, Leigh’s basic argument is clearly correct: funders, policymakers, practitioners, and the media are attracted by the simplicity of RCTs. They are “more transparent and easier to explain” than any other form of inquiry. I would add that the usefulness of RCTs is limited to determining what works among just those interventions that health reformer Don Berwick describes as “conceptually neat (with) a linear, tightly coupled causal relationship to the outcome of interest.”

Where I part with Leigh and others who have adopted the “randomista” title is that I believe it’s possible to be an enthusiastic user of RCTs without dismissing other ways of knowing, from both research methods and experience, and without seeing RCTs as the only route to better outcomes.

The good news is that an impressive collection of leading thinkers, advocates, and practitioners, as well as courageous organizations—including some funders—are taking more expansive, more reliable, and more useful approaches to gathering, building, and applying knowledge.

An inspiring example of using inquiry to get beyond whether or not “it works” comes from the Carnegie Foundation’s Pathways Improvement Communities demonstration of continuous improvement. This initiative addresses the problem of the extraordinarily high failure rates among the half-million community college students annually assigned to remedial math instruction as a prerequisite to taking college courses. Traditionally, only about 20 percent of those enrolled ever make it through these courses. Anthony Bryk, president of the Carnegie Foundation, together with colleagues, encouraged faculty members, researchers, designers, students, and content experts to collaborate in creating a new network, using evidence “to get better at getting better,” to dramatically improve outcomes. They tripled the student success rate in half the time—for every racial, ethnic, and gender demographic.

Another example comes from early child development. The Harvard Center on the Developing Child is drawing on untapped knowledge from extensive research in developmental psychology, neurobiology, and implementation science. It combines this scientific knowledge with on-the-ground experience and authentic parent engagement “to drive multiple systems in a coordinated direction (and to) provide a coherent framework for rethinking how different services could be aligned and integrated around shared goals.”

Significant movement is occurring in unexpected places around the question of what should be considered useful and credible evidence. Fewer people and organizations seem stuck in well-worn grooves. MDRC, for example, has just come out with a new paper, “Can Evidence-Based Policy Ameliorate the Nation’s Social Problems?,” in which MDRC President Gordon Berlin and his colleagues propose an updated paradigm. It would go well beyond simple verdicts about whether a program works by valuing evidence about how, for whom, and where an intervention can be effective. In a major departure, the authors support a cycle of continuous systematic learning that would encourage, rather than restrict, adaptation.

The most persuasive challenge to the contention that “one coin toss at a time” will provide the information we need to accomplish big goals comes from the work of those who are targeting whole populations. Our methods of intervention, for example, have evolved significantly since we learned about the nation’s success in reducing tobacco use. Several thousand controlled trials aimed at this objective through individual behavioral change had only marginal effects. It wasn’t until California and Massachusetts combined changes in systems, government regulation, taxation, and public health messaging that it became clear that the synergy of many components devoted to a defined result was responsible for the doubling—and then tripling—of the annual rate of decline in tobacco consumption. That evidence is prodding many large-scale changemakers to focus innovative interventions on reforming whole systems, strengthening and empowering social connections, making neighborhoods safer and more stable, and putting together a range of previously isolated initiatives. If we are to build upon these efforts, we need more than randomized trials.

Surgeon and health reform guru Atul Gawande notes that “[m]aking systems work—whether in education, health care, climate change, making a pathway out of poverty—is the great task of our generation.” Surely that means we must supplement the carefully controlled, after-the-fact program evaluations of randomistas with an expanded toolkit containing many different methods of generating evidence to improve lives.