IDinsight enumerators demarcate an area of a field to estimate agricultural yield in Telangana, India. Data from surveys like this are a critical input into agricultural machine-learning models.

In the past 10 years, hundreds of projects have applied artificial intelligence (AI) to problems of social good. The right tool applied to an appropriate problem has the potential to drastically improve millions of lives through better service delivery and better-informed policy design. But what kind of investments do AI solutions need to be successful, and which applications have the most potential for social impact?

Characteristics of Strong AI-for-Good Investments

AI excels at helping humans harness large-scale or complex data to predict, categorize, or optimize at a scale and speed beyond human ability. We believe that more targeted, sustained investments in AI for social impact (sometimes called “AI for good”)—rather than multiple, short-term grants across a variety of areas—are important for two reasons. First, AI often has large upfront costs and low ongoing or marginal costs. AI systems can be hard to design and operationalize, and they require an array of potentially costly resources—such as training data, staff time, and high-quality data infrastructure—to get off the ground. Compared to the upfront investment, the cost of reaching each additional user is small. For philanthropies looking to drive positive social impact via AI, this often means that AI solutions must reach significant scale before they can offer a substantial social return on investment.

Second, any single point of failure—lack of training data, misunderstood user needs, biased results, technology poorly designed for unreliable Internet access—can hobble a promising AI-for-good product. Teams using AI need to continually refine and maintain these systems to overcome obstacles, achieve scale, and sustain the ecosystems in which they operate.


Developing a Framework to Assess the Impact Potential of AI for Good

To narrow in on AI use cases that offer the most promise, our team at IDinsight synthesized existing research from the United Nations, McKinsey & Company, nonprofit practitioners, past Google.org work, and other groups in the social sector. From there, we identified about 120 use cases across 30 areas where developers are using AI to address social and environmental problems.

Using a detailed framework, our team then analyzed which of these areas will most likely lead to significant social impact. In addition to potential risks, this framework looks at:

  1. Size of potential impact. What is the breadth of a solution or set of solutions—how many people can it help? What is the depth? For example, does it modestly increase beneficiaries’ income, or does it save a life?
  2. Implementation feasibility. Is it possible to build an algorithm of sufficient accuracy that outperforms the status-quo, non-AI option? What is the differential impact of AI compared to other approaches to solving this problem? Is the high-quality data necessary for building the algorithm available, or can it be obtained?
  3. Opportunity area synergies. Are there many different entities currently working on similar problems? Are there fixed, catalytic investments that would have a non-linear impact in enabling various other AI for good interventions? Is there a pre-eminent organization that could build and deploy initial solutions at scale in partnership with a funder?
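As an illustration of how a framework like this can be applied, the sketch below scores each use case against the three criteria and ranks by weighted average. The criteria names mirror the list above, but the weights and scores are hypothetical, not IDinsight's actual values.

```python
# Hypothetical sketch of a weighted scoring rubric for ranking AI-for-good
# use cases. Weights and scores are illustrative, not IDinsight's real values.

CRITERIA_WEIGHTS = {
    "impact_size": 0.40,                 # breadth and depth of potential impact
    "implementation_feasibility": 0.35,  # accuracy vs. status quo, data availability
    "opportunity_synergies": 0.25,       # ecosystem, catalytic investments, partners
}

def score_use_case(scores: dict) -> float:
    """Weighted average of 1-5 criterion scores for one use case."""
    return sum(CRITERIA_WEIGHTS[c] * scores[c] for c in CRITERIA_WEIGHTS)

# Two example use cases with made-up criterion scores
use_cases = {
    "medical diagnostics": {"impact_size": 5, "implementation_feasibility": 4,
                            "opportunity_synergies": 4},
    "yield prediction": {"impact_size": 4, "implementation_feasibility": 4,
                         "opportunity_synergies": 3},
}

# Rank use cases from highest to lowest weighted score
ranked = sorted(use_cases, key=lambda u: score_use_case(use_cases[u]), reverse=True)
```

In practice, a single number is only a starting point; the published framework also calls for a qualitative assessment of risks that a rubric like this can't capture.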

Best Bets for Investment

As we looked through the use cases that scored highest against our framework, three criteria—large impact potential (depth and breadth), differential impact compared to non-AI tools, and a clear pathway to scale—stood out as useful shorthand to explain why certain areas are uniquely primed for investment. We also considered whether each area had sufficient proof-of-concept evidence illustrating its feasibility, as well as manageable risks that investment and careful modeling can safely overcome. (The full framework outlines a process for more robust and precise analysis.)

Our analysis pinpointed three specific areas that appear optimal for near-term investment: medical diagnostic tools, communication support for marginalized communities and languages, and agricultural yield prediction. It’s important to note that these are not the only areas that AI could drive significant social good. Other areas we analyzed that scored well against our framework included medical research/drug discovery, natural disaster response, supply chain forecasting, and combatting misinformation. While we don’t detail these areas here, we encourage others to explore them. Here’s a closer look at our top three areas:

1. Point-of-Care Diagnostic Tools for Low-Resourced Medical Systems

In some low- and middle-income countries (LMICs), where the ratio of health-care providers to patients is low, many patients fall through the cracks. Under-diagnosis or misdiagnosis of dangerous conditions is common: traditional tests may be expensive or unavailable because of laboratory requirements; time-intensive testing may go undone by overburdened workers; and the stigmatization of certain health conditions dissuades many patients from getting tested in public clinics.

Moreover, health-care workers often need more training than they receive to accurately diagnose and treat health conditions. Poor diagnostics seem to greatly constrain the improvement of health-care outcomes in low-resource settings. For example, when the Center for Global Development simulated theoretical improvements to maternal and child health outcomes in sub-Saharan Africa under optimized clinical conditions (no shortage of drugs or absent health-care workers), health care quality only marginally improved.

There is a strong case for investing in AI tools that diagnose or screen for common conditions at the point of care. Many of these tools are already at the proof-of-concept stage and work with smartphone cameras or microphones to capture sounds, images, or video that could aid diagnosis. And while smartphone penetration among frontline health workers in LMICs is low (with significant variance across countries), it’s expected to grow rapidly in the next few years.
AI diagnostic tools have:

  • Large impact potential (depth and breadth). Tools that achieve scale could improve the health care of millions of people in low-resource settings.
  • Differential impact compared to non-AI tools. For many critical health conditions, particularly in settings with under-resourced laboratory facilities, the status quo for diagnostics is poor. Algorithms often have a comparative advantage over low- to medium-skilled practitioners in analyzing audio files, images, and video.
  • Pathway to scale. After regulatory approval, these interventions can scale with the support of entities like the World Health Organization, national governments, and large-scale networks of community health workers.

The most impactful diagnostic tools screen for underdiagnosed or misdiagnosed, treatable conditions that affect lots of people. AI tools for diagnosing many of these conditions—such as respiratory conditions like tuberculosis or asthma, malnutrition (including infant anthropometrics), anemia, and cervical cancer—already have promising proofs of concepts. However, developers still need to validate and adapt these technologies so that they are practically useful for health workers.

Funders should also consider “ecosystem investments” that enable the creation of equitable AI tools—for example, training datasets that are accurate, representative of the populations they would serve, and collected with informed consent. Privacy platforms, where health-care organizations can securely store and share training data, are another type of valuable ecosystem investment. (Nightingale Open Science is building a platform to do this for some health conditions like cancer and cardiac arrest.) These investments can make a significant difference in how well AI tools serve the populations they seek to reach and shouldn’t be overlooked.

As with any medical device, global health groups and regulatory agencies need to guarantee that AI tools meet common quality standards. They must ensure that algorithms are trained on representative data and are rigorously evaluated for fairness in the settings where they will be deployed. This is particularly important given that many health-care AI proofs of concept are built on non-representative data or data collected in laboratory settings, not in real-world contexts. If we are to realize game-changing advances and guard against potential risks, it’s important that philanthropies invest in correcting for these shortcomings.

2. Communication Tools That Support Marginalized Communities and Languages

Millions of people around the world are excluded from public services, education, the job market, and the Internet at large by virtue of their inability to speak majority languages. Just 10 of the 6,000 languages used in the world today make up about 87.3 percent of all online content. More than half of the content on the Internet is in English, and even some of the most commonly spoken languages in the world (including Arabic, Hindi, and Bengali) don’t make the top 10.

Language barriers can cause extreme, acute harm during legal proceedings, medical visits, and humanitarian emergencies. Hospitals, social service agencies, immigration lawyers, schools, and natural disaster response systems use translators to provide services, but in most cases, too few translators are available to meet translation needs globally. And while translation and automated speech recognition models have made tremendous headway for majority languages—one of Google.org’s AI Impact Challenge grantees, TalkingPoints, for example, helps non-English speaking parents in the United States communicate with their children’s teachers—support for minority languages needs more investment.

Innovation in this space can take many different forms. One is datasets and tools that make machine translation available for more language pairs, such as the translation of Bhojpuri to English. Another form is improved translation for specific subdomains in existing machine-translatable languages, such as the improved translation of French or Arabic medical terms. Innovation can also happen with tools that extend beyond translation to enhance the usability of common, natural language processing tools in multiple languages, such as sentiment analysis tools. Each of these is primed for investment because they have:

  1. Large impact potential (breadth): Lack of access to information and services due to language barriers is a problem that affects hundreds of millions (and in some domains, billions) of people.
  2. Differential potential for AI: There are not enough human translators (likely fewer than 1 million professionals) to meet the global need for translation services (hundreds of millions or billions of people). Automation through AI is one of the few avenues to scale.
  3. Pathway to scale: Translation tools have the benefit of being “direct-to-consumer” products, meaning they can scale up via integration with existing digital platforms.
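To make concrete what a "common NLP tool" like the sentiment analysis mentioned above involves, here is a minimal lexicon-based scorer. The lexicon is a tiny, hypothetical English example; extending such a tool to an under-resourced language means building an equivalent lexicon or model, ideally in partnership with native speakers.

```python
# Minimal lexicon-based sentiment scorer, the kind of "common NLP tool" that
# often exists only for majority languages. The lexicon below is a tiny,
# hypothetical English example, not a real resource.

LEXICON = {"good": 1, "great": 2, "helpful": 1, "bad": -1, "terrible": -2}

def sentiment_score(text: str) -> int:
    """Sum lexicon scores over whitespace-split, punctuation-stripped,
    lowercased tokens; unknown tokens score 0."""
    return sum(LEXICON.get(tok.strip(".,!?").lower(), 0) for tok in text.split())

sentiment_score("The clinic staff were helpful and great!")  # positive total
```

Even this toy version shows why language support is the bottleneck: the algorithm is trivial, but the language-specific resource it depends on is not.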

One promising opportunity for investment is improving general translation services for languages that many people speak but that are underrepresented in existing translation models. Many languages with millions of native speakers have no translation support on common platforms. Wired Magazine noted several of the biggest in a 2018 article: "Bhojpuri (51 million people), Fula (24 million), Sylheti (11 million), Quechua (9 million), and Kirundi (9 million).” Even among supported languages, general translation quality varies substantially.

Another opportunity lies in domain-specific translation improvements—that is, improvements to translation models for specific contexts. These models require accurate machine comprehension of jargon that may not be common in traditional, natural language data and could be most helpful in settings where translation heavily impacts individuals, such as helping migrants understand legal barriers when entering a new country or disenfranchised people navigate standardized government processes. It will be important to balance greater access for people with the potential risk of inaccurate translation.

Finally, most language models are based on convenience samples of data that happen to be available on the Internet, which can exacerbate biases. It’s imperative that any large-scale investment in under-resourced language data is done in partnership with native speakers and that members of civil society help guide which texts to use for model training. The representativeness and accuracy of AI translation and communication models depend on it.

3. Agricultural Yield Prediction in Smallholder-Dominated Regions

One difficult but essential input into sustainable food systems is accurate, timely estimates of agricultural yields. These estimates are critical to making informed policy decisions that give farmers the support they need and to ensuring that millions of people have access to food.

In affluent countries, satellite-based, yield-prediction algorithms—trained on administrative and farm-reported data about land use, plot boundary demarcations, and planting timelines—provide these estimates at periodic intervals throughout the growing season. This allows farmers to make better planning decisions. The algorithms help them get the right agricultural inputs (including hybrid seeds, fertilizer, and pesticides) to the right fields at the right time, and allows governments to more nimbly respond to shocks such as droughts and disease.  

But in LMICs, where smallholder farms dominate agriculture, yield prediction isn't straightforward. Smallholders' plots are small, irregularly shaped, and frequently planted with more than one crop, making them difficult to identify or classify in remote-sensing imagery. Analog alternatives such as crop-cut experiments are expensive, slow, and fraught with measurement challenges at scale. As a result, farmers and government policy makers often make decisions without critical information on the state of agriculture.

Today, satellite imagery is increasingly available to the public and has the high resolution and update frequency required to make predictions at the smallholder level. For example, the Sentinel-2 satellites collect imagery for nearly the entire planet every five days at roughly 10-meter-per-pixel resolution. Beyond the satellite imagery, acquiring the training data to build AI models requires substantial, ground-level data collection upfront—a labor- and time-intensive prospect involving farm visits and crop cuts in remote, rural areas.
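The core modeling idea can be sketched simply: compute a vegetation index from satellite bands, then regress ground-truth yields from crop cuts on it. The NDVI formula below is standard; the plot-level data and the one-feature linear fit are purely illustrative stand-ins for a real model.

```python
# Illustrative sketch: predict plot yield from a satellite vegetation index.
# NDVI = (NIR - Red) / (NIR + Red) is a standard index; the toy training data
# and the simple least-squares fit below are hypothetical, not a real model.

def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from near-infrared and red bands."""
    return (nir - red) / (nir + red)

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Hypothetical training data: mean growing-season NDVI per plot -> yield (t/ha),
# with yields measured on the ground via crop cuts
ndvi_means = [0.35, 0.45, 0.55, 0.65, 0.75]
yields_t_ha = [1.1, 1.6, 2.0, 2.6, 3.0]

a, b = fit_line(ndvi_means, yields_t_ha)
predicted = a * 0.60 + b  # forecast for an unseen plot with mean NDVI 0.60
```

Production systems use many bands, full time series, and far richer models, but the dependence on ground-truth crop cuts for training labels is the same, which is why the upfront data collection described above matters so much.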

High-quality research has nevertheless demonstrated the feasibility of using satellite imagery to estimate yields of smallholder farmers using publicly available imagery. These proofs-of-concept are generally limited to specific crops in specific regions, but with greater training data and model-building efforts, they could have:

  1. Large impact potential (breadth and depth): More than a billion people are either employed in agriculture or directly depend on agricultural earnings to survive. Agriculture makes up a considerable portion of many LMICs’ gross domestic product and is nearly always the sector that employs the most people.
  2. Differential potential for AI: Using current, non-AI methods, it’s not possible to cheaply measure agricultural productivity or accurately forecast agricultural yield before harvest in LMICs.
  3. Pathway to scale: Yield prediction AI tools can scale in several different ways. They can inform high-level government decisions that impact millions, or expand via large-scale agricultural extension organizations or direct-to-consumer mobile applications.

Example investment opportunities include programs that recommend tailored agricultural inputs or inform macro-level government agriculture policy to increase or decrease food imports. As in other applications, the lack of training data is a major constraint to conducting yield prediction at scale in LMICs. With many different organizations and researchers working on related problems, there’s a need for collection and labeling standards. Initiatives like the ML Hub by the Radiant Earth Foundation will be important in hosting and sharing the data that all model builders need to create the next generation of AI-based yield forecasting models. 

In addition, as training data from crop cuts becomes more widely available, funding the creation of pre-trained algorithms that perform reasonably well “off the shelf” for common crops will be valuable. Similar to Google’s BERT for natural language processing or VGG19 for image classification, pre-trained, open-source models can help data scientists focus on tweaking high-performing models to their use case, rather than starting from scratch. With proactive philanthropic investment, funders can insist on better, more representative training data and pre-trained models that are built with the needs of a diverse array of smallholder farmers in mind.
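The pattern this paragraph describes, a frozen pre-trained model plus a small task-specific layer trained on local data, can be sketched in miniature. The "feature extractor" below is a toy stand-in for a large model like BERT or VGG19, and all data and numbers are illustrative.

```python
# Toy sketch of the "pre-trained model + fine-tuning" pattern. The frozen
# feature extractor stands in for a large model like BERT or VGG19; only the
# small task-specific head is trained. Everything here is illustrative.

def pretrained_features(x: float) -> list:
    """Frozen 'pre-trained' feature map (never updated during fine-tuning)."""
    return [x, x ** 2]

def fine_tune_head(xs, ys, lr=0.05, steps=2000):
    """Train only the linear head weights by gradient descent on squared error."""
    w = [0.0, 0.0]  # head weights: the only trainable parameters
    n = len(xs)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x, y in zip(xs, ys):
            f = pretrained_features(x)
            err = w[0] * f[0] + w[1] * f[1] - y
            grad[0] += err * f[0]
            grad[1] += err * f[1]
        w = [wi - lr * gi / n for wi, gi in zip(w, grad)]
    return w

# Hypothetical local training data generated by y = 2x + 3x^2
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [2 * x + 3 * x ** 2 for x in xs]

w = fine_tune_head(xs, ys)            # head weights approach [2, 3]
pred = w[0] * 0.8 + w[1] * 0.8 ** 2   # prediction for an unseen input
```

The design point is the division of labor: the expensive, data-hungry component is built once and shared, while each team trains only a small head on the (scarce) local data it can collect.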

We offer this framework and analysis as a conversation starter rather than a final verdict. AI holds tremendous promise to improve millions of lives around the world by providing the tools to directly combat health, communication, and economic challenges. Investing in solutions that address large-scale social problems, tap into the unique comparative advantages of AI, and have clear pathways to scale is a good place to start—though they may require patient capital. Developing useful, scalable AI tools is hard and requires a sustained commitment to building datasets, systems, and user-centric applications that can help solve societal challenges. By staying the course and seeing promising technological innovations through to scale, investors can unlock inordinate social value.


Read more stories by Ben Brockman, Skye Hersh, Brigitte Hoyer Gosselink, Florian Maganza & Micah Berman.