Big data has become a buzzword for private, public, and social sector organizations. In the social sector, some believe that “big” data is the new panacea for our greatest social challenges, whether in criminal justice, health care, education, or international development. Others are concerned about the cost of collecting data, the type of data we collect, and real questions about the privacy and ethics of data use. We believe that data has the potential to help governments (local, state, and national) achieve real outcomes, but we need to ensure that we are collecting useful data, and governments need to put in place some practical safeguards before asking the public to invest in new systems and data collection. We need to examine the value of transparency in big data; understand the types of data needed to achieve outcomes; differentiate between data, evidence, and judgment; and ensure that citizens are included in the conversation.
Why Data Matters
In the sciences, and increasingly in the social sciences, data has been a critical part of understanding, testing, and proving theories. It has the potential to address critical challenges in our society more effectively: to target school interventions, improve health care, or help people find the right job training. Collecting, analyzing, and understanding data has become easier and cheaper. Even with limited resources, we can now collect micro-level information in real time, detect early warnings, and provide insights for effective, targeted interventions. Community- and heat-mapping techniques, for example, provide a wide range of valuable information, helping us better understand crime patterns and isolate hyper-local health conditions. In Chicago, the Chicago Health Atlas uses data to identify health trends and provide hospital information. And Foodborne Chicago is using sentiment analysis (determining whether a piece of writing is positive, negative, or neutral) of social media and location-based 311 reports to detect food poisoning incidents. Data can help government provide better and more effective services for its citizens.
Investment in Data
The two largest sectors collecting data are governments and companies. Companies collect data about individuals’ behaviors and consumption patterns to provide better consumer services. Government, however, has a responsibility to improve the lives of all citizens by providing better and more effective social services, solving social challenges, and protecting security and privacy. Government already collects a tremendous amount of data, ranging from weather, health, and education data to financial data. The federal government has made tremendous progress in making data more transparent, but it needs to do more, and transparency alone is not enough. Government and civil society groups need to develop sound principles for releasing data in more usable formats, and to invest in technology, people, and new systems to collect, analyze, and use data to provide better services to citizens. These systems have the potential for a large return on investment.
Many cities around the world are already doing this, and the results are increasingly promising. In New York City, for example, the Mayor’s Office of Data Analytics partnered with the Department of Information Technology and Telecommunications to build DataBridge, a citywide data-sharing platform that enables 20 agencies to share information from a variety of sources. DataBridge empowers agencies to conduct cross-cutting analysis, and allows external organizations such as the National Weather Service and research universities to both contribute to and use the city’s data.
Types of Data
While data collection and sharing are important, it is even more important to discuss the outcomes we want to achieve, so that we collect the right data. Currently, government and philanthropic systems largely collect compliance data: whether they served the same number of people or held the same number of meetings. Rarely do we ask whether we achieved outcomes, such as whether we increased the number of people learning or the number of healthier children. To assess outcomes, we need to collect different types of data, better integrate the data we collect, and ask different questions.
In Northeastern Ohio, the Poverty Center is developing data-driven tools to address social policy on issues like neighborhood stabilization. The Poverty Center holds the most comprehensive, publicly available neighborhood information system in the United States, and uses it to track and measure foreclosures and building code enforcement. Its data includes information that otherwise resides in silos, such as census, crime, or mortgage-lending data. Bringing this information together in one place makes it easier to answer policy questions that require data from multiple issue areas.
Data vs. Evidence
Meanwhile, the social sector needs to have a larger conversation about the role of data and evidence. They are not interchangeable. Evidence largely tells us whether individual programs are working; for example, an evaluation may show that an after-school program helps students learn. Data, on the other hand, helps us better understand whether efforts are actually moving the needle; for example, there may be 15 different after-school programs, all with positive evaluations, while the high school dropout rate stays the same. In this case, collecting a variety of data helps a school district or city government better understand what factors besides after-school programs might be affecting student learning. If we want to see visible outcomes, we need to pay attention to the difference and change the conversation.
Values and Citizen Engagement
There is also little conversation about the value of systems built around collecting data, the use of data, and the role of citizen engagement. Most conversations are about governments, companies, or organizations providing better services based on data. But does this empower citizens? Can citizens hold government, companies, or philanthropies accountable if the use of data unintentionally causes greater harm? For instance, what if a company uses data to profile low-income or minority communities, and offers them more expensive credit? If access to credit is important for economic mobility, how do we avoid the downsides? Who is responsible for ensuring the fair and ethical use of data?
One example of how data might empower citizens is the law Mayor Bill de Blasio signed to provide inmates with a bill of rights when they are detained. Data sitting in New York’s Department of Correction could inform citizens about who is held in jails; average incarceration times; average bail amount by ethnicity, race, and income; and more. The data can help citizens and policy makers better understand the effects of the criminal justice system on various communities, highlight disparities, improve that system, and achieve positive outcomes such as lower crime and recidivism rates. It is important for policy makers to evaluate how companies and others use data; otherwise, many will continue to prey on or unfairly target citizens. We need a smarter conversation about the values underlying the use of data and how to regulate it.
Data is not biased, but how we use it has strong implications. Deciding how we use data, for what purpose, and why requires judgment. Whether it feeds an algorithm or a data analytics tool, using data requires intentionality and an understanding of its effects. Currently that judgment is in the hands of the programmers, agencies, or philanthropies that pay for the data. Government and civil society need to develop standardized principles, and continually assess data systems for their effectiveness and protection of civil liberties. This series will explore the need for citizen voice and values-driven systems.
Ultimately, big data is a promising and powerful tool for driving large-scale impact, but using it effectively and responsibly in government and beyond requires intentionality. The examples above illustrate how targeted interventions can create a better data ecosystem for social impact. However, we must proceed carefully and develop robust principles to guide data initiatives within a broader socio-political context.