Liberating 990 Data

Sharing nonprofits’ Form 990 tax data in a comprehensive and easy-to-access manner may open the door to more collaboration and innovation within the sector.

Every year in the United States, approximately 1.5 million registered tax-exempt organizations file a version of the “Form 990” with the IRS and state tax authorities. While the questions vary between the versions for private foundations and small nonprofits, the 990 collects details on the finances, governance, and organizational structure of America’s universities, hospitals, foundations, and charities to ensure that they deserve tax-exempt status. These organizations, which together pay $670 billion in wages and benefits annually, create America’s education, culture, art, religion, science, and provide many of the social services upon which our communities depend.

With a national movement in the U.S. to shrink the role of government, nonprofits may be expected to expand their programs as they step in to fill essential needs. The role of nonprofits may now become even greater – and deserving of greater scrutiny.

The data that the IRS collects about nonprofit organizations present a great opportunity to learn about the sector and make it more effective. Yet this data could be made far more useful than it is today. It’s time to “liberate” 990 data and make it easier to gain insight into the workings of America’s nonprofits.

The IRS does make nonprofits’ Form 990 returns available, but only on DVDs for a high fee. A single year’s worth of 990s costs over $2,500, arguably to recoup the costs of pressing and mailing all these discs. But there is no reason to charge for the Form 990 data at all. Just as most people have grown accustomed to sharing large files via a service like Dropbox, it would be simple for the IRS to publish the returns online for anyone to download in bulk for free. This week two groups committed to government transparency, Public Resource and the Internet Archive, used their own resources to post 12 years of returns online, demonstrating that it can be done.

As President Obama declared on his first day in office, “Information maintained by the Federal government is a national asset,” and IRS data on nonprofits is important and valuable information that should be available to everyone.

The DVDs are only part of the problem. Even if you can afford to buy the DVDs with Form 990 data, as some organizations and news media do, the data on them is contained in image files, which are created by scanning the printed Form 990s rather than putting their data into a searchable database. Image files are useful only for reading about one nonprofit organization at a time. The sector deserves comprehensive and computable data that can be openly aggregated, searched, checked, and analyzed.

In the long run, as a condition of being a nonprofit, organizations should be required to file the Form 990 electronically, rather than on paper, and the IRS should publish those returns in formats that lend themselves to doing aggregate analytics, creating visualizations and building analytic tools.

Without waiting until all returns are filed electronically, the IRS can start releasing, in a timely fashion and in computable form, the data that is already filed electronically. There’s some debate about how much authority the IRS has to make changes like this on its own, and whether they would require Congressional action. Others argue that under the Freedom of Information Act, the IRS must release the data. But we don’t need to wait for a legal battle, the IRS, or Congress: The groups that now independently analyze IRS data can and should take the lead.

Today, the Foundation Center, GuideStar, the Urban Institute, Johns Hopkins’ Center for Civil Society Studies, and Indiana University’s Center on Philanthropy spend millions each year on converting the IRS images of the Form 990 into clean data that a computer can ingest and use to perform analysis and develop visualizations.  They’ve had to do this conversion because there has been no comprehensive set of open data about the nonprofit sector available to them or the many others who would take advantage of it. But rather than replicating each other’s efforts and then charging for access to the results, these groups could follow a more collaborative, open model. (Some of these groups are beginning to explore a collaboration.)

At least for the short term, incumbent organizations whose goal it is to provide data about the nonprofit sector and who raise philanthropic dollars to do so can stand in the place of government and make a data resource on nonprofits available. These organizations and those who fund them should take their cue from Public Resource and Internet Archive by pooling their resources and collaborating to develop a single, open and comprehensive 990 database that is available and free to all.

A shared, open database will reduce the costs of data management for these incumbents and make the task of converting IRS data more efficient. And it need not threaten their revenue models: What they lose on the sale of bulk data, they can more than make up for by providing new tools and analytic services.

More important, free, open, analyzable data on nonprofits will enable more innovators, researchers, and entrepreneurs to use the data to benefit the sector. There are now many examples of public benefits that have come from “opening up” government data. When the U.S. Department of Health and Human Services published its database of hospital infection rates online in a computable format, Microsoft and Google were able to mash it up with mapping data to create an application that shows infection rates for local hospitals across the country. This tool readily allows anyone, from the investigative journalist to the parent of a sick child, to see which hospitals are safest. The National Oceanic and Atmospheric Administration freely and publicly provides weather and forecast data online, and that data provides the backbone for such services as the Weather Channel. The GPS data we use to get from work to home were made available for civilian use by President Ronald Reagan, who saw the impact these data could provide as a public good. Cities have unlocked the data on when public transit runs and to where, making buses and subways easier to catch than ever before.

A comprehensive source of high-quality data on nonprofits, structured to allow comparisons and analyses across different organizations in the sector, would greatly enhance and accelerate research about the sector and make it possible to:

  1. Do more extensive, in-depth empirical research on the sector as a whole, including sector-wide issues such as the impact of the economic downturn on nonprofits, the geographic distribution of nonprofit services, and the efficiency of the nonprofit sector in delivering services;
  2. Combine the 990 data with other datasets, such as those on government spending, to better understand the relationship between public and private dollars in providing social services;
  3. Query the data to address issues relating to specific nonprofits, such as gaining greater insight into 501(c)(4) organizations that engage in lobbying or finding trends and outliers in executive compensation;
  4. Recognize fraud early, anticipate abuses, and target enforcement more efficiently and effectively; and
  5. Enable more people and organizations to analyze, visualize, and mash up the data, creating a large public community that is interested in the nonprofit sector and can collaborate to find ways to improve it.
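To make items 2 and 3 above concrete, here is a minimal sketch in Python of the kind of query that becomes trivial once 990 data is computable. Everything in it is an assumption made for illustration: the records, the field names (ein, state, revenue, top_compensation), the state spending figures, and the "three times the median" outlier rule are all invented, and none of it reflects an actual IRS data format or real organizations.

```python
from statistics import median

# Hypothetical, made-up records standing in for fields pulled from Form 990s.
FILINGS = [
    {"ein": "00-0000001", "state": "MA", "revenue": 1_200_000, "top_compensation": 95_000},
    {"ein": "00-0000002", "state": "MA", "revenue": 900_000, "top_compensation": 88_000},
    {"ein": "00-0000003", "state": "MA", "revenue": 1_500_000, "top_compensation": 640_000},
    {"ein": "00-0000004", "state": "CA", "revenue": 2_000_000, "top_compensation": 120_000},
    {"ein": "00-0000005", "state": "CA", "revenue": 750_000, "top_compensation": 70_000},
]

# Hypothetical government spending on social services, by state.
GOV_SPENDING = {"MA": 5_000_000, "CA": 11_000_000}


def revenue_by_state(filings):
    """Sum reported nonprofit revenue for each state."""
    totals = {}
    for filing in filings:
        totals[filing["state"]] = totals.get(filing["state"], 0) + filing["revenue"]
    return totals


def compensation_outliers(filings, factor=3.0):
    """Return EINs whose top compensation exceeds `factor` times the median."""
    typical = median(f["top_compensation"] for f in filings)
    return [f["ein"] for f in filings if f["top_compensation"] > factor * typical]


def private_to_public_ratio(filings, spending):
    """Ratio of nonprofit revenue to government spending, per state."""
    totals = revenue_by_state(filings)
    return {state: totals[state] / spending[state] for state in totals if state in spending}


if __name__ == "__main__":
    print(revenue_by_state(FILINGS))
    print(compensation_outliers(FILINGS))
    print(private_to_public_ratio(FILINGS, GOV_SPENDING))
```

None of these queries is possible against scanned images without first re-keying the data, which is the conversion work the incumbent organizations pay for today.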

Above all, opening up 990 data would attract many new and innovative people who would bring energy, enthusiasm, and creativity to developing tools that help the neediest among us access better services, help nonprofit providers become more effective and efficient, and help everyone better understand the role of the nonprofit sector in our economy. Instead of being limited to the work that GuideStar’s and Indiana’s employees have the time to do, many more people than today could begin to create apps, develop visualizations, and do research.

With open Form 990 data, we can expect to see again what we are now seeing in many sectors: When experts of all kinds have access to open data, it becomes a catalyst for creative problem solving and community innovation.


  • BY David Hartstein

    ON February 5, 2013 07:00 AM

    Excellent post Beth.  I totally agree that opening up access to 990 data could have far reaching impacts that we aren’t even anticipating.  While certainly an enormous undertaking, it seems to be one that will ultimately be well worth the effort.

    Thanks for sharing.

  • BY Patrick Callihan

    ON February 7, 2013 03:20 PM

    Beth, great article. 990 data is just the beginning, but a great example of the data that this sector needs to be able to become more efficient and effective. I wonder if it would be easier if nonprofits were to submit electronic 990s instead of the paper 990s we do today? If nothing else, perhaps a subset of the data could be required electronically. At least that shifts some of the burden of making government larger in order to simply create this data. Thanks for your thoughtful article and research.

  • BY George McCully

    ON February 8, 2013 07:35 AM

    Beth, this is a very good start, and I’d like to suggest—with great respect and gratitude—a few refinements.

    First, IRS 990s are already available for free—in the downloadable annual Master Data Files for each state’s nonprofits, in the form of Excel spreadsheets.

    Second, what those show (we have examined in thorough detail the Massachusetts download) is that “nonprofit” is very far from synonymous with philanthropy and the kinds of institutions cited here: “America’s universities, hospitals, foundations, and charities…which together pay $670 billion in wages and benefits annually, create America’s education, culture, art, religion, science, and provide many of the social services upon which our communities depend.”  In fact, over 75% of “nonprofits” are not “private initiatives for public good, engaged in public fundraising for grants and donations”.  Most are self-serving and self-supporting, or otherwise of no interest to donors or grantmakers—e.g., professional and trade associations; social, yacht and country clubs; alumni associations; condo associations; physicians group practices; athletic associations; cemeteries; teachers’ retirement funds; real estate trusts; etc., etc.  Almost any attorney specializing in tax, estate, or trust law will agree that by far most “nonprofits” have nothing to do with philanthropy, and while their tax-exemption may be deserved because it is in the public interest that they exist, they are not actively engaged in societal improvement.  So we need to move beyond the crude and negative category of “nonprofit” and “tax exemption” to distinguish philanthropy—which we estimate at only about 10% of nonprofits, with another 15% in a highly complex and subtly graduated ambivalent zone yet to be sorted out—from the rest.

    Third, you are exactly right that exposure to the light will help sort out meritorious from specious tax exemptions.  There is a “nonprofit” corporation in Massachusetts with annual revenue of $1.4 billion, coming from one source, and many executives pulling down high six- and low seven-figure salaries (where else to put the profits, which cannot be distributed to private stockholders?).  The single source of revenue is the National Security Agency.  Its “nonprofit” status is a cover for a highly paid government agency.  This indicates that there will be political resistance in Congress to sunlight illumination.  Teachers Retirement Funds: why only teachers?  Black Lung funds: ditto.  Many others.  This apparent no-brainer reform you are proposing will be politically difficult, and that should be acknowledged at the outset.

    Fourth, to make all this philanthropic data statistically analyzable will require a systematic taxonomy of philanthropic fields, which the NTEE is not (it was designed for other purposes than rigorous research, by other means than systematic thought).  The Catalogue for Philanthropy has developed, over 15 years of practical experience at the ground level in Massachusetts, the only systematic comprehensive taxonomy for philanthropy (and nonprofits, for that matter), branching out logically into 200 fields from three which cover all possible human relations: Nature (the physical environment), Culture (what we make and do), and People (how we treat each other)—

    In sum, philanthropy is on the verge of a scientific revolution, if we can liberate the data from the fog of “nonprofits”.  Your work is essential to that strategic effort, for which, thanks. 

  • BY Rick Schoff

    ON February 8, 2013 09:10 AM

    Data availability isn’t the fundamental problem; that is merely a question of when and how. Lack of data standards will continue to be THE problem, and this includes all the taxonomic issues (which go beyond which particular list of terms can win the day). And from Day One, the evidently invisible elephant in the room is that the players in (what is heuristically called) “the field” will NEVER agree to use a shared taxonomy. A system must be developed that circumvents the “Procrustean Bed” problem. Criticizing or promulgating a particular taxonomy does nothing toward attaining that goal. This is the fundamental issue, and no one involved is willing to invest in addressing it.

  • BY George McCully

    ON February 8, 2013 10:37 AM

    I would propose another attitude than Rick’s: that we respond to the systematizing pressure of current and future technology constructively, collaboratively, and optimistically, as a community of scholars and practitioners. 

    The Catalogue’s systematic taxonomy is offered only as an example of what philanthropic studies need to organize data for systematic, rigorous, analyses of field-sensitive data.  We hope others will try to develop a better taxonomy; if they succeed—and we hope they will—we shall adopt theirs.  The goal is excellence.  There is no “Procrustean Bed” problem in authentic scholarship or science, especially during a paradigm-shift, which is our current situation.

  • BY Rick Schoff

    ON February 9, 2013 06:53 AM

    I couldn’t agree more with George. And his more recent comment helped me better focus my fundamental concerns. Source information is rarely excellent, and there is insufficient investment to “organize data for systematic, rigorous analyses of field-sensitive data.”

    Once 990 data is readily available to anyone in machine-readable form, it will be deemed inadequate for most purposes. Only a legal mandate allowed nonprofit organization information to become organized, and then only for IRS purposes.

    The science of information management needs to be applied to these long-standing issues.

  • BY Adrianne Showalter Matlock

    ON July 31, 2013 11:42 AM

    Beth, thanks for this helpful article. I have been butting my head against the wall trying to access data I need for what I initially thought would be a straightforward research project. As a PhD student, the current fee of $500+ to download analyzable data from the Business Master Files makes the data inaccessible to me.

    George mentioned that the IRS’s Master Data Files are already available for download in Excel spreadsheet format. However, I have been able to find this data only for the most recent year, as posted on the “SOI Tax Stats- Exempt Organizations Business Master File Extract” webpage. The Business Master File contains all the variables I need, but without data from more years, I am unable to do the longitudinal analysis necessary to answer my research question.

    George- do you have any further information on accessing the Master Data Files in Excel format for the past 20 years?

    Despite the shortcomings in current nonprofit data regarding NTEE classifications and community benefit legitimacy, wider access to existing data will provide researchers with data sufficient for initial analysis, upon which stronger research can be built over time.  Beth, please keep us posted if any developments occur in the availability of this invaluable data! Thanks again for your article.

  • Hi Adrianne,

    I was wondering if you’ve had any progress with finding information for a longer time period than just the most recent one? Your help would be much appreciated, as I am currently going through the same issues.

SSIR reserves the right to remove comments it deems offensive or inappropriate.