Unlocking the Power of Data with Data Lakes

(Illustration: Three figures standing on a laptop keyboard, looking at data in a pie chart and a graph.)

Technology has afforded nonprofits access to a tremendous amount of data to raise revenue, run their operations, and pursue their mission. But access is not the equivalent of understanding the full story that data can tell.

Next Generation Nonprofits

Cloud technology provides the innovations necessary for nonprofits to more effectively serve their constituencies, increase donor and member engagement, and enhance stakeholder experience. This supplement shares insights into how your organization can use this technology to jump-start your success. Sponsored by AWS.

It is easy for organizations to make sense of their data when they only have to deal with a few data sets. However, the process becomes more difficult when additional data sets are added, particularly when those data sets come from tools purchased from different vendors, each with its own unique data structure. When this happens, traditional spreadsheets and databases are insufficient. Simple databases can crash under the weight of too many users, and large, complex spreadsheets with thousands of lines of embedded calculations can freeze or run maddeningly slowly.

Fortunately, data lakes allow users to store, manage, clean, transform, and analyze different data sets in one place. To be clear, a data lake is not a data warehouse; the main difference is the order in which data is loaded and transformed. Traditional data warehouses require data to be transformed first, and only data that adheres to a specific format and schema is accepted. This generally means that not all data can be stored, especially newly acquired or unstructured data sets, and organizations lose access to that data as a result. Data warehouses are also typically just storage facilities. They don’t contain the tools that allow for data cataloging, reporting, governance, advanced analysis, and machine learning.

Data lakes, on the other hand, allow data to be stored immediately and allow the schema to be created or changed over time. They also come with a complete complement of modular tools for access control, cataloging, and analytics, which together enable organizations to easily customize their data lake for their specific needs.
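The difference in ordering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular warehouse or data lake product works: the field names, the `WAREHOUSE_SCHEMA` dictionary, and all three functions are hypothetical and exist only to show the idea of checking a schema before storage versus applying one when the data is read.

```python
import json

# Hypothetical fixed schema that a warehouse-style loader enforces up front.
WAREHOUSE_SCHEMA = {"donor_id": int, "amount": float}


def load_into_warehouse(records):
    """Schema first: accept only records that match the schema exactly."""
    accepted, rejected = [], []
    for rec in records:
        matches = set(rec) == set(WAREHOUSE_SCHEMA) and all(
            isinstance(rec[key], kind) for key, kind in WAREHOUSE_SCHEMA.items()
        )
        (accepted if matches else rejected).append(rec)
    return accepted, rejected


def load_into_lake(records):
    """Store first: keep every record as raw JSON text, whatever its shape."""
    return [json.dumps(rec) for rec in records]


def read_from_lake(raw_lines, fields):
    """Apply a schema chosen at read time; missing fields come back as None."""
    rows = []
    for line in raw_lines:
        rec = json.loads(line)
        rows.append({field: rec.get(field) for field in fields})
    return rows


# Illustrative records only; none of this is real donor data.
records = [
    {"donor_id": 1, "amount": 50.0},
    {"donor_id": 2, "amount": 25.0, "event": "gala"},     # extra field
    {"donor": "unknown", "note": "called about a gift"},  # unstructured
]

accepted, rejected = load_into_warehouse(records)
lake = load_into_lake(records)
view = read_from_lake(lake, ["donor_id", "event"])
```

In this sketch the warehouse-style loader keeps one record and turns away two, while the lake-style loader keeps all three and lets a later reader decide which fields matter, including the `event` field that did not exist in the original schema.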

At Share Our Strength, a US nonprofit committed to ending childhood hunger, we addressed the challenges of managing our many discrete fundraising data streams by beginning to implement a data lake in 2021. This has put us on a path to replace the organization’s numerous independently managed spreadsheets with a single, accessible, shared system of record. It has reduced the burden on our database administrators by providing automated, real-time reporting. It has also prepared us to use advanced concepts in AI and machine learning to create better predictive models of donor behavior and so improve our fundraising. Finally, the data lake has made us more cyber-secure by giving us the capability to grant or restrict access to sensitive data sets on a granular level.

The technical journey from identifying the need for a data lake to implementing one was relatively simple. The cloud-based tools for storing, transforming, cataloging, and analyzing our data as well as controlling access to it were readily available from AWS. Far more difficult than the data engineering, however, was the cultural engineering required to bring new teams together, build trust in a new way of doing business, and agree on a new set of standards for managing our data.

Cultural Reengineering

During the process we learned that the real work in building a data lake is in reengineering the way that the organization thinks about its data. Our team estimates that we spent 80 percent of our time building consensus across teams, and only 20 percent of the time designing and implementing the actual infrastructure.

One of the first areas of cultural reengineering was to make the IT department a better partner to the fundraising department. Before I joined Share Our Strength as its CIO in 2017, the IT department focused almost exclusively on server maintenance and desktop support. Although we had technical skills to offer, we were never part of the conversation about how to better manage data. Therefore, my first step was to expand the services the IT department offered to include building and maintaining the organization’s data lake. In doing so, my department offered a new source of creativity, capability, and perspective to our business units’ efforts to use data and technology wisely.

A second aspect of our cultural reengineering was building trust between the IT and fundraising departments. People coming from different disciplines needed time and patience to learn each other’s language and to trust each other’s expertise. For example, when the IT department demonstrated the data lake’s capabilities, we were met with some skepticism because the analyses looked different from the Excel spreadsheets the fundraisers were familiar with. At the same time, we in the IT department needed to ensure that we were demonstrating capabilities germane to the fundraisers’ current priorities, such as automating data-cleaning tasks that occupied an outsize part of their day, instead of capabilities that would only be potentially useful in the future, such as finding unique trends in donor behavior. It took several iterations and rounds of clarifying questions before we began to understand each other’s perspectives and were able to establish a prioritized list of problems to solve.

Finally, our organization embarked on a third area of cultural reengineering: developing a shared understanding of how our data was captured and structured. Often, the rules for storing and structuring data made sense for one fundraising team, but those same rules did not easily translate and apply to the work of other teams. This led to irrelevant data being stored in some structured fields, to a lot of important data being stored in comment fields that did not easily lend themselves to analysis, and to a realization that our analyses were sometimes incomplete because they did not fully take into consideration the nuances of how or where our data was stored. Data cataloging, which is a central feature of data lakes, gave us the tools and opportunity to identify and resolve these irregularities and ultimately improve our analyses.
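The kind of irregularity described above can be surfaced with very simple profiling. The following Python sketch is an illustration under stated assumptions rather than a description of the cataloging tools Share Our Strength used: the records, the field names (`team`, `gift_amount`, `comment`), and both functions are hypothetical. It measures how often each field is actually filled in, and it flags records where a gift amount appears to have been typed into a comment field instead of the structured field.

```python
import re
from collections import Counter

# Hypothetical records from two teams that follow different conventions.
records = [
    {"team": "events", "gift_amount": "250", "comment": ""},
    {"team": "major_gifts", "gift_amount": "", "comment": "pledged $5,000 by phone"},
    {"team": "major_gifts", "gift_amount": "", "comment": "prefers email"},
]

# Matches a dollar sign followed by digits, with optional thousands commas.
AMOUNT_PATTERN = re.compile(r"\$\d[\d,]*")


def fill_rates(rows):
    """Share of rows in which each field holds a non-empty value."""
    filled = Counter()
    for row in rows:
        for field, value in row.items():
            if str(value).strip():
                filled[field] += 1
    return {field: filled[field] / len(rows) for field in rows[0]}


def amounts_hidden_in_comments(rows):
    """Rows with an empty structured amount whose comment mentions one."""
    return [
        row
        for row in rows
        if not row["gift_amount"].strip() and AMOUNT_PATTERN.search(row["comment"])
    ]


rates = fill_rates(records)
flagged = amounts_hidden_in_comments(records)
```

A low fill rate on a structured field, combined with flagged comments, is a prompt for a conversation between teams about where a given piece of information should live, which is the shared understanding the paragraph above describes.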

Share Our Strength is only a year into using a data lake to improve our fundraising, but we are already seeing tremendous gains. We are beginning to measure returns on various investments with greater specificity because we are including more and better data streams, which in turn allows us to make smarter decisions. We are also working to automate complicated yet repetitive data transformations and data pulls, an effort that saves our donor operations team time as they support major- and mid-tier fundraising officers in multiple markets. Finally, our more consistent data standards are helping us to better understand donor behaviors, which has allowed us to have more meaningful communications with our donors.

The data lake contributed to each of these successes, yet the lake itself was only possible thanks to our nonprofit’s cultural transformation. For organizations looking to take a similar journey to improve their fundraising capabilities, our advice is to build trust within the organization and then iterate, iterate, iterate, as there are likely several unknown opportunities for innovation within your own data sets.

Read more stories by Richard Kostro.
