Guest Post: Data For Good, Challenges of Combining all that Data
This guest post was written by JoAnn C. Stonier, EVP/Chief Information Governance & Privacy Officer for MasterCard Worldwide. She has been recognized as an expert in the field of financial privacy, and is on the Board of Directors of the International Association of Privacy Professionals. She also serves on the Steering Committee for the World Economic Forum’s Data-Driven Development Global Agenda Council, as well as the boards of the Information Accountability Foundation and the Centre of Information Policy Leadership.
We live in a world increasingly driven by data. Data is created by so many of the things we do every day – using our phones, driving our cars, shopping online. That data is used to improve daily life: bus companies use big data to better understand traffic patterns, and stores use customer information to better manage their inventory so they have what we want, when we want it.
But as the world of big data continues to mature, another use is catching fire among governments, not-for-profits and NGOs. In addition to making financial donations to not-for-profit enterprises and efforts, commercial enterprises could donate their data for research. Think about it: combining data sets from multiple commercial enterprises to solve some of the world's thorniest and most difficult problems – hunger, climate change, disease, poverty. What a fabulous use of big data. How could anyone argue against this idea? This effort has a name – Data Philanthropy or Data for Good.
Unfortunately, as with all simple concepts, the devil is in the details. Everyone would welcome the idea of donating data if the identities of the individuals to whom the information relates could be safely protected. That sounds like table stakes, so let’s develop the right encryption protocol and de-identification mechanism. The challenge is that even if each participating company de-identifies its own data, the data is often easily re-identifiable once combined, because attributes that are harmless on their own can single out an individual when joined together. And this is very problematic, because such a data security issue could compromise the privacy of individuals and would certainly be in breach of a number of privacy and data protection laws, resulting in fines and legal action. So combining data sets from multiple organizations that may have personally identifiable information poses real challenges and risks.
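To make the re-identification risk concrete, here is a minimal sketch in Python of how records with names removed can be linked back to people. Everything in it is hypothetical: the two data sets, the field names and the choice of ZIP code, birth year and gender as the linking attributes are assumptions for illustration, not a description of any real company's data or method.

```python
from collections import defaultdict

# Hypothetical data set A: purchase records with names already removed.
purchases = [
    {"zip": "10001", "birth_year": 1975, "gender": "F", "item": "garden tools"},
    {"zip": "10001", "birth_year": 1982, "gender": "M", "item": "running shoes"},
    {"zip": "10002", "birth_year": 1975, "gender": "F", "item": "cookbook"},
]

# Hypothetical data set B: a separate list that still carries names.
directory = [
    {"name": "Alice Example", "zip": "10001", "birth_year": 1975, "gender": "F"},
    {"name": "Bob Example", "zip": "10001", "birth_year": 1982, "gender": "M"},
    {"name": "Carol Example", "zip": "10003", "birth_year": 1990, "gender": "F"},
]

# Attributes that are not names but can still single out a person when combined.
LINKING_FIELDS = ("zip", "birth_year", "gender")


def link_records(deidentified, identified, fields=LINKING_FIELDS):
    """Return (name, item) pairs where the linking fields match exactly one name."""
    names_by_key = defaultdict(list)
    for row in identified:
        names_by_key[tuple(row[f] for f in fields)].append(row["name"])

    matches = []
    for row in deidentified:
        names = names_by_key.get(tuple(row[f] for f in fields), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], row["item"]))
    return matches


reidentified = link_records(purchases, directory)
print(reidentified)
```

In this toy example two of the three "anonymous" purchase records link back to a named person. The more attributes a combined data set carries, the more likely it is that a combination of them is unique to one individual.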
What we need is a way forward that recognizes this challenge and makes it feasible for companies, governments and NGOs committed to advancing social good to donate their data in a way that is safe, simple and smart. We need regulators to agree to an acceptable framework that allows data sets to be combined, recognizing that combining data always carries some risk of re-identification and that trusted organizations are therefore needed to ensure the data is not abused. Some ideas for consideration include:
- Require all parties to agree to de-identify data upon submission
- Require recipient organizations to agree never to use the submitted data except for the purposes of the specific study or studies
- Set a strict time period after which the data set would be destroyed
- Create a regulatory “safe harbor” that permits use of commercial data for specified purposes as an exception to prohibitions under other laws and restrictions
- Only permit publication of aggregated information about larger populations and trends – never identifiable information
- Conduct audits of the studies by trusted third parties
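The idea of publishing only aggregated information can be sketched as follows. This is a hedged illustration of one common disclosure-control practice, suppressing groups that are too small to report safely. The function name, the sample records and the threshold of five are assumptions chosen for the example, and suppression alone would not make a release safe.

```python
from collections import Counter

# Hypothetical threshold: groups smaller than this are not published.
MIN_GROUP_SIZE = 5


def publishable_counts(records, group_field, min_group_size=MIN_GROUP_SIZE):
    """Count records per group and drop any group below the minimum size."""
    counts = Counter(row[group_field] for row in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}


# Hypothetical study records: seven from one region, two from another.
records = [{"region": "North"}] * 7 + [{"region": "South"}] * 2

summary = publishable_counts(records, "region")
print(summary)
```

Here only the larger group is reported. The two records from the smaller region are withheld because a count that small could point to identifiable individuals.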
In the meantime, there are still ways to tap into the power of big data. Great organizations like DataKind are already pairing NGOs with private sector data scientists who volunteer time and expertise to help unlock value from the data already contained within these organizations.
And there are examples of companies, like MasterCard, making targeted donations of data to inform research and policy development under strict guidelines and protocols. But with a framework in place, the full potential of big data could be unlocked – helping our world become a better place.
Now that really would be a world of Big Data being used For Good.