The purpose of Day Two of the RIVAF Conference (“Towards a Real-Time Understanding of Emerging Vulnerability”) was perhaps best described in Assistant Secretary-General for Planning and Policy Coordination Robert Orr’s address to the group. Global Pulse, he stated, is about both information and innovation.
First, Global Pulse is dedicated to developing better information streams that can inform decision makers on how crises are impacting the most vulnerable. Second, Global Pulse identifies the opportunities afforded by innovative new technologies, tools and data to meet the information needs of decision makers. The role of Global Pulse is to build upon the expertise, energy and existing innovative initiatives within the UN to provide space for agencies to innovate for real-time impact monitoring.
On Day One of the RIVAF conference, it became clear that this requires a process of dual learning. First: understanding the existing “data gaps” faced by agencies in order to be able to address those specific challenges (What critical information are they missing? What is required to ensure that, once available, the information is useful?). Second: sharing more background information about the “new data landscape” with our UN agency partners (What is “new data”? How can it add value and be integrated with existing data streams?).
To provide an orientation on the “new data landscape,” we dedicated the first half of the day to sharing some of the concepts, conversations, and colleagues we regularly engage with at the Pulse Lab in New York.
- First, we heard a presentation by Oscar Salazar, founder of Citivox. Oscar discussed the idea of “data curation,” which he explained as the process of pulling in multiple sources of data (like SMS reports or Tweets, for example) and turning them into actionable information.
- Next, Global Pulse’s data scientist, Miguel Luengo-Oroz, presented some of the projects that our team here in New York has been working on, including data mining for trends in social media, online news, and remote sensing.
- Finally, Chris van der Walt presented an early prototype of Global Pulse’s Hunchworks, a platform where relevant parties can collaborate around ideas (‘hunches’) and share data to support or refute those hunches.
All the presentations underlined the notion that the types of data tapped for analysis and/or action depend largely on the local context and the information being sought. For example, Oscar noted that in Monterrey, Mexico, Twitter has become a widely used channel for relaying information about crime, and thus integrating that data with other sources could create a real-time heat map of criminal activity. (We will further explore issues in, and examples of, new data in a series of upcoming blog posts.)
The rest of the day was spent in working groups designed to delve deeper into understanding data gaps faced by agencies. This was done in two ways. First, we tried to create a framework through which we can understand the nature of data gaps faced by agencies. Second, we solicited feedback on specific data gaps from our RIVAF partners.
We divided up the framework for understanding data gaps into three parts:
- Thematically, some data gaps exist because the data is simply not there. In this case, it is important to understand why the data does not exist. In some cases, it may be due to difficulty in collection, such as data from the informal sector. In other cases, there may be specific reasons that prevent the data from being collected, either because the activity is secretive (as in illegal activities such as human trafficking) or because the topic is politically sensitive (in some countries, one example might be risky behavior among youth).
- It is also important to describe and understand the specific features of data gaps. This points to cases where data is available, but doesn’t meet the information need. These features include data quality/comparability, timeliness, accuracy, level of aggregation, geography, demographic level, or socio-economic level. One example would be data that is aggregated nationally or by district, whereas, in order to be useful, the data would need to be disaggregated in a particular manner, for example by economic sector or age brackets.
- The final element is accessibility. There are two primary ways of understanding data accessibility. First is just as it sounds—relevant data may be held but not shared by public and private sector parties. The second is the capacity to use data—in other words, even where data is available, if the tools to manage and understand it are not available, the data is useless. This may be the case with traditional sources of data, such as household surveys, and is very clearly the case with newer sources of data, for example, satellite imagery or the “fire hose” of data freely available on the web.
We also looked at specific data challenges faced by agencies, with an eye to understanding how Global Pulse could play a role:
- There are several data challenges that are faced by all agencies alike. Some of these cross-cutting challenges are geographic (e.g. getting reliable/relevant data about populations living in slums) and some are by sector (e.g. the informal sector).
- There are some specific types of information that would be valuable to all agencies, and could contribute to analysis in many different areas. For example, the one that was raised again and again by almost every agency was remittance data.
- There are also very specific challenges that agencies face. For example, the representative from WFP mentioned that information on food moving across the border between South Africa and Zimbabwe was very difficult to collect.
The next step for Global Pulse and our partners is to turn these conversations, about opportunities in new data and the data challenges faced by agencies, into work plans.