It’s one of the most frequent questions we get here at Global Pulse: “where do you stand with this whole ‘crowdsourcing’ thing?”
We have always believed that crowdsourced information could be usefully combined with statistical data from various UN crisis early warning systems and the new kinds of “information exhaust” being generated passively by people as they go about their daily lives. But the devil’s in the details.
My CrisisMappers colleagues have argued from the beginning that real-time reporting on the impacts of crises by the affected population themselves must be a foundational methodology for how Global Pulse functions at the country level. If you want to learn to listen to the voices of the vulnerable, they said, you must give them a voice. It’s hard to argue with that logic.
At the same time, our colleagues in the official statistics community rightly point out that governments cannot make policy decisions based on information from untrusted sources. While acknowledging the validation delays inherent in statistics and the need for real-time data, they note that crowdsourcing from the masses tends to yield messy, unverified reports that may amount to little more than hearsay – or worse. The general consensus is that it would be irresponsible to use this kind of information as a substitute for high-quality, validated statistical data – especially when it comes to deciding where to adjust social safety nets during crises. It’s also hard to argue with that logic. Information crowdsourced from the masses ought not to be relied upon as the sole basis for policy responses to the kinds of slow-onset development crises that Global Pulse is looking at.
So we had a bit of a challenge. Our starting point was the 39+ existing UN sector-specific early warning systems and a growing number of mobile phone-based data collection efforts by UN agencies. All of these were potentially useful information sources, but they weren’t enough to get us there. Our mandate was to provide decision-makers with cross-sectoral information on the plight of the vulnerable that was actionable, accurate and real-time. Some of the other potential kinds of data on the list were:
- Real-time information exhaust of unknown utility,
- Real-time crowdsourced information of unknown accuracy, and
- High-quality survey data that is years out of date.
Yikes. It’s a bit like that old business adage, “Fast, cheap and good: you can have any two.” Where to go from here?
The breakthrough came when we realized that there was already a highly successful model to draw on in the field of public health: outbreak investigation. Public health institutions everywhere monitor real-time data for anomalies that might represent disease outbreaks. Outbreak investigation teams from Ministries of Health, for example, collect and analyze reports on sightings of dead birds, sales of certain medications, and case reports on hospital patients presenting with certain symptoms to detect outbreaks of potentially dangerous strains of avian influenza. When a pattern begins to emerge in weekly reports coming from a remote village, for example, a team is sent in to investigate. Their first step is to confirm that an outbreak of some kind is indeed underway. Once they have established this fact, they move to the verification step, sending samples to a diagnostic laboratory for analysis. When the lab results come in and government experts know what they are dealing with, they initiate an appropriate response to contain the outbreak and treat those affected. So outbreak investigation follows a process of detection, investigation, verification and response in which real-time “circumstantial” evidence is monitored for anomalies that might trigger subsequent investigation and targeted collection of scientific evidence.
Why couldn’t Global Pulse adopt a similar model, but apply it to cross-sectoral monitoring for impacts in agriculture, nutrition, livelihoods, education, etc.? When families affected by external shocks begin coping by cutting back on meals, pulling their kids out of school to work in the market, or forgoing medical care, could we detect enough of a signal early on to realize that something bad could be happening, investigate through crowdsourcing of citizen reports, and then verify the nature and severity of the impacts through rapid impact assessments? If a similar process could work for Global Pulse, it could help leaders gather the hard evidence they need to intervene with agile, targeted policy responses to protect these families from downstream impacts such as malnutrition, lack of education, or health problems.
The more we thought about it, the more sense it made: in lieu of an unsustainable and impractical approach to real-time monitoring based purely on household surveys, we could adopt an agile, reactive, targeted, and phased approach that could still allow government ministries to monitor a large fraction of the population. Figuring out exactly how to do this will be the work of our Pulse Lab teams, beginning in Uganda in the next few months, but we know that our monitoring approach will incorporate three phases:
1. DETECT: government monitors real-time information exhaust for anomalies.
All over the developing world, people are increasingly using mobile phones to access services for banking, sending money to family, cashing in food vouchers, sharing prices of agricultural products, buying and selling goods, furthering their education, and seeking information on every topic imaginable. We believe governments can analyze information exhaust generated as a by-product of the use of these services to detect the early impacts of crises on vulnerable populations. Google’s “Flu Trends” works this way, by looking for signature patterns in the information exhaust created by online searches, such as increases in searches for terms like “fever.”
In order to understand what even constitutes an anomaly, we’d need to establish a baseline and perhaps incorporate contextual data (e.g. social practices, market data, remote sensing, etc.) from a variety of sources. We could then conduct a retrospective analysis of historical data sets to learn to characterize the “signatures” of known previous impacts on different population groups. When a particular impact began to be felt at the household level back in 2009, for example, what changes in collective behavior could be observed in patterns of buying, selling, transferring funds, or seeking information? Once we know how these early impacts manifest themselves in usage patterns of programs and services, we will be able to train software to detect this signature the next time it shows up and alert government officials of potential problems.
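To make the baseline idea concrete, here is a minimal sketch in Python of one simple way an anomaly could be flagged: compare recent values against the mean and spread of a historical baseline. This is only an illustration, not the method the Pulse Labs will use. The function name, the threshold of three standard deviations, and the weekly transfer counts are all hypothetical assumptions.

```python
from statistics import mean, stdev

def find_anomalies(baseline, recent, threshold=3.0):
    """Flag recent values that sit far from the historical baseline.

    Returns a list of (index, value, z_score) tuples. The z-score test and
    the default threshold are illustrative assumptions, not a prescribed method.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline
    if sigma == 0:
        raise ValueError("baseline has no variation; cannot compute z-scores")
    flagged = []
    for i, value in enumerate(recent):
        z = (value - mu) / sigma
        if abs(z) >= threshold:
            flagged.append((i, value, z))
    return flagged

# Hypothetical weekly counts of money transfers received in one district.
baseline_weeks = [100, 104, 98, 101, 97, 103, 99, 102]
recent_weeks = [101, 96, 70]

anomalies = find_anomalies(baseline_weeks, recent_weeks)
for index, value, z in anomalies:
    print(f"recent week {index}: value={value}, z={z:.1f}")
```

With these made-up numbers, only the sharp drop in the third recent week is flagged. A real signature would almost certainly need to account for seasonality and the contextual data mentioned above, which a plain z-score ignores.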
While it’s certainly true that many of the most vulnerable populations do not have access to cell phones, there are many UN programs that specifically target these populations, and analyzing information exhaust based on how these programs are used will, we believe, also yield useful results. The World Food Programme, for example, runs school feeding programs that collect school attendance metrics. Yet this information is only used to monitor program performance and is not published. We believe such program-generated information exhaust could be quite useful for vulnerability analysis. In other words, there are ways to reach populations with no mobile phone access. Finally, it’s worth noting that it will take our country-level Pulse Labs a few years to develop and refine the analytical methodologies required here before the model is ready for broader adoption, and a few years from now, many who do not have mobile phones today will have them. Mobile penetration is accelerating in developing countries. There is plenty for us to learn in the meantime.
2. INVESTIGATE: government elicits initial reports directly from vulnerable communities.
Once a concerning pattern has been detected in the information exhaust being generated by a particular community, government officials could blast out text messages to a network of citizen reporters to initiate an investigation. The idea here is to confirm either that the community in question is, in fact, being impacted, or that there are widespread public perceptions that this is the case. This might be initiated via a mass broadcast to a randomly selected population sample, but it would likely be more useful if it were targeted at a trusted network of preselected citizen reporters. Imagine if government ministries had a directory with the mobile phone numbers of all of the community health workers, teachers, radio station hosts, and youth volunteers and could ask, “Are there serious food shortages in your community, or has food become unaffordable? How are you coping? What do you believe is causing these problems?” They could even request that members of this network actively elicit more information from their patients, students, audience and parents.
Note that this scenario doesn’t really qualify as crowdsourcing according to some definitions, because it involves reaching out to a trusted network rather than the masses. Ushahidi’s Patrick Meier has referred to this technique as “bounded crowdsourcing.” We believe a bounded approach could have significant value as an intermediary confirmation step to help justify deployment of an onsite rapid assessment team to gather solid statistical evidence. There may also be cases where unbounded crowdsourcing (i.e. from the masses) would be useful, though the Pulse Labs will need to experiment with these techniques in different contexts to determine whether they are appropriate. And it’s not hard to imagine that over time certain citizens contacted via unbounded crowdsourcing would prove their credibility in the eyes of government through consistently accurate reporting, to the point that their first-hand reports might come to carry more weight than those from unknown sources.
3. VERIFY: government sends in a rapid impact assessment team.
Should citizen reports confirm that the population under investigation does at least perceive that they are being significantly impacted, the next step would be to send in a team to gather the hard statistical evidence on the nature and severity of the impacts and establish their underlying causes. Armed with hard evidence that they would otherwise never have known they needed to collect, government would now be in a position to implement an appropriate policy response – hopefully within a matter of weeks after the initial signal was detected.
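The way the phases hand off to one another can be sketched as a small state machine in Python. This is a conceptual illustration only. The phase names follow the three steps above plus the policy response, while the rule for moving between them (advance on confirmation, otherwise return to monitoring) and all identifiers are our own simplifying assumptions.

```python
from enum import Enum

class Phase(Enum):
    DETECT = "monitor information exhaust for anomalies"
    INVESTIGATE = "elicit reports from citizen reporters"
    VERIFY = "send a rapid impact assessment team"
    RESPOND = "implement a policy response"

# Order in which the phases follow one another.
PHASE_ORDER = [Phase.DETECT, Phase.INVESTIGATE, Phase.VERIFY, Phase.RESPOND]

def next_phase(current, signal_confirmed):
    """Return the phase that follows `current`.

    Assumption for illustration: a confirmed signal moves the process one
    step forward, and an unconfirmed one sends it back to routine monitoring.
    """
    if not signal_confirmed:
        return Phase.DETECT
    position = PHASE_ORDER.index(current)
    return PHASE_ORDER[min(position + 1, len(PHASE_ORDER) - 1)]

# Hypothetical walk-through: an anomaly is detected, citizen reports
# confirm it, and the assessment team finds hard evidence.
phase = Phase.DETECT
history = [phase]
for confirmed in (True, True, True):
    phase = next_phase(phase, confirmed)
    history.append(phase)

print(" -> ".join(p.name for p in history))
```

The point of the sketch is that each costlier step is only triggered by the cheaper one before it, so the expensive onsite assessment is reserved for cases where earlier evidence justifies it.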
To sum up, we believe that we can work with governments to develop an approach where they are able to fuse real-time information exhaust and statistical contextual data to detect anomalies, use selective elicitation of citizen reports from trusted sources to confirm that an onsite assessment is required, and then send in a team to get the hard evidence. So the answer is a provisional “yes” – we do believe there is a potentially useful augmentative role for crowdsourcing (in one form or another) in Global Pulse, provided that our Member State partners are game to explore the utility of this technique through the Pulse Labs, which will be trying out a variety of emerging technologies and experimenting with many different approaches to real-time monitoring. Platforms like Swift River and Riff, which were initially developed for sudden-onset emergencies, may also prove useful for the development crises Global Pulse is focused on. Now it’s time to find out what actually works!
We believe the phased investigative approach I have described here has the potential to help accelerate our transition to a world in which development decisions, like those in the private sector, are based on real-time evidence. Yet there is much to learn, and there will be many, many iterations on the ground as we go from concept to implementation. Along the way, the technical, methodological and political challenges will be significant, not to mention the clear need to address concerns around individual privacy, data security, data sovereignty, and intellectual property fully and transparently. We’d love your feedback and your ideas on approach as we move forward.