Alfredo Morales is a Postdoctoral Researcher at the New England Complex Systems Institute. David Pastor Escuredo is a Researcher at Universidad Politécnica de Madrid (BIT / itdUPM). In 2014, they collaborated with the UN World Food Programme, Global Pulse, the Digital Strategy Coordination Office of the President of Mexico, and Telefonica Research on the project "Using Mobile Phone Activity for Disaster Management During Floods." A project summary is available online and as a downloadable PDF, and a full technical paper describing the methodologies was published at the IEEE Global Humanitarian Technologies Conference 2014.
In recent years, analysis of mobile phone data has received great interest from scientific researchers, government agencies and businesses across many sectors. Big data analysis can also reveal signals and proxy indicators that reflect certain human behaviors which could be valuable to sustainable development and humanitarian work. These include the structure of social networks, the way information flows among people and the reaction of a whole collective of people to external and critical events like shocks, natural disasters and extreme climate conditions.
This new opportunity to quantify trends in human behavior opens the door to unprecedented research in the social sciences. Historically, social research has been constrained by the size and quality of traditional survey samples. By contrast, big data analytics enables data scientists to track and analyze many millions of human interactions, minute by minute, providing a remarkably large picture of how a society functions. These data are generated as a by-product of users communicating with each other by phone and are routinely held by service providers.
Mobile phone data provides unfiltered signals of the way people behave, and this behavior is highly determined by their economic and cultural circumstances. Translating mobile data analysis into information that can be used for evidence-based decision-making requires contextual cultural and sectoral knowledge as well as knowledge of the local tech landscape.
Despite the clear opportunities provided by the volume and veracity of mobile data, a number of factors impact upon interpretations of mobile data analysis:
- Data applicability: first and foremost, the research must transform data into insights that could improve humanitarian action and policymaking. This process depends on aligning and linking the analysis with key real-world facts.
- Data availability: it is important to consider technological penetration in the communities of interest and, because this data is held by the private sector, the type of subscribers generating the data being analysed and their communication habits.
- Data distribution: data may not be distributed homogeneously across the region being considered, which introduces a bias that should be taken into account when interpreting comparative metrics.
In order to exploit data analytics for humanitarian action and maximise utility for decision-making, one must also triangulate with ‘ground truth data’ and contextual knowledge to accurately characterize social phenomena measured with mobile phone data.
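As an illustration of what triangulating against ground truth can look like, the sketch below compares subscriber counts per region with census population. It is a minimal example and not the code used in the project: the region labels, the figures and the function names `pearson` and `coverage_report` are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def coverage_report(subscribers, census):
    """Compare subscriber counts with census population per region.

    Returns the correlation across regions and the share of each
    region's population that appears in the phone data.
    """
    regions = sorted(set(subscribers) & set(census))
    r = pearson([subscribers[k] for k in regions],
                [census[k] for k in regions])
    coverage = {k: subscribers[k] / census[k] for k in regions}
    return r, coverage


# Hypothetical figures, for illustration only.
subscribers = {"A": 1200, "B": 300, "C": 4500, "D": 800}
census = {"A": 10000, "B": 3000, "C": 40000, "D": 9000}

r, coverage = coverage_report(subscribers, census)
for region, share in coverage.items():
    print(f"{region}: {share:.1%} of the census population")
print(f"correlation across regions: {r:.3f}")
```

A high correlation suggests the sample tracks where people live, while uneven coverage ratios point to the regions where comparative metrics need the most caution.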
Using Mobile Phone Activity For Disaster Management
By necessity and design, Big Data for Development research is interdisciplinary in nature. To explore the potential of mobile phone data for characterizing people’s reactions to natural disasters and to try to improve public policies regarding crisis management (early warning and disaster resilience), Global Pulse gathered an interdisciplinary team composed of researchers from the Technical University of Madrid (UPM) – us – and Telefonica Research, together with experts from the Digital Strategy Coordination Office of the President of Mexico, and the United Nations World Food Programme (WFP). The project was also conceived as an experiment in time-limited public-private collaboration where the data analysis, design and execution.
Using the 2009 Tabasco Floods in Mexico as a Test Case
Telefonica Research and the UPM research team had access to analyze millions of aggregated and de-identified mobile phone datasets from 2009 and 2010. As the data scientists on the project team, we were assigned to identify and frame a high-impact case study. We therefore performed two actions in this stage:
- Gathering socio-economic information on shocks between 2009 and 2010 in Mexico (news, surveys, Government records, civil protection action summaries, etc.)
- Gathering phone data statistics to measure data usefulness and identify potential data sources to generate a pool of data to help interpret trends
After several iterations, and with the help of Global Pulse, WFP and the Digital Strategy Coordination Office of the President of Mexico in defining the scope of the project, we narrowed in on the 2009 floods in the Tabasco region, which had a deep impact on the economy of the region and were studied by UNDP.
Although several previous studies have attempted to characterize shocks, such as earthquakes, this project is innovative and challenging in that it investigated an event that happened five years ago with minimal direct information at our disposal. In addition to the analytical work, we therefore needed to pose some critical questions and use contextual data to evaluate the representativeness of the data and to iteratively establish links between analysis and facts.
The project garnered some interesting results:
- The telecom’s clients form a representative sample of the Tabasco population when compared with 2009 census data.
- The mobile phone data contains accurate signals which identify and locate abnormal patterns when a flood of this magnitude occurs (see ‘How Mobile Data Analysis Works’ section, below).
- Changes in the maximum peak in the volume of calls serve to identify the main affected area. To validate this result, LANDSAT data from NASA was used to obtain a segmentation of the flooded areas covering the interval of time of our phone data. Additionally, news and reports were used to confirm the floods.
- The civil protection warning may not be an effective way to raise people’s awareness in the case of flooding. Here we examined the synchronization of trends in people’s activities (a data-deduced human behavior variable) with the rainfall levels during the floods (an external fact variable) and the alerts triggered by civil protection warnings (an objective human action variable). This strategy allowed us to hypothesize a lack of awareness among the population during the days of maximum precipitation. There was no discernible impact as a direct result of the civil protection alert, which is a useful indication of the utility of some diffusion strategies promoting early response in the case of a possible disaster for a specific region.
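One simple way to look at the synchronization between call activity and rainfall described in the last result is a lagged correlation. The sketch below is an illustration only, not the method from the technical paper: the daily series are made-up values and the names `pearson` and `lagged_correlations` are assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlations(driver, response, max_lag):
    """Correlate driver[t] with response[t + lag] for lag = 0..max_lag."""
    if len(driver) != len(response):
        raise ValueError("series must have the same length")
    results = {}
    for lag in range(max_lag + 1):
        n = len(driver) - lag  # number of overlapping days at this lag
        results[lag] = pearson(driver[:n], response[lag:])
    return results


# Hypothetical daily series, for illustration only.
rainfall_mm = [0, 0, 5, 20, 40, 10, 0, 0, 0, 0]
call_volume = [100, 100, 100, 100, 105, 120, 140, 110, 100, 100]

correlations = lagged_correlations(rainfall_mm, call_volume, max_lag=3)
best_lag = max(correlations, key=correlations.get)
print(f"call volume follows rainfall most closely at a lag of {best_lag} day(s)")
```

In this made-up series the call volume follows rainfall with a two-day delay, so the correlation peaks at a lag of 2. Marking the alert date on the same timeline would then show whether activity changed around the alert or only around the rain.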
How Mobile Data Analysis Works
Communication activity is determined by the number of phone calls made by each user while communicating from a given location, identified by the nearest telecom carrier antenna. By looking at the set of antennas that serve users through time, we can create mobility network maps from individual trajectories that depict the flow of people moving across a region.
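The trajectory step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the project's actual pipeline: the record layout `(user_id, timestamp, antenna_id)`, the sample records and the function name `build_mobility_network` are hypothetical.

```python
from collections import Counter, defaultdict


def build_mobility_network(records):
    """Count movements between antennas from de-identified call records.

    records: iterable of (user_id, timestamp, antenna_id) tuples.
    Returns a Counter mapping (origin_antenna, destination_antenna)
    to the number of observed transitions.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, antenna_id in records:
        by_user[user_id].append((timestamp, antenna_id))

    flows = Counter()
    for events in by_user.values():
        # Order each user's calls in time to obtain a trajectory.
        events.sort(key=lambda event: event[0])
        for (_, origin), (_, destination) in zip(events, events[1:]):
            if origin != destination:  # ignore consecutive calls from one place
                flows[(origin, destination)] += 1
    return flows


# Hypothetical records, for illustration only.
records = [
    ("u1", 1, "A"), ("u1", 2, "A"), ("u1", 3, "B"),
    ("u2", 1, "A"), ("u2", 5, "B"), ("u2", 9, "C"),
    ("u3", 2, "B"), ("u3", 4, "A"),
]

flows = build_mobility_network(records)
for (origin, destination), count in sorted(flows.items()):
    print(f"{origin} -> {destination}: {count}")
```

Each pair of antennas becomes a directed edge weighted by the number of people seen moving between them, which is the raw material for a mobility map.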
By studying spatial and temporal variables, we were able to estimate population distribution, identify critical areas and observe abnormal temporal patterns that might be used to predict events.
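A common way to make "abnormal temporal patterns" concrete is to score each day's activity against a baseline period. The sketch below uses a plain z-score as a stand-in technique; it is not taken from the project, and the call counts, the threshold of 3 and the function names are assumptions made for the example.

```python
from statistics import mean, stdev


def anomaly_scores(baseline, observed):
    """Return z-scores of observed counts relative to a baseline period."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; z-scores are undefined")
    return [(value - mu) / sigma for value in observed]


def flag_anomalies(baseline, observed, threshold=3.0):
    """Return the indices of observed days whose |z-score| exceeds threshold."""
    scores = anomaly_scores(baseline, observed)
    return [day for day, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical daily call counts for one antenna, for illustration only.
baseline_calls = [100, 104, 98, 101, 97, 102, 99, 103]
flood_period_calls = [101, 99, 160, 150, 102]

abnormal_days = flag_anomalies(baseline_calls, flood_period_calls)
print("abnormal days (index within the observed period):", abnormal_days)
```

Running the same scoring for every antenna and mapping the flagged ones gives a rough picture of where and when activity departed from normal.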
This project is a promising illustration of the potential of big data as a new tool for humanitarian action and human development. It has also been a good test of the challenges that arise in conducting this type of research, from the methodological effort to generate actionable insights from mobile phone data to the organizational complexity of aligning private and public stakeholders with different backgrounds and interests.
Our experiences demonstrated the value of analytical work being set in a social and geographical context, to create a hybrid social-analytical scientific methodology. This hybrid made it possible to exploit mobile phone data to gain insights into a natural disaster that occurred in 2009, identifying the most damaged areas, learning about affected populations and estimating the level of the population’s awareness of risk. Although extensive validation and standardization work is needed, these types of tools could be implemented as real-time indicators within existing risk management workflows for monitoring and planning and to improve civil warnings or resource allocation.
Interdisciplinary collaboration uniting government, international organizations, academia and the private sector is the key to utilizing and expanding the realms of big data for a better understanding of societies and to inform decision-making processes. Frameworks of collaboration and data sharing will be critical in turning big data into a key resource for improving actions and policies.
Top Image: Huitzil from Villahermosa, Mexico – Via Méndez Anegada under Creative Commons