Author(s): UN Global Pulse


Predicting forced displacement is an important task for many humanitarian aid agencies, which must anticipate flows in order to provide vulnerable refugees and internally displaced persons (IDPs) with shelter, food, and medical care. While there is growing interest in using machine learning to better anticipate future arrivals, there is little standardized knowledge on how to predict refugee and IDP flows in practice. Researchers and humanitarian officers must decide how to structure their datasets and how to fit their problem to predictive analytics approaches, and they must choose from a variety of modeling options. Most of the time, these decisions are made without an understanding of the full range of available options, and they rely opportunistically on methodologies that have primarily been applied in different contexts and with different goals. In this work, we attempt to facilitate a more comprehensive understanding of this emerging field of research by providing a systematic, model-agnostic framework, adapted to the use of big data sources, for structuring the prediction problem. As we do so, we highlight existing work on predicting refugee and IDP flows. We also draw on our own experience building models to predict forced displacement in Somalia to illustrate the choices facing modelers and to point to open research questions that may guide future work.