This text was originally published by UNHCR Innovation Service. Main image by UNHCR Innovation Service.
How a project combines data modeling with collaboration across UN agencies to anticipate Venezuelan refugee movement into Brazil.
Venezuelans have set up camp just outside the border with Brazil, waiting for it to reopen. That’s how eager they are to cross into the country as so many others already have — or at least they did, until the Brazilian government closed the border in March 2020 in the early days of COVID-19 to help prevent the spread of the disease. There’s no telling how many others are just waiting for the chance to leave Venezuela for Brazil, to escape both political and economic turmoil in search of a better life.
Before the border closed, the UN Refugee Agency (UNHCR) was registering an average of 540 refugees per day crossing into Brazil. But once the border shut down, with no one knowing for sure when it might reopen, a vitally important question arose: How can UNHCR predict how many refugees will come to Brazil from Venezuela once the border reopens, and what does UNHCR need to do to be prepared?
UNHCR works closely with the Brazilian government, which leads Operation Welcome, its response to the ongoing arrival of Venezuelan refugees into the country. Starting in 2019, and as part of regular emergency preparedness activities, UNHCR Brazil and other UN agencies had been creating a contingency plan to prepare for any unexpected increase in new arrivals or other scenarios that might require a rapid response.
Of course, no one’s contingency plans included COVID-19 — but it quickly became integral to UNHCR Brazil’s, says Arturo de Nieves, Senior Field Coordinator at the Sub-office in Boa Vista, which includes territories very close to the border.
“Even closing the border is an extraordinary situation,” he explains. “So we started to rework the contingency plan to prepare for the reopening of the border — and to address the challenge of refugees who are managing to cross the border anyway, because it’s a very long stretch of land for the government to patrol.”
UNHCR and Brazil have agreed on ways to register and assist the refugees that are still entering the country. But the most urgent question remains: What will happen when the border reopens? Will COVID-19 mean more people fleeing Venezuela or fewer new arrivals? And how can UNHCR Brazil best prepare for an outcome in an unprecedented scenario?
It was in this context, de Nieves says, that the Head of Boa Vista Sub-office, Oscar Sánchez Piñeiro, reached out to both UNHCR’s Innovation Service and UN Global Pulse (UNGP), the UN Secretary-General’s digital innovation initiative that uses data and analytics to understand the needs of displaced people and get real-time feedback on how well policy responses are working. Both groups quickly coordinated and responded to the sub-office.
“We wanted to have an estimate,” he explains. “That’s when we started working on the models, because we wanted to have some tools that might give us an idea of how the influx of refugees will take place after the border reopens.”
Using data to anticipate scenarios in an uncertain future
This isn’t the first time UNHCR has employed what’s known as predictive modeling, and it certainly won’t be the last. Predictive modeling uses relevant data points, such as historical information about when people tend to flee their homes and why, to identify patterns that can be turned into models. These models are then used to predict future patterns.
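As a rough illustration of turning historical data into a pattern, the sketch below averages past arrivals by calendar month to form a naive seasonal baseline. This is not the method used by UNHCR or UNGP; the function name `seasonal_baseline` and every figure in `history` are hypothetical placeholders, not real arrival data.

```python
from collections import defaultdict
from statistics import mean


def seasonal_baseline(history):
    """Average arrivals per calendar month across past years.

    history: iterable of (year, month, arrivals) tuples.
    Returns a dict mapping month -> mean arrivals for that month.
    """
    by_month = defaultdict(list)
    for _year, month, arrivals in history:
        by_month[month].append(arrivals)
    return {month: mean(values) for month, values in by_month.items()}


# Hypothetical placeholder figures, NOT real UNHCR data.
history = [
    (2018, 1, 400), (2018, 2, 500),
    (2019, 1, 600), (2019, 2, 700),
]

baseline = seasonal_baseline(history)
for month in sorted(baseline):
    print(f"month {month}: expected arrivals ~ {baseline[month]}")
```

A real model would add drivers beyond the calendar, but even a baseline like this gives later predictions something to be compared against.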
UNHCR’s Innovation Service had already launched UNHCR’s first predictive modeling project, in Somalia. Known as Project Jetson, it’s an ongoing experiment that uses machine learning techniques to make predictions based on historical data. The results are presented on a dashboard that shows the predicted displacement of people in Somalia’s different regions.
Although no two situations are exactly alike, there are common threads such as the drivers of displacement, including conflict, hunger, extreme weather conditions, lack of medicine, or the pursuit of a sustainable livelihood, to name a few. But not every country has the same operational context.
In Brazil, for example, under normal conditions, refugees are registered at the border reception center and then are free to move around the country and integrate into the population. Only those who are especially vulnerable live in UNHCR shelters for a period of time. But if there’s a rapid arrival of refugees — including those who are currently waiting right outside the border — will the shelter capacity be enough?
“The work is divided into three projects,” says Rebeca Moreno Jimenez, an Innovation Officer (Data Scientist) with the Innovation Service. “One is working on predictions for when the border reopens, another is estimating how many people are currently already in the different cities in real time, and the other project is on simulating shelter capacity. How many people do the shelters cater to? What if people aren’t able to move or people are infected with COVID-19 and the area needs to be divided within the shelters? They are trying to design better reception conditions and ways to take good care of people when they arrive.”
Approaching data collection creatively to fill informational gaps
To build a model that can answer these questions, a team from UNGP began by looking at what data was available to them. COVID-19 is an extraordinary situation, and about a year’s worth of refugee arrival data is incomplete because the border has been closed. So the team had to look back to data from 2017 through 2019, and to review other known data about why people tend to flee their homes.
“The reasons people leave are a baseline: economic hardship, conflict, political tensions, food and water shortages — factors that are constantly driving people to depart,” says Katherine Hoffmann Pham, an Artificial Intelligence (AI) researcher with UNGP. “Then there are COVID-related factors, which are going to be a bit more speculative.”
For example, she says, data from 2018 and 2019 can be useful in tracking the common reasons people leave, such as protests, which typically correlate with departures. But because no one can know for certain how COVID-19 will affect the movement of people from one country to the next, UNGP is working on gathering more data for what Pham refers to as “nowcasting,” using sources such as Twitter, satellite, and public radio data that reflect what’s happening in real time.
There’s other data that can be used to conduct predictive modeling in scenarios where there isn’t complete data, adds Sofia Kyriazi, an Associate Innovation Officer and Artificial Intelligence & Data Engineer for the Innovation Service, who also worked on data collection and analysis. Monitoring bus schedules people might use to get to the border — are they still running or are they changing? — is one example. The Innovation Service also uses other data sets such as Google Trends and Facebook public data to look at the population density in areas near the border, including how many people might be camping there.
“We rely a lot on our colleagues in Brazil, who have additional information such as the fact that people get paid every two weeks, which might mean they would have the money to flee at that time,” she says.
“All of this data is especially important in a situation like this, because we don’t have very much historical data or even context within migration and displacement in a global pandemic,” adds Catherine Schneider, an Assistant Innovation Officer with the Innovation Service.
According to Moreno Jimenez, having as much accurate data as possible is essential to predictive modeling. In fact, she says it’s unethical to build models with incomplete data, which is why the project team had to be creative about gathering data beyond their typical resources.
“COVID is affecting the collection of humanitarian data sets,” she explains. “Normally we’d have people from field operations or other humanitarian agencies out gathering population data, but everything was completely locked down for three or four months. Eventually, they started gathering the data again, with masks and social distancing, but there’s still a huge chunk of data from everywhere that’s missing from part of 2020.”
The modeling has continued, though, using proxy data from as many sources as possible to make the data set as complete as it can be under the circumstances. The work includes what Pham calls a queueing simulation tool, which models people waiting in the border regions. The impact of COVID-19 remains an open question, however, and could skew the number of arrivals by perhaps 20% either way.
“What are the different consequences for the ability to shelter them and house them and the need for supplies?” she says. “This tool can say, ‘Here are the operational consequences of the variations we’ve modeled.’”
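The idea of modeling variations can be sketched in a few lines. The example below takes the pre-closure average of 540 registrations per day mentioned earlier in this article and applies the "perhaps 20% either way" uncertainty to produce low, baseline, and high scenarios. The function names and the 30-day horizon are assumptions made for illustration; this is not the UNGP tool itself.

```python
def arrival_scenarios(baseline_per_day, spread=0.20):
    """Return low / baseline / high daily arrival rates.

    spread is the fractional uncertainty applied in both directions
    (0.20 mirrors the "20% either way" figure quoted in the article).
    """
    low = round(baseline_per_day * (1 - spread))
    high = round(baseline_per_day * (1 + spread))
    return {"low": low, "baseline": baseline_per_day, "high": high}


def cumulative_arrivals(per_day, days):
    """Total arrivals if the daily rate stays constant for `days` days."""
    return per_day * days


# 540/day is the pre-closure average from the article; 30 days is an
# arbitrary planning horizon chosen for this sketch.
scenarios = arrival_scenarios(540)
for name, per_day in scenarios.items():
    total = cumulative_arrivals(per_day, 30)
    print(f"{name}: {per_day} per day, {total} over 30 days")
```

Comparing the three totals side by side is one simple way to see how much shelter space and how many supplies each variation would imply.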
After modeling a number of scenarios, UNGP creates visualizations of the data to make it easier for UNHCR personnel who aren’t data science experts to interpret. That work is done by Patricia Angkiriwang, a Visual Communications and Design Intern at UNGP.
“It looks like an Excel table, with rows for different scenarios and plans,” she explains. “You can toggle the parameters to see how the plan might change if, for example, 1,000 people come to the border when you’re used to registering 500 people per day. The visual component helps you see what may happen over time.”
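The example in the quote above (1,000 people arriving when registration handles about 500 per day) can be expressed as a very small queue simulation. This is a minimal sketch under simplifying assumptions: constant daily arrivals, a fixed daily registration capacity, and nobody leaving the queue. The function `simulate_backlog` is hypothetical and is not the tool described in the article.

```python
def simulate_backlog(daily_arrivals, daily_capacity, days):
    """Track how many people are still waiting at the end of each day.

    Each day, new arrivals join the queue and up to `daily_capacity`
    people are registered. Returns a list with one backlog value per day.
    """
    backlog = 0
    history = []
    for _ in range(days):
        backlog += daily_arrivals
        processed = min(backlog, daily_capacity)
        backlog -= processed
        history.append(backlog)
    return history


# Figures taken from the quoted example: 1,000 arrivals per day against
# a registration capacity of 500 per day.
surge = simulate_backlog(daily_arrivals=1000, daily_capacity=500, days=7)
steady = simulate_backlog(daily_arrivals=500, daily_capacity=500, days=7)
print("surge backlog by day: ", surge)
print("steady backlog by day:", steady)
```

Changing the two parameters plays the same role as toggling values in the table: the backlog stays at zero while arrivals match capacity and grows by the difference each day once they exceed it.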
Bringing people into the equation
As powerful as data and predictive modeling can be in future planning, human engagement is essential, too, both in terms of field expertise and the impact of the modeling recommendations on people of concern.
The project incorporates a human-centered design approach, which puts people at the center of the work as much as possible. Angkiriwang uses the queueing model as an example.
“We first needed to get an understanding of how people move through the system,” she explains. “When they arrive, what happens, how do people know if they need a shelter or not, do they wait outside or inside the border while they’re in line? How does the process work? This involved many conversations with our UNHCR partners, including Rebeca, to get a perspective of what happens on the ground, which is where the collaboration aspect really shone.”
Everyone agrees the partnership between UNHCR Brazil, UN Global Pulse, UNHCR’s Global Data Service and the Innovation Service was central to creating the models quickly and making them something the field operation could put into real-world use.
“There are two stories to be told: One of UNHCR staff that are actually working on the ground to make the decisions and then of course the second story is of the people on the ground who are crossing,” Pham says. “The best tool we’ve had is working with these types of interdisciplinary teams — and the Innovation Service was part of that, because they have a good understanding of the UNHCR operation and the context on the ground, in terms of building that qualitative human component of the story.”
According to Moreno Jimenez, the UNHCR team in Brazil is “fantastic” about obtaining the data needed to build the models. “They’re the ones who meet with us, who tell us their concerns, give us data sets, and are doing the advocacy with the government so we could move on this project,” she says.
“There were a lot of minds working at the same time,” Kyriazi adds. “Everyone brings a specific specialization, which is really informative for the rest of the team.”
Putting the data models to use, now and in the future
Although the models are being built to anticipate the border reopening, they are still useful in the current moment. According to de Nieves, trying to count and register new arrivals entering Brazil despite the border being closed has been challenging.
“We have the estimate for the influence of reopening the border, but now we are trying to come up with models to help us understand how many refugees are in Brazil and where they are,” he says. “Because the border is closed, we don’t have the same control. We want to have more control of the situation and to be able to plan in advance, both now and after the border reopens.”
Of course, everyone is looking forward, because the real proof of the models’ effectiveness will be how they compare to reality when the border reopens.
“When the border opens we want to test the models once we actually start having numbers,” Kyriazi says. “We’ll fine-tune the model using the new numbers to make sure our numbers are correct. After all, we’ve never been in this situation before.”
According to Sánchez Piñeiro, one positive outcome was that the Army took on the team’s displacement predictions and methodology. Working with such a diverse group of colleagues from other agencies also brought some unexpected benefits.
“UNHCR collects huge amounts of data but rarely is able to fully analyze it,” he says. “By working with Global Pulse we were able to start looking at displacement patterns and particular characteristics of refugees and asylum seekers. There are seasonal variances that are important to be able to provide gender-, age-, and time-specific humanitarian support. We are also able to better target information based on the preferred journey. Currently, we will be able to provide artificial intelligence support for refugees on the move based on their specific location. People would be able to receive information on their social media account based on their location that will allow them to have access to documentation, livelihoods, interiorization, or shelter services.”
From de Nieves’ perspective, this initiative has been an excellent one: it is innovative, and he hopes it will become a mainstream solution across UNHCR.
“We are confident this tool will be useful for other contexts,” he says. “It’s a very good example of collaboration between different operations within the UN. We are confident these tools will be well-established in the near future, so we can use them for other operations around the world to help us predict the movement of refugees — using all the tools at our disposal to be as efficient as we can.”