Combining “Big” and “Small” Data to Build Urban Resilience in Jakarta


Interview with Etienne Turpin and Tomas Holderness, directors of

The SMART Infrastructure Facility project aims to help communities tackle the chronic problem of flooding in the Indonesian capital, using a combination of crowdsourced data, social media and big data analysis.

Pulse Lab Jakarta (PLJ) conducts Big Data for Development research, so we were interested to hear from directors Etienne Turpin and Tomas Holderness about their perspectives on citizen engagement and the role that big and small data can play in increasing urban resilience.

PLJ: How did you come up with the idea of

Etienne: Jakarta has the fastest rate of urbanization in the world, but it also has an incredibly high rate of social media usage. Conceptually, we wanted to see if we could crowdsource data about this complex urban system through social media networks to help produce strategies to respond to infrastructure problems that come along with rapid development. Pragmatically, the project developed out of work that Etienne was already doing about flooding in the kampungs of Jakarta and Tomas’s work using network models to improve sanitation in Nairobi.

We came together to design and implement as a means to gather, select, and sort information that we glean from social media. Because citizens use social media with an incredible frequency to talk about flooding, we realised that if we could make use of this data in a safe and anonymous manner, we would have tremendously valuable data to help us map and model the floods in real time.

Before mobile phones and social media were ubiquitous in the urban landscape, such widespread information was very difficult to collect or simply didn’t exist. We treat mobile devices and the social media platforms they support as potential data resources within the urban system; they are not merely relaying social noise if the latent data they contain can be made intelligible. This requires moving from noise to knowledge, which we attempt to do through a GeoSocial Intelligence Framework.

PLJ: What exactly do you mean by a “GeoSocial Intelligence Framework”?

Tomas: Social media data contains geo-located information. For example, through a mobile phone’s GPS, a user can connect their message to a specific location on the Earth’s surface. Taken individually, these messages are normally not very interesting; however, when we combine a large number of similar geo-located points about a specific topic, we can begin to visualize patterns within complex urban systems that were previously undetectable.

We can see how many people are talking about the same situation, and where and when these conversations are taking place. We have developed a GeoSocial Intelligence Framework to help us analyse patterns in the infrastructure of complex urban systems that are latent within social media data (e.g. making a real-time map of flooding in Jakarta by collecting Twitter data).

Another important aspect of the framework is that it makes data anonymous, so that we can’t see who said what; we are not interested in individual users, only what is happening in the city as a system. In the same way that democratic voting is done anonymously, we anonymise the data to avoid any retribution that might result from citizen reports about flooding or the causes of flood events.

To summarize, our GeoSocial Intelligence Framework is one way of integrating a crowdsourcing methodology, social media data, and big data analysis to enable evidence-based, real-time decision making about urban infrastructure. We are now pilot-testing this framework as a means to coordinate the response to extreme weather events, including flooding in Jakarta, but it is transferable to other cities, other languages, and other urban issues, like waste and sewage management, traffic congestion, etc.
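For readers who want a concrete picture of the idea Tomas describes, here is a minimal Python sketch of the two steps: dropping user identity from each report, then counting topic-specific reports per location and hour. It is an illustration only, not the project's actual implementation. The record fields (`user`, `lat`, `lon`, `time`, `text`), the function names, the 0.01-degree grid (roughly 1 km in Jakarta) and the keyword filter on "banjir" (Indonesian for "flood") are all assumptions made for this example.

```python
import math
from collections import Counter
from datetime import datetime


def anonymise(report):
    """Keep only location, time and text; drop every other field,
    including anything that identifies the user (hypothetical schema)."""
    return {key: report[key] for key in ("lat", "lon", "time", "text")}


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def count_flood_reports(reports, keyword="banjir", cell_deg=0.01):
    """Count anonymised reports mentioning the keyword, per grid cell and hour."""
    counts = Counter()
    for report in map(anonymise, reports):
        if keyword in report["text"].lower():
            # Truncate the timestamp to the hour so reports group by time window.
            hour = report["time"].replace(minute=0, second=0, microsecond=0)
            cell = grid_cell(report["lat"], report["lon"], cell_deg)
            counts[(cell, hour)] += 1
    return counts
```

A single report says little on its own, but a cell whose count climbs within one hour is the kind of pattern that could be drawn on a real-time map. A real pipeline would also need to handle reports without coordinates, retweets, and messages that mention flooding without describing a current event.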

PLJ: Many people see “small,” citizen-generated, crowdsourced data and big, algorithmically-generated data as antithetical. Your project instead tries to “mash” the two. What do you think big data can contribute to crowdsourcing efforts?

Etienne: We are forming large, detailed data by integrating information from a number of sources, including citizen-generated, crowdsourced projects and tweets, to build a cohesive picture of flooding across the city in real time. Big data analysis tools help us make sense of this complex “mesh” of information which often contains unstructured or semi-structured components.

Through our work with different groups in Jakarta, we’ve shown that we need to understand urban infrastructure as a set of spatially and temporally interconnected systems. By bringing data together from different projects and communities, we can analyse the connectivity between different infrastructure and how these components might respond to flooding.

This is particularly important when discussing rivers, for example, because we need to understand how the practices of one community upstream may impact another community or infrastructure component downstream. With 13 rivers and over 1,100 km of waterways, Jakarta has an especially complex hydraulic network. By integrating and visualising data throughout the whole urban system, and ultimately throughout the entire watershed, we can generate a much more comprehensive, evidence-based understanding of how to prioritize improvements and strategically manage infrastructure.

PLJ: You spent quite a lot of time with local communities in Jakarta and in other cities in Southeast Asia to scope out your project. Do you think that both “small” and big data can help improve their resilience? 

Tomas: The work of ethnographers such as Anna Tsing influenced our research; for Tsing, the development of participatory resource co-management practices is a key to improving resilience. While her work has considered these co-management practices in predominantly rural areas, we are interested in exploring the potential for civic co-management practices within complex urban environments. We are interested in how citizen-led mapping initiatives can be integrated into our network models through new crowdsourcing methods.

Traditionally, it has been quite tricky to implement conventional models of the urban system in rapidly urbanising cities like Jakarta. To understand resilience we need to model how the city will respond to specific events, such as flooding. We are tackling this challenge in a novel manner by integrating a multitude of “small” crowd-sourced data and new big data sources (such as tweets) to help build up a picture of the city’s infrastructure during the monsoon season.

Our aim is to use this data to improve our models of the urban system so that we can recommend and implement better strategies for adaptation. This is particularly urgent as changing weather systems mean that Jakarta is experiencing more intense and more frequent periods of flooding. We have been working with communities to gather and share information, and we want to involve them from the very start because our models are only as good as the information we can put into them! We also share our findings with these groups to help them make more informed decisions about their response.

We believe that a GeoSocial Intelligence Framework designed to integrate both small (citizen-led, bottom up) and big data sources can produce the most useful system for studying and promoting urban resilience.

Dr. Etienne Turpin @turpin_etienne is a Vice-Chancellor’s Postdoctoral Research Fellow at the SMART Infrastructure Facility, University of Wollongong; he lives and works in Jakarta, Indonesia, where his research helps co-produce strategies for community resistance and social resilience among informal settlements of the urban poor. He co-directs the research project.

Dr. Tomas Holderness @iHolderness is a Geomatics Research Fellow at the SMART Infrastructure Facility, University of Wollongong; his research focuses on the use of geospatial analysis, Earth observation, and network modeling techniques applied to urban infrastructure resilience and Earth systems engineering. Dr. Holderness co-directs the research project.

Giulio Quaggiotto @GQuaggiotto is Pulse Lab Manager of Pulse Lab Jakarta, in Indonesia.

Image: Jakarta floods by World Bank licensed via Creative Commons
