Anatomy of a Pulse Lab
Dec 19, 2013
Global Pulse’s network of Pulse Labs serves as a set of regional hubs for big data Research & Development (R&D): developing public sector capacity for utilizing digital data sources and real-time analysis techniques for social development challenges. There are a number of data exploration projects running in the Pulse Labs at any given time. For example, in one project, Pulse Lab Jakarta researchers are working with UNICEF to find out whether analysis of social media data could reveal insights about Indonesian public attitudes toward vaccination. If a reliable methodology is developed, decision-makers will be able to draw upon real-time public opinion insights to help shape and monitor advocacy campaigns, policies and programs in the future.
Our multidisciplinary Pulse Lab teams include a mix of data scientists and analysts, legal experts, and communications and partnerships specialists. Pulse Labs design, scope, and co-create projects with UN Agencies and National Institutions who provide sectoral expertise and guidance, and with private sector or academic partners who often provide the data or analytical and engineering tools needed to execute a project. Team members work together closely, each person playing an important role in Lab research and operations.
Meet a typical Pulse Lab team:
The Chief of Research is responsible for setting the research agenda: scouting and evaluating new opportunities, and then prioritizing and supervising the projects of the research team, which consists of data scientists, analysts and data engineers. Our Chief of Research also spends time interacting with academic and private sector research communities to stay familiar with cutting-edge big data innovations and methodologies that could be adapted for Pulse Lab projects. This is a thought-leadership role: the Chief of Research speaks at conferences around the world to explain how big data matters in international development and to motivate researchers from various fields to get involved.
Working with the Chief of Research is the Data Scientist, who looks for patterns and insights in large data sets. In the Pulse Labs we typically focus on high-volume information streams, such as social media, news articles and mobile network data, where correlations to real-world issues like health, poverty and disease may not be immediately obvious. Through a process known as 'exploratory analysis', a data scientist may find a particular trend visible in a dataset, leading to a hypothesis. The data scientist applies the hypothesis or theory to a particular project that is brought to the Lab by a partner. S/he leads the bulk of the investigation on a project, together with the data analyst and engineers. Finally, the data scientist works with the research team to summarize findings in a “methods paper” so that learnings are shared broadly and the projects and methods can be replicated or expanded by others. The data scientist draws on a network of academic and data-science practitioner communities in order to remain conversant with the latest studies and methodologies.
The Data Analyst works side-by-side with the data scientist to make sense of the data, and often to create visualizations using both open source and proprietary tools. To design and execute projects effectively, the analyst needs to understand both our UN partners' problem statement and the digital data source being analyzed. S/he also needs to decide which analysis and visualization methods best fit the needs of the project. The Data Analyst works iteratively with partners in the UN system to ensure that the end product is as useful as possible in helping colleagues address their problem statement.
The Data Engineer creates and maintains the technical infrastructure that facilitates the practice of data science in a Pulse Lab. In order to accomplish this, the Data Engineer chooses the optimum hardware for the tasks involved in planned research projects, looking at required processing power and data storage needs. Collaborating with the data scientists, the Data Engineer selects the appropriate software and installs the required components. Once this infrastructure is in place, the Data Engineer maintains it by ensuring there is enough capacity, performing software updates and solving any technical problems that may arise with servers or databases. Finally, the Data Engineer prepares large datasets, converting them into usable form for the analyst and data scientist. This preparation might involve writing and running scripts to remove errors, merging multiple datasets or augmenting them with additional information such as geolocation or language.
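To make the preparation step more concrete, here is a minimal sketch in Python of the kind of script described above: it removes faulty records and then merges in a second dataset to add location information. The field names (`id`, `user`, `text`, `location`), the sample records and the helper functions are all hypothetical and chosen for illustration; they do not describe the tools or data actually used in a Pulse Lab.

```python
from typing import Dict, Iterable, List


def clean_records(records: Iterable[dict]) -> List[dict]:
    """Drop records with missing or blank text and remove duplicate ids."""
    seen_ids = set()
    cleaned = []
    for record in records:
        text = (record.get("text") or "").strip()
        record_id = record.get("id")
        if not text or record_id is None or record_id in seen_ids:
            continue  # skip faulty or repeated records
        seen_ids.add(record_id)
        cleaned.append({**record, "text": text})
    return cleaned


def add_location(records: List[dict], locations: Dict[str, str]) -> List[dict]:
    """Merge in a second dataset: attach a location to each record by user."""
    return [
        {**record, "location": locations.get(record.get("user"), "unknown")}
        for record in records
    ]


# Hypothetical sample data, for illustration only.
raw_posts = [
    {"id": 1, "user": "a", "text": "  Clinic open today  "},
    {"id": 1, "user": "a", "text": "  Clinic open today  "},  # duplicate id
    {"id": 2, "user": "b", "text": ""},                        # blank text
    {"id": 3, "user": "c", "text": "Vaccination drive on Friday"},
]
user_locations = {"a": "Jakarta"}  # assumed lookup table: user -> location

prepared = add_location(clean_records(raw_posts), user_locations)
for row in prepared:
    print(row)
```

In this sketch, records that cannot be matched to a location are kept and labeled "unknown" rather than dropped, so that the analyst and data scientist can decide later how to treat them.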
Maintaining strong partnerships is crucial for executing any collaborative big data project. A diverse group of partners is needed to execute a project, with different players bringing pieces of the data-tools-expertise puzzle to the table. This is where the Partnership Coordinator comes into play. The role includes identifying and building relationships with potential partners both inside and outside the UN system, identifying the unique strengths and assets that each brings to the table, and helping to translate between them. The Partnership Coordinator needs to communicate frequently with the data scientists, who are up to date on trends in the big data industry and can suggest new partners able to bring technology tools or data sources to bear on a project in the Lab.
When new partners are on board, the Research Coordinator works with them and the Pulse Lab’s data science team to facilitate brainstorming sessions in order to scope and design the parameters of a specific project. Once the project is kicked off, the Research Coordinator supervises timing and deliverables to make sure it stays on track. This includes making sure that all partners involved in the collaboration have the maximum opportunity to learn from each other throughout the course of the project, so knowledge-management functions are also critical. There are multiple projects going on at any given time in New York and in other Pulse Labs in Jakarta and Kampala, and all of them need to be tracked and managed.
Research Fellows are usually drawn from academia at a postdoctoral level and are invited to work in the Lab, bringing with them their own research question and project plan. The projects focus on analyzing big data sources (e.g. mobile phone data, social media, sensor data) to address development or humanitarian issues. The Research Fellow receives coaching and support from the Pulse Lab team in conducting the research project. In addition, the Research Fellow can access advice and guidance from development experts in the Pulse Lab team who have knowledge of and contacts with UN agencies. The Fellowship Program is a vehicle for introducing new ideas and methods to the Pulse Lab and facilitating the exchange of ideas.
With the expanding use of data in a development context must come new approaches and frameworks for protecting individual privacy. For this reason, the Privacy Officer has a legal background and works to develop and implement practices that enable the safe handling of data.
The Privacy Officer observes the Privacy and Data Protection Principles, consults with external privacy experts from the private and public sectors to ensure that the principles are robust, and follows internal privacy and data protection guidelines to embed good data handling practices into Pulse Lab operations. Before any project is undertaken, the Privacy Officer conducts a Privacy Impact Assessment designed to ensure that risks to privacy are considered. Finally, the Privacy Officer ensures partners are acting in compliance with the UN’s privacy and data protection standards. In order to achieve this, the Privacy Officer consults with experts, the United Nations’ Office of Legal Affairs and other UN entities, and then incorporates provisions related to privacy and data protection into partnership agreements.