Sex disaggregation of social media posts

PARTNERS: Data2X, University of Leiden

PROGRAMME AREA: Real-time Evaluation 

LAB: Pulse Lab New York

DOWNLOAD SUMMARY: UN Global Pulse, 'Sex-Disaggregation of Social Media Posts,' Big Data Tools Series, no. 3, 2016 [PDF]


The plight of women and men differs in many ways, and one step towards understanding those differences is sex-disaggregation of available data. Global Pulse collaborated with Data2X and the University of Leiden to develop and prototype a tool to infer the sex of Twitter users. The tool automates the process of looking up public information from Twitter profiles, in particular the user name and profile picture. Using open source software, the tool checks user names against a built-in database of predefined names (from sources such as official statistics) that carry gender information. When the user name alone is not enough to discern sex, the tool analyses the profile photo using face recognition software.
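The two-stage approach described above (name lookup first, profile photo as a fallback) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the tiny `NAME_TO_SEX` table, the function names, and the `photo_classifier` callback are all hypothetical stand-ins, and the photo analysis step is left as a pluggable function because the source does not say which face recognition software was used.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the built-in name database. The real tool draws
# on sources such as official statistics, which are not reproduced here.
NAME_TO_SEX = {
    "maria": "female",
    "anna": "female",
    "john": "male",
    "ahmed": "male",
}


def infer_from_name(display_name: str) -> Optional[str]:
    """Return 'female' or 'male' if the first token is a known name, else None."""
    tokens = display_name.strip().lower().split()
    if not tokens:
        return None
    return NAME_TO_SEX.get(tokens[0])


def infer_sex(
    display_name: str,
    photo_url: Optional[str],
    photo_classifier: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Try the name lookup first; fall back to the photo classifier if needed.

    `photo_classifier` is an assumed callback that takes a photo URL and
    returns 'female', 'male', or None. It stands in for the face analysis step.
    """
    sex = infer_from_name(display_name)
    if sex is None and photo_classifier is not None and photo_url:
        sex = photo_classifier(photo_url)
    return sex or "unknown"
```

Under these assumptions, `infer_sex("Maria Lopez", None)` resolves from the name alone, while a profile whose name is not in the table is passed to the photo classifier, and anything still unresolved is reported as `"unknown"` rather than guessed.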

Global Pulse used the sex-disaggregation tool to improve an existing real-time online dashboard showing the volume of tweets around priority topics related to sustainable development. The tool was tested on more than 50 million Twitter accounts and you can view the results of applying the tool at:

Developing this prototype tool and testing it on Global Pulse’s dashboard of global development tweets produced several early examples of the insights that can be gleaned about how men and women differ in discussing global development on social media.