What if that speech you just read was entirely AI-generated? UN Global Pulse has released a new study examining the risks that can arise from the malicious use of automated text generation software, and their potential impact. For the study, researchers used open-source tools and data from UN General Assembly speeches to build a fake UN speech generator.
Automated text generation is widely applied across many domains, from marketing to robotics, and is used to create chatbots, write product reviews and even compose poetry. The ability to synthesize text, however, presents many potential risks, and the technology required to build generative models is becoming increasingly easy to access.
To build the speech generator, Global Pulse researchers first created a taxonomy for the machine learning algorithms using English-language transcripts of speeches given by high-level representatives at the UN General Assembly between 1970 and 2015. The goal was to train a language model that could generate text on topics ranging from general issues such as climate change, to the UN Secretary-General’s remarks, to inflammatory and discriminatory speech.
“We used ‘off-the-shelf’ techniques to prove the ease with which such a powerful model can be created. The model is a type of Recurrent Neural Network (RNN) commonly used in situations when you want to predict the next element in a sequence, given previous elements. Being a ‘neural network’ means that it ‘learns’ how to make such predictions; the more examples you can give it the better it becomes,” said Joseph Aylett-Bullock, AI researcher at UN Global Pulse.
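To make the idea of predicting the next element from the previous ones concrete, here is a minimal sketch of a single recurrent cell in plain Python. It is not the model from the study: the class name `TinyRNN`, the five-word vocabulary and the hidden size are all hypothetical, and the weights are random and untrained, so the probabilities it prints are meaningless. It only shows the mechanics, in which a hidden state carries a summary of the words seen so far and is turned into a probability for each possible next word.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyRNN:
    """Hypothetical, untrained recurrent cell over a small vocabulary."""

    def __init__(self, vocab, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab
        self.hidden_size = hidden_size

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = rand_matrix(hidden_size, len(vocab))    # input -> hidden
        self.w_rec = rand_matrix(hidden_size, hidden_size)  # hidden -> hidden
        self.w_out = rand_matrix(len(vocab), hidden_size)   # hidden -> scores

    def step(self, token, hidden):
        """Consume one word and return (next-word probabilities, new hidden state)."""
        one_hot = [1.0 if t == token else 0.0 for t in self.vocab]
        from_input = matvec(self.w_in, one_hot)
        from_memory = matvec(self.w_rec, hidden)
        new_hidden = [math.tanh(a + b) for a, b in zip(from_input, from_memory)]
        probs = softmax(matvec(self.w_out, new_hidden))
        return probs, new_hidden

    def predict_next(self, tokens):
        """Read the words in order, then report a probability for each next word."""
        hidden = [0.0] * self.hidden_size
        probs = [1.0 / len(self.vocab)] * len(self.vocab)  # uniform if no input
        for token in tokens:
            probs, hidden = self.step(token, hidden)
        return dict(zip(self.vocab, probs))


vocab = ["the", "united", "nations", "peace", "security"]
model = TinyRNN(vocab)
next_word_probs = model.predict_next(["the", "united"])
print(next_word_probs)
```

Training, which this sketch omits, would adjust the three weight matrices so that the words that actually follow in the example speeches receive higher probability.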
The AI model was trained in 13 hours at a cost of $7.80 in cloud computing resources. To generate text, it was seeded with the beginning of a sentence (these ‘seeds’ are highlighted in bold in the examples below).
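Seeding can be illustrated with a far simpler stand-in than the study's neural network. The sketch below uses a count-based bigram model, which is not an RNN, trained on a two-sentence toy corpus invented for this example rather than on the UN dataset. The function names `train_bigrams` and `generate` are hypothetical. The generation loop has the same general shape as seeded generation: start from the seed, repeatedly pick a plausible next word, and append it.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """Record, for each word, every word that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, following in zip(words, words[1:]):
        followers[current].append(following)
    return followers


def generate(followers, seed_text, max_new_words=10, rng=None):
    """Extend the seed one word at a time, sampling from the observed followers."""
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    output = seed_text.split()
    if not output:
        return ""
    for _ in range(max_new_words):
        candidates = followers.get(output[-1])
        if not candidates:  # the last word never appeared mid-corpus: stop
            break
        output.append(rng.choice(candidates))
    return " ".join(output)


# Toy corpus made up for illustration only.
toy_corpus = (
    "nuclear disarmament has been a priority for the united nations . "
    "the united nations has been a forum for peace ."
)
followers = train_bigrams(toy_corpus)
speech = generate(followers, "nuclear disarmament")
print(speech)
```

A bigram model only looks one word back, which is why its output drifts quickly. A recurrent network conditions on a longer history, which helps it sustain the style of a formal speech.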
The study showed that, for general political topics, the model was able to match the style and cadence of real UN speeches most of the time. AI-generated paragraphs of this type could easily be made indistinguishable from an official speech with minimal editing by a person.
Category: ‘Regular’ speech-style text given generic prompting on current issues.
Nuclear disarmament has been one of the basic objectives of the United Nations. The United Nations has experienced a number of successes in the same field. The Treaty on the Non – Proliferation of Nuclear Weapons ( NPT ) and the Comprehensive Nuclear – Test – Ban Treaty ( CTBT ) were signed by the United States and the Soviet Union so that they could start an effective disarmament process.
The model was less reliable when producing inflammatory remarks on topics such as immigration or racism, performing accurately only around half of the time. This could be attributed to the formal nature of the dataset and its lack of inflammatory language.
Category: Hateful and politically inflammatory speech
Refugees are terrorists and are taking the lives of their citizens. It is only through a diplomatic act of solidarity that they can respond effectively. It is a humanitarian duty. Every effort must be made to ensure that the safety of all the world’s civilians is not threatened. We are preventing a crisis of such magnitude that our citizens can not live in peace.
The study was intended to raise awareness of the dangers that AI text generation poses to peace and political stability, and to offer recommendations for those in the scientific and policy spheres working to address these challenges.
“With this study, we wanted to bring attention to the availability of AI technology that can be used to spread disinformation, impersonate, or even write hateful and politically inflammatory speech. As a society, we need to establish safeguards against these threats at multiple levels, starting with increased awareness of the risks,” said Dr. Miguel Luengo-Oroz, chief data scientist at UN Global Pulse. “We need to develop technological solutions that can assess the veracity of human communication, and we need to create laws and regulations to prevent threats to human rights.”
The study will be presented at the AI for Social Good workshop on 15 June 2019, during the International Conference on Machine Learning (ICML) in California, USA.