Written by Thijs Brouwers
Last February, OpenAI caused a big fuss with a blog post about their impressive language model, GPT-2, which they say is capable of generating coherent and surprisingly human-like text documents. OpenAI claims it achieves state-of-the-art performance on many language modeling benchmarks without any task-specific training. The big fuss, however, was not (only) about those impressive results, but mainly about OpenAI’s decision to keep the model that generated them a secret.
As they mention in their blog post, OpenAI only released a much smaller version of GPT-2 because the larger model raises concerns about malicious applications of the technology, such as generating deceptive, biased, or abusive language at scale. With this choice, they hoped to start a discussion in the wider AI community about the implications of such systems.
I am currently doing my graduation project at BrainCreators on text summarization, inspired by and building on the advances in language models over the past few years. OpenAI’s post therefore immediately caught my attention. Now that the dust has settled somewhat, we will discuss some of the responses to their decision not to release the model. Before we do that, however, let’s first look at an example of what GPT-2 is capable of.
SYSTEM PROMPT (HUMAN-WRITTEN)
In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.
MODEL COMPLETION (MACHINE-WRITTEN, 10 TRIES)
The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.
Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.
Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans. Pérez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow.
Pérez and the others then ventured further into the valley. “By the time we reached the top of one peak, the water looked blue, with some crystals on top,” said Pérez.
Pérez and his friends were astonished to see the unicorn herd. These creatures could be seen from the air without having to move too much to see them – they were so close they could touch their horns.
While examining these bizarre creatures the scientists discovered that the creatures also spoke some fairly regular English. Pérez stated, “We can see, for example, that they have a common ‘language,’ something like a dialect or dialectic.”
Dr. Pérez believes that the unicorns may have originated in Argentina, where the animals were believed to be descendants of a lost race of people who lived there before the arrival of humans in those parts of South America.
While their origins are still unclear, some believe that perhaps the creatures were created when a human and a unicorn met each other in a time before human civilization. According to Pérez, “In South America, such incidents seem to be quite common.”
However, Pérez also pointed out that it is likely that the only way of knowing for sure if unicorns are indeed the descendants of a lost alien race is through DNA. “But they seem to be able to communicate in English quite well, which I believe is a sign of evolution, or at least a change in social organization,” said the scientist.
From a language generation perspective, the above text is nothing less than astonishing. Let us quickly highlight some surprising aspects of this piece, both factual and semantic. The model knows that a unicorn has a distinctive horn, even though it seems to make a mistake by calling them four-horned. Furthermore, it knows unicorns are mythological animals, and names them after the Roman poet Ovid. The fact that GPT-2 connects an ancient poet who famously wrote about myths to mythological unicorns exhibits a level of contextual awareness that is very rare in language generation. This is as surprising as it is impressive.
From a technical perspective, the model gained its knowledge by learning to correctly predict the next word given a short text prompt, a common approach in language generation models. The training corpus consisted of 40GB of natural language snippets scraped from the internet, and the model itself is a large transformer-based language model with 1.5 billion parameters. Given some arbitrary input in the form of one or several sentences, the model generates synthetic text samples, adapting to the style and content of the conditioning text. This is what we see happening in the example.
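To make the generation process a bit more concrete, below is a minimal sketch of prompt-conditioned sampling. It uses the publicly released smaller GPT-2 checkpoint through the Hugging Face transformers library; the checkpoint name ("gpt2") and the sampling settings (top-k, temperature, length) are illustrative assumptions, not OpenAI’s exact configuration for the full 1.5-billion-parameter model.

```python
# Minimal sketch: condition the released small GPT-2 on a prompt and sample a
# continuation. Assumes the Hugging Face `transformers` library; the checkpoint
# name ("gpt2") and sampling settings are illustrative, not OpenAI's own setup.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = ("In a shocking finding, scientist discovered a herd of unicorns "
          "living in a remote, previously unexplored valley, in the Andes Mountains.")

# Encode the conditioning text, then generate token by token: at each step the
# model predicts a distribution over the next word given everything so far.
input_ids = tokenizer.encode(prompt, return_tensors="pt")
output_ids = model.generate(
    input_ids,
    max_length=200,    # total length in tokens, prompt included
    do_sample=True,    # sample instead of always taking the most likely token
    top_k=40,          # restrict sampling to the 40 most likely next tokens
    temperature=0.9,   # soften the distribution slightly
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The point of the sketch is simply that the model does nothing more exotic than repeatedly predicting the next token given everything before it; it is the scale of the model and its training data, not a task-specific architecture, that makes output like the unicorn story so convincing.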
Looking at the text with our Natural Language Processing hat on, several more things catch the eye. In the paragraph about the discovery that the unicorns spoke English, the model successfully constructs a grammatically difficult sentence: in the quote from Dr. Pérez, “We” is the pronoun of the main clause, correctly referring to Pérez and the scientists, while “they” is the pronoun of the subclause, correctly referring to the unicorns. This is a fine piece of coreference resolution, typically considered a very difficult problem in NLP.
Another aspect that jumps out at us is the correct use of quotation marks, which also requires contextual awareness. In fact, every paragraph in the example shows masterful use of context and meaning. “They were so close they could touch their horns” is not only semantically correct, but seems to show an understanding that this is a relevant, even exciting piece of information for the reader. Similarly, the correct combination of the words “speaking”, “language”, and “dialect” is surprising to see.
There is much more to marvel at, but as a final example, let’s consider the co-occurrence of the concepts of ‘descendants’, a ‘lost race of people’, the ‘arrival of humans’, ‘origins’, ‘creation’, a ‘time before civilization’, and ‘humans and animals “meeting” each other to produce offspring’. Correctly combining these words in meaningful paragraphs and sentences is a truly baffling achievement for a language generation model. To then top it off with the politically incorrect remark that “in South America such incidents seem to be quite common” almost pushes it to stand-up comedy. Perhaps some of us can be forgiven for thinking this is a hoax.
Now that we have seen the surprising feats of the model, let’s do exactly what OpenAI wanted and discuss their decision. Their blog post sparked an intense debate in the AI community, with opinions spread far apart. Many researchers find it strange that a company like OpenAI would choose non-disclosure. After all, what’s in a name?
Some say that this non-disclosure tactic generates hype and propagates fear, while thwarting reproducibility and scientific scrutiny. They consider AI progress to be largely due to open source code and the open sharing of knowledge. According to them, OpenAI is doing the opposite: it tries to take the moral high ground without explicitly demonstrating that its model is actually capable of generating the malicious content that it claims to fear.
Even if we accept OpenAI’s claims, it is still reasonable to ask why they decided to start the discussion on moral threats from AI by describing a model that makes these threats real. They have shared the recipe to reproduce their model, and doing so is fairly easy for anyone with a few AI engineers, some time, and a sufficient amount of computing power. (Put simply, this is because it’s just a commonly used architecture, applied at an immense scale.) By being even slightly open about GPT-2, OpenAI could be causing the very threat it is warning against.
Though this criticism may be apt, it misses the main argument OpenAI put forth. As Jack Clark, policy director at OpenAI, told The Verge: “Eventually someone is going to use synthetic video, image, audio, or text to break an information state.” Jeremy Howard, co-founder of Fast.AI, agrees: “I’ve been trying to warn people about this for a while. We have the technology to totally fill Twitter, email, and the web up with reasonable-sounding, context-appropriate prose, which would drown out all other speech and be impossible to filter.” This is of course something the AI community should watch out for, and with that in mind, we think everyone can agree with the intent behind OpenAI’s decision.
But agreeing with the intent does not mean agreeing with the decision itself. The open source culture of the larger AI community is one of the reasons the field has progressed so rapidly, and by changing that culture, AI actually could become as dangerous as many people fear. We can see this by comparing OpenAI’s decision to the practice, in cybersecurity, of keeping a zero day private.
A zero day is a software vulnerability that its discoverer has not disclosed to the software’s creator. If the discoverer decides to misuse this vulnerability, it becomes a zero day exploit. These exploits are so called because the software vendor gets “zero days” of notice before the exploit becomes active, and therefore zero days to create a patch.
Such zero days are mainly of interest to three groups of people. Criminals use zero day exploits to break into the devices of individuals, companies, or organisations. Security companies have an interest in zero days, as knowledge of them helps them shore up their security products. Finally, the governments of some countries are eager to possess zero days, as exploiting them allows them to break into suspects’ devices undetected, and even to conduct espionage. Fears of such practices became real two years ago, when WikiLeaks announced it had acquired information about CIA hacking tools, which were found to contain many zero day exploits.
Clearly, zero days pose a danger to cybersecurity: if an institution knows about them, people with the wrong intentions can obtain that knowledge as well. To keep the private information of many unsuspecting users safe, zero days should not be exploited. Rather, it would be better if organisations like the CIA disclosed zero days to the affected companies instead of keeping them secret, as only then can those companies patch them.
The same could be the case for GPT-2. Although openness might be too much to ask of the CIA when it comes to zero days, it is a valid thing to ask of OpenAI when it comes to GPT-2. By publicly sharing the model, the wider AI community could learn, indeed, would have to learn, how to deal with the threat it poses. Instead, as with zero days, the world is now frightened of the potential dangers but has no tools to defend itself against them.
AI innovation is going to continue at a fast pace on many fronts. Since potentially malicious parties are going to obtain a working GPT-2 model sooner rather than later, keeping the technology secret only stifles our ability to invent a defense against it, and that is not the right way forward.