GPT-2, trained on a dataset of 8 million human-curated web pages, writes text by predicting the next word based on all the words that precede it. Give GPT-2 a line or two and it will get started on any subject at all: the training dataset, drawn from outbound links on the social network Reddit, is rich enough for that.
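The word-by-word generation described above can be illustrated with a toy sketch. This is not GPT-2 or OpenAI's code: the real model is a large neural network, while the stand-in here simply counts which word followed each short run of words in a tiny made-up corpus. The function names (`train`, `predict_next`, `generate`), the `max_context` parameter and the sample corpus are all assumptions made for illustration. What carries over is the loop: predict one word from the text so far, append it, and repeat.

```python
from collections import Counter, defaultdict


def train(corpus_words, max_context=3):
    """Count which word follows each context of up to max_context words.

    A toy stand-in for a trained language model (assumption: GPT-2 itself
    uses a neural network, not lookup tables).
    """
    counts = defaultdict(Counter)
    for i in range(1, len(corpus_words)):
        for n in range(1, max_context + 1):
            if i - n < 0:
                break
            context = tuple(corpus_words[i - n:i])
            counts[context][corpus_words[i]] += 1
    return counts


def predict_next(counts, words, max_context=3):
    """Return the most frequent continuation of the longest known suffix."""
    for n in range(min(max_context, len(words)), 0, -1):
        context = tuple(words[-n:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None  # nothing in the toy corpus matches this context


def generate(counts, prompt, length=5, max_context=3):
    """Extend the prompt one word at a time, feeding each word back in."""
    words = prompt.split()
    for _ in range(length):
        next_word = predict_next(counts, words, max_context)
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


# Hypothetical miniature corpus, invented for this example.
corpus = "the scientist found the unicorns and the unicorns speak english".split()
counts = train(corpus)
print(generate(counts, "the scientist", length=3))
```

With so small a corpus the sketch can only replay what it has seen; the variety in GPT-2's output comes from the scale of its training data and model, which this example does not attempt to reproduce.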

“The model is chameleon-like — it adapts to the style and content of the conditioning text,” OpenAI researchers wrote in a blog post. The sample in the post is a surprisingly coherent story about a herd of unicorns discovered by a scientist in the Andes. Given two sentences about the find and the unicorns’ ability to speak perfect English, the machine produced what could almost be a story from any mainstream news site.

It gave the scientist a name, Dr Jorge Perez from the University of La Paz (there’s no school with that exact name), produced quotes from him and expanded on the unicorns’ appearance (“silver-white”) and language abilities (they speak a dialect of their own plus “fairly regular English”).

Of course, a human editor might have had trouble with the contradiction in this sentence: “Some believe that perhaps the creatures were created when a human and a unicorn met each other in a time before human civilisation.”

The model produced this result on the 10th attempt: the more it exercises on a given subject, the more confident and coherent its output. Examples of what GPT-2 “writes” unprompted, which OpenAI released on GitHub together with a weaker version of the model, range from slightly surreal to downright bizarre.

They include a chronology of a tax scandal involving the late Senator John McCain: “Alaska Senator Lisa Murkowski became the first ‘serious’ name in the national political media drama to call for McCain to co-operate with Senate colleagues by either disclosing his tax returns or co-operating with what she called the ‘full force’ of the IRS, DOJ, FBI, etc.”

Or take this bit of a technology review: “The legendary Precision Bass brings a massive bass response and fun, smoky tone to the world! These versatile mid-bass speakers deliver incredible low-frequency extension, 32’ high-frequency response — about two-thirds of a speaker.”

Or what looks like a reported story from Bangladesh: “Dhaka — Thousands of people marched through Dhaka on Thursday, many decked in the colours of the semi-arid northern region marked by the drought-stricken region’s tallest mountains.”