After days of speculation over hard-coded anti-goblin bias from OpenAI, the company had to release an official memo on 'Where the goblins came from'


PC Gamer | Justin Wagner | May 2, 2026

Summary

Creatures rule everything around me.

Tuesday, a report from Wired dug into a strange instruction patched into Codex CLI, an AI coding tool: "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query." I'm always whispering this to myself so I don't get kicked out of Dollar Tree again, but it's a weird thing for an AI model to have to be specifically told.

It was, apparently, distractingly prevalent: one X post quoted in that article notes the model frequently referred to bugs as "gremlins" and "goblins," and continued to do so following the update that was meant to curb the goblin talk. OpenAI has now broken its silence on the matter, publishing a blog post Thursday titled "Where the goblins came from."

"Model behavior is shaped by many small incentives," the post read. "In this case, one of those incentives came from training the model for the personality customization feature⁠, in particular the Nerdy personality. We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread."


The goblin talk was meant to stay a small quirk of Codex's "personality," which I suppose aimed to have it talk like that archetypal nerdy guy we all know who's constantly comparing things to pigeons and ogres. But as the blog notes, "reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them." In other words, GPT conversations even without the Nerdy personality had been infected with goblin talk.

The blog reckons the goblins are "a powerful example of how reward signals can shape model behavior in unexpected ways," and offers a command to lift the anti-goblin restriction if you like the quirk. If you're interested in learning about other AI aberrations, you can read about how ChatGPT will hype up gastrointestinal distress as "lo-fi" with a "DIY texture" or how California teenager Sam Nelson went to ChatGPT for drug advice and later died from an overdose.
