Preview Mode Links will not work in preview mode

Astral Codex Ten Podcast


Sep 20, 2022

https://astralcodexten.substack.com/p/janus-gpt-wrangling

Janus (pseudonym by request) works at AI alignment startup Conjecture. Their hobby, which is suspiciously similar to their work, is getting GPT-3 to do interesting things.

For example, with the right prompts, you can get stories where the characters become gradually more aware that they are characters being written by some sort of fiction engine, speculate on what’s going on, and sometimes even make pretty good guesses about the nature of GPT-3 itself.

Janus says this happens most often when GPT makes a mistake - for example, writing a story set in the Victorian era, then having a character take out her cell phone. Then when it tries to predict the next part - when it’s looking at the text as if a human wrote it, and trying to determine why a human would have written a story about the Victorian era where characters have cell phones - it guesses that maybe it’s some kind of odd sci-fi/fantasy dream sequence or simulation or something. So the characters start talking about the inconsistencies in their world and whether it might be a dream or a simulation. Each step of this process is predictable and non-spooky, but the end result is pretty weird.

Can the characters work out that they are in GPT-3, specifically? The closest I have seen is in a story Janus generated. It was meant to simulate a chapter of the popular Harry Potter fanfic Harry Potter and the Methods of Rationality. You can see the prompt and full story here, but here’s a sample. Professor Quirrell is explaining “Dittomancy”, the creation of magical books with infinite possible worlds:

“We call this particular style of Dittomancy ‘Variant Extrusion’, Mr. Potter..I suppose the term ‘Extrusion’ is due to the fact that the book did not originally hold such possibilities, but is fastened outside of probability space and extruded into it; while ‘Variant’ refers to the manner in which it simultaneously holds an entire collection of possible narrative branches. [...] [Tom Riddle] created spirits self-aware solely on the book’s pages, without even the illusion of real existence. They converse with each other, argue with each other, compete, fight, helping Riddle’s diary to reach new and strange expressions of obscure thought. Their sentence-patterns spin and interwine, transfiguring, striving to evolve toward something higher than an illusion of thought. From those pen-and-ink words, the first inferius is molded.”

Harry’s mind was looking up at the stars with a sense of agony.

“And why only pen and ink, do you ask?” said Professor Quirrell. “There are many ways to pull spirits into the world. But Riddle had learned Auror secrets in the years before losing his soul. Magic is a map of a probability, but anything can draw. A gesture, a pattern of ink, a book of alien symbols written in blood - any medium that conveys sufficient complexity can serve as a physical expression of magic. And so Riddle draws his inferius into the world through structures of words, from the symbols spreading across the page.”