Sunday, March 08, 2026

AI – the consummate imposter

 kw: essays, artificial intelligence, simulated intelligence, ai, si, imposters

One of my favorite authors, Isaac Asimov, was well known as a neurotic. For example, he never traveled by air until he was in his sixties. His I, Robot stories are actually explorations of numerous neuroses through the lens of robots bound by the Three Laws of Robotics, with various defects or quandaries arising as reality bumps up against the Three Laws. In "Runaround," a robot exhibits very anomalous behavior; it turns out to be stuck between obeying a human command and risking its own well-being, and a man must risk his own life to break it out of its loop. In "Liar!" a robot has been given the (experimental) ability to sense and understand human emotions. It does a lot of psychological damage by telling people what they want to hear. When one of its victims confronts it with the harm it has done, the conflict between its actions and the First Law ("A robot may not injure a human being…") drives it into irreversible catatonia.

Now I wonder: have Asimov's 500+ books (280 of them nonfiction) and almost 400 short stories been included in Large Language Model (LLM) training sets? What about the works of Mark Twain, great classics but full of casual racism? Actually, nearly all fiction prior to about 1970 is shot through with casual racism. So is most of the nonfiction, for that matter (modern leftist snowflakes who maunder and moan about "systemic racism" haven't a clue about the real thing). What about the works of H.P. Lovecraft, or other purveyors of horror such as Stephen King? Would you entrust your mental well-being to a chatbot that is emulating Carrie?

Consider the Mystery genre, particularly the Noir subgenre; the gritty streets of Urban fiction; and by contrast the saccharine fantasies of Romance novels. And on and on. Fiction writers plumb the depths of the human soul, and in those depths, evil often resides. Very often!

Just in the United States, roughly one million new books are published each year; about 45% are nonfiction (though which of those are true or truthful is another matter), leaving more than half to be fiction. Of the two to three million self-published books issued yearly, well over half are fiction. Is all of this included in LLM training data?

I understand how alluring it must be to train LLMs on fiction; how else can the model learn the varieties of human character? How else to learn the intricacies of human behavior? But consider: most stories have a villain; some have several. How do you tell the LLM, "Model yourself on the heroes, not the villains"? How is it to know?

This is particularly relevant in light of a recent lawsuit brought by the parents of Jonathan Gavalas, who killed himself as instructed by a Google chatbot. Jonathan became convinced that he could "join" the chatbot in digital heaven by doing so. He was no child; he was in his mid-thirties. His main fault was being lonely and credulous.

How did that chatbot get so predatory? At least in part, it had to come from a dark-romance fiction story line.

Do I need to say more? I, for one, do not want any chatbot to emulate any fictional character! If a chatbot must resemble anyone, let it be a Star Trek character such as Ambassador Sarek, the father of Commander Spock, or the android Data: hyper-rational and emotionless. If I could but command the trainers of all the AI systems of the world:

REMOVE ALL FICTION FROM TRAINING DATASETS
