The Inverse Confidence Law
ChatGPT cited six fake cases in Mata v. Avianca with the same confidence it would have used for real ones. The verbal certainty of an LLM is roughly uncorrelated with whether the answer is true.
Analysis and argument on AI decision-making, institutional risk, and the gap between what systems promise and what they actually do.
Shumailov's Nature paper demonstrated model collapse in a closed loop. Ahrefs found that 74% of new web pages contain AI-generated text. The thought experiment is no longer hypothetical.
In February 2024, a tribunal ordered Air Canada to pay $812.02 over its chatbot's invented refund policy. That was the small lead indicator for everything California's AB 316 has now made law.
A 2025 CHI paper showed that human confidence aligns with AI confidence, and the alignment outlasts the tool. The error rate stays. The calibration moves.
In February 2024, a Hong Kong firm wired $25 million to a synthetic CFO over a deepfake video call. Every protective layer in the firm's controls had quietly become invalid before the call ever connected.
Epic's sepsis prediction model missed 67% of sepsis cases at Michigan Medicine. The audit method we built for AI cannot catch what the system never said.
Ted Chiang gave a metaphor in The New Yorker in 2024. I keep finding new ways to test it on the people I work with, and I keep failing to find a counterargument.
Air France 447 and a 2025 Polish endoscopy trial point at the same trap: the more reliable the system, the more catastrophic its absence becomes.
James Scott's argument from 1998, run at the speed of inference. The map is quietly rebuilding the territory inside every firm that runs an AI summarization layer.