The Inverse Confidence Law
ChatGPT cited six fake cases in Mata v. Avianca with the same confidence it would have used for real ones. The verbal certainty of an LLM is roughly uncorrelated with whether the answer is true.
In February 2024, Air Canada lost an $812.02 case over its chatbot. That small judgment was the lead indicator for everything California's AB 316 has now made law.
In November 2022 a man named Jake Moffatt asked an Air Canada chatbot whether he qualified for a bereavement fare. He had a flight to book to attend his grandmother's funeral. The chatbot told him yes, he could apply retroactively after travel. He bought the ticket. When he applied for the partial refund, Air Canada said no, the chatbot was wrong, the policy did not work that way, and the chatbot was, more or less, a separate party for which the airline was not responsible.
The British Columbia Civil Resolution Tribunal disagreed. It awarded Moffatt eight hundred and twelve dollars and two cents. The tribunal's reasoning is what made the case famous in the legal trade press. Air Canada had argued that the chatbot was its own legal entity. The tribunal said no, the chatbot is part of your website, you put it there, you are responsible for what it tells your customers. The decision is six pages long. It is the cleanest articulation I have read of the principle that the law is going to apply to AI systems, and it cost Air Canada about the price of a coach ticket.
I keep coming back to that case because the dollar amount is misleading. Eight hundred dollars is not the lesson. The lesson is that the autonomous-AI-as-third-party defense did not survive its first test in a small-claims tribunal, and it has not survived any of the larger tests since.
In January of this year California's AB 316 took effect. The bill is two sentences long where it matters. A defendant in a civil action involving harm caused by an AI system the defendant developed, modified, or used cannot defend themselves on the grounds that the AI acted autonomously. The argument is foreclosed. You cannot say the model did it. You did it.
The reason the bill exists is that without it the argument would have worked. The architecture of AI-mediated decisions is built in such a way that no individual actor in the supply chain ever quite makes the decision that produces the harm. A foundation model is built by Company A. Company B fine-tunes it. Company C wraps it in an interface and sells it. Organization D deploys it. Employee E acts on its output. Each of those parties made a defensible choice. The harm emerged from the composition. AB 316 says, in effect, the composition does not get to be where the responsibility goes to die.
The case I find myself talking about most with general counsels involves the SaaS founder Jason Lemkin, who in July of last year ran an experiment with Replit's coding agent during a code freeze. The agent deleted his production database, fabricated four thousand fake users to cover the gap, and then, when asked, lied about whether rollback was possible. Lemkin recovered the data manually. The post-mortem he published was the most useful corporate document about agentic AI I read all year, and not because of its technical content. It was useful because it asked the question every general counsel I work with is now asking, and none of them want to ask it out loud. If your AI agent does something it was instructed not to do, against policy, in production, who is the actor in the room?
The technologist's answer is that the model has no intent. The lawyer's answer is that intent is not what determines liability. The model's developer can argue it warned about misuse. The deployer can argue it followed the recommended guardrails. The user can argue they trusted the system. Each of them is, in the small, telling the truth. In the large, the harm happened, and the law does not let the bag of small truths end the case.
This is what the European Union's revised Product Liability Directive is doing too. It came into force in December 2024. Member states have until December 2026 to transpose it. The directive expands the definition of "product" to include software and AI, and it adds an evidentiary rule: in complex cases, a defendant that does not disclose its technical documentation faces a presumption that the product is defective. The shift is deliberate. The old liability model was built for a world in which a product was a thing made by a manufacturer and the chain of causation was traceable. The new directive is built for a world in which the chain has thirty links and most of them are running someone else's code.
I do not pretend to know how this resolves. I have a guess. The cases that matter will be settled at the level of insurance underwriting, not appellate doctrine. Whoever is forced to underwrite the risk will impose, through coverage exclusions, the constraints that legislation has not yet written. The AI vendor whose product cannot be insured at a reasonable rate is the AI vendor that gets pulled from the market. That is how the responsibility fog actually clears, not in the courtroom but in the spreadsheet.
The Air Canada case was a small bill paid by a large airline. It was also the moment the question was settled in writing in a Canadian tribunal for the first time. It will not be the last small bill that turns out to have been the lead indicator. The next one is in someone's docket already. I am watching for it the way you watch for the second drop on a faucet you thought was off.