The Ancestor's Error
Shumailov's Nature paper proved the mechanism in a closed loop. Ahrefs found 74% of new web pages contained AI text. The thought experiment is no longer hypothetical.
Croesus asked the Pythia whether to attack Persia. The Pythia gave him a true answer that produced a catastrophic decision. We are building oracles at scale without an interpretive tradition to read them.
When Croesus, king of Lydia, asked the Oracle at Delphi whether he should attack Persia, the Pythia replied that if he did, a great empire would be destroyed. Croesus attacked. The empire that was destroyed was his own.
The Oracle had not lied. Every word of the prophecy was accurate. The failure lay in the form of the answer, in the form's invitation to read it as endorsement. The Oracle's output was factually correct and structurally useless. It contained truth without the conditions under which the truth applied. The context, the qualifications, the categorical boundaries that would have made the truth actionable were absent. Their absence was a feature of the medium, not an accident.
We have built a new generation of oracles, and their output has the same property.
I have been thinking about this since June 2023, when a federal judge in Manhattan published a sanctions order against two lawyers who had cited six fake cases generated by ChatGPT. I have written about Mata v. Avianca before, in another piece. What I want to say here about it is different. The lawyers' failure was not a failure of due diligence. It was a failure of interpretation. The model had given them a true-shaped object. The shape was indistinguishable from the shape of an actual citation. The lawyers had no Greek interpretive tradition for reading a true-shaped object that was hollow inside.
The Greeks had the tradition. They had centuries of practice with it.
The Oracle at Delphi was not a simple information service. It was an architectural system with very specific properties. The Pythia sat over a fissure in the earth, inhaled the vapors rising from it, and spoke in fragments. Temple priests interpreted the fragments and composed them into the hexameter verse delivered to the petitioner. The petitioner then interpreted the verse in the context of their own situation. At each transformation, context was added and context was lost. The priests had no knowledge of the petitioner's military situation. The petitioner had no knowledge of the Pythia's unedited utterance. The system was designed so that no single participant held both the raw signal and the applied context, and the petitioner knew this going in.
The Greeks developed an elaborate set of practices for reading oracular pronouncements. Multiple interpretations had to be considered. The most flattering reading had to be examined for self-deception. Petitioners often posed two questions, the second designed to test the first. The petitioner brought a sceptic. They brought a record of past prophecies and how those prophecies had read in retrospect. The interpretive tradition was, by the time of Herodotus, a recognizable craft. It had taken centuries to develop, and its very existence acknowledged that the Oracle's truth and useful truth were different things, and that the gap was the petitioner's responsibility to bridge.
We have not developed an equivalent tradition for AI systems. We treat AI output as information to be verified, not as oracular utterance to be interpreted. We check whether it is true. We do not ask whether its truth, in the form given, supports the judgment we need to make. We have verification workflows. We do not have interpretive practices.
There is an important difference between these two stances.
Verification asks whether the system got the facts right. It is a binary check, pass or fail, one fact at a time.
Interpretation asks something else. What kind of object is this output? What cognitive schema does its form activate in the reader? What does the form imply about its own conditions of applicability? What does the output leave out, and what does its leaving-out imply about the reader's responsibility? Interpretation is a habit of mind, not a workflow.
Verification is a check on content. The Oracle's dilemma is a problem of form. You can verify that "history of cardiac events" is factually accurate and still miss that this particular truth, in this particular phrasing, will produce a clinical decision the full record does not support. You can verify that a generated case citation matches the format of a real citation and still miss that the case is fabricated. Accuracy and decision quality are orthogonal dimensions, and measuring one tells you nothing about the other.
What the Greeks understood about oracular systems we have not yet learned about AI ones. The output is a thing in itself, distinct from the truth it may or may not contain. The form has been engineered, by the training process, to read like an answer. It will read like one whether or not it is one. The question of whether to act on it requires reading the form, not just verifying the content.
I think the interpretive tradition, when we develop it, will look something like this. A discipline of asking, before acting on AI output, what shape the output has taken and why. Whether the question I asked was structured in a way the system was likely to confirm. Whether the absence of a qualification means no qualification exists, or only that the system omitted one. Whether the form's confidence is doing argumentative work the content has not earned. Whether the citation is a citation or a citation-shaped object.
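To make the discipline concrete, here is a minimal sketch of those questions encoded as a pre-action checklist. Everything in it is hypothetical, not an existing tool or API; the point is only that the structure forces each question to be answered before the output is acted on, rather than automating any answer.

```python
from dataclasses import dataclass, field

# The interpretive questions from the paragraph above. None of them can
# be answered by the system that produced the output; they belong to the
# reader.
QUESTIONS = [
    "What shape has the output taken, and why?",
    "Was my question structured so the system was likely to confirm it?",
    "Does an absent qualification mean none exists, or that one was omitted?",
    "Is the form's confidence doing work the content has not earned?",
    "Is the citation a citation, or a citation-shaped object?",
]

@dataclass
class Reading:
    """One act of reading a piece of AI output before acting on it."""
    output: str
    answers: dict = field(default_factory=dict)

    def answer(self, question: str, note: str) -> None:
        # Record the reader's judgment on one question.
        if question not in QUESTIONS:
            raise ValueError(f"unknown question: {question!r}")
        self.answers[question] = note

    def ready_to_act(self) -> bool:
        # Acting is permitted only once every question has an answer.
        return all(q in self.answers for q in QUESTIONS)
```

A reader would create a `Reading` for each output, record a note against each question, and treat `ready_to_act()` as the gate between receiving an answer and deciding on it. The code adds nothing the checklist on paper would not; it only makes skipping a question visible.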
Croesus asked the right question. He received a true answer. He made a catastrophic decision. The Oracle was not stupid. Croesus was not stupid. The system was the failure, the architecture of transmission between truth and judgment. We are rebuilding that architecture at scale and we have not yet asked, with any seriousness, whether we have solved the problem the Greeks identified twenty-five centuries ago.
We have not.