Weaponized Miscommunication

OpenAI’s unit distance proof is real. The language wrapped around it bent, word by word, in exactly the direction the incentives were pulling.

May 25, 2026

The proof is real. I want that fixed in place before anything else, because everything that follows is about language, not mathematics, and the two should never be confused. An internal OpenAI model produced a disproof of Erdős’s planar unit distance conjecture. The math was checked line by line by serious people, a Princeton mathematician has already pinned the exponent, and the result is genuinely beautiful. Nothing below touches any of that.

What follows is about the sentences wrapped around it. A correct proof and an accurate account of who produced it and how are two different artifacts, and only one of them was audited. The proof went through the most demanding verification the field has. The story about the proof went through a communications process. This is a piece about the second artifact, and about a specific thing language can do when an incentive is leaning on it.

Claim jumping

The move at the center of this is older than AI. Set two claims side by side, one certain and one not, phrase them as a single thing, and the certainty of the first bleeds into the second. The reader’s confidence is earned by the part that is true and then spent on the part that is not. Call it claim jumping: an unverified assertion rides across a gap on the back of a verified one and arrives looking like it walked.

Here the two claims separate cleanly once you look. One, the proof is correct. Two, the model produced it autonomously. The first is settled, by external mathematicians, to the highest standard the discipline has. The second is a claim about provenance, about where the proof came from and how much of it the machine did unaided, and provenance cannot be read off a finished proof. You check a proof by checking its steps. You cannot check autonomy by admiring the result, any more than you can confirm who cooked a meal by tasting it. The two claims need two different audits. Only one of them got an audit at all.

One result, two vocabularies

The most telling evidence is that OpenAI used different words in different rooms.

In the technical paper, the document mathematicians actually read, the careful sentence is that the problem was “solved in a completely automated fashion.” Automated is a process word. It claims the pipeline ran without a human in the loop, which is checkable, modest, and probably true, a statement about plumbing. The same paper calls the result, once, the “autonomously produced solution,” and otherwise stays in the cautious register.

In the announcement and the post on X, the rooms where the public and the press and the investors are standing, the word changes. There it becomes “the first time AI has autonomously solved a prominent open problem,” and “an AI system has autonomously resolved a longstanding open problem.” Autonomous is not a process word. It is an agency word. It does not say the pipeline ran by itself, it says the mind worked by itself. And it is the word that converts an impressive result into a historic one, because the milestone on offer is not “a proof was produced” but “a proof was produced by a machine thinking on its own.”

Watch which word went to which audience. The hedged, defensible term sits in the document that the few people equipped to challenge it will read. The strong, agency-laden term sits in the channels that reach the many who will not. That is not how a word drifts by accident. Accident scatters. This is sorted.

The words that flatter the first guess

Two words in the announcement do quiet work, and both lean the same way.

Autonomous we have covered: the reader hears a machine doing what a mathematician does, facing an open question, deciding what to ask, and answering it.

General-purpose is the subtler one. The announcement stresses that the proof came from a general-purpose model, “rather than from a system trained specifically for mathematics.” To a reader that lands as humility. No special advantage, no math machine, just a generalist that turned out to be brilliant. But a frontier general model is not a generalist in that sense. It is an expert in every subfield at once, including the exact one that cracked this: the algebraic number theory, the class field towers, the Golod-Shafarevich machinery. “Not trained specifically for math” does not mean it lacked the relevant expertise. It means it held the relevant expertise and everything else besides. The part the mathematicians found most striking, the jump from plane geometry into number theory, is a jump only for a human, who lives inside one specialty. For a system that holds every specialty at equal distance, nothing is cross-domain. The phrase general-purpose invites you to marvel at the very thing that made the result more available to it, not less.

Both words steer the reader toward the more impressive of two readings while the technical reality sits in the other one. That is the signature. Honest framing leaves the reader’s first guess to chance. This framing loads it.

Where the people went

Look at the grammar of “solved in a completely automated fashion.” It is a sentence with no people in it. A problem was solved. By a fashion. The humans who built the system, chose this problem out of a collection, wrote the instruction that produced the model’s prompt, ran the thing some number of times, and selected the run that worked, have all been lifted out of the sentence by the passive voice and a noun. Agentless constructions are how a claim sheds its qualifiers. You cannot ask “who decided” of a sentence that has been arranged to have no one deciding.

And the qualifier does exist. It is just kept in a separate room from the headline.

The caveat at the bottom

The same announcement that opens with a machine autonomously resolving an open problem closes, several hundred words later, by scoping the machine to three verbs, “search, suggest, and verify,” and reserving the rest to people: “People choose the problems that matter,” interpret the results, and decide what to pursue next.

Read those against the headline. The top of the page gives the machine the whole act. The bottom gives it three verbs and hands the rest, choosing the problem, interpreting the result, setting the direction, back to humans. That is the company conceding the exact distinction the headline erased, in the same document, in language calm enough that almost no one notices it cancels the lede. The strong claim sits at the top, where it sets the tone and the press pulls its quote. The correction sits at the bottom, in the part everyone scrolls past.

“In this case”

There is one more place the careful language and the marketed language part, and this one is quantitative. The announcement says the model was evaluated “on a collection of Erdős problems” and that “in this case it produced a proof.” These are two phrases easy to slide over, and together they mean: it was run on many problems, and this is the one that worked. Then, in a chart most readers will not stop to parse, the company reports investigating the success rate at varying amounts of test-time compute, with the per-attempt accuracy climbing only to around one half at the high end.

So the honest description of the event is closer to this: a system that succeeds on this problem some fraction of the time, given a great deal of compute, was run against a batch of problems chosen by people, and on one of them it produced a proof that turned out to be correct. “Autonomously solved a prominent open problem” is that same event with the fraction, the batch, and the people deleted. Nothing in the short version is false. Everything that would let you size the claim is gone.
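To see why the count of runs matters for sizing the claim, here is a minimal sketch of the arithmetic. It rests on assumptions that are mine, not the announcement's: that runs are independent, and that each run succeeds at the rough one-half rate from the chart. The run counts are hypothetical, since the actual number was not published, and the function name is invented for illustration.

```python
def chance_of_at_least_one_success(per_attempt_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent runs succeeds.

    Assumes every run has the same success rate and that runs do not
    influence one another. Both are simplifying assumptions.
    """
    if not 0.0 <= per_attempt_rate <= 1.0:
        raise ValueError("per_attempt_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # All runs fail with probability (1 - rate) ** attempts;
    # the complement is the chance that at least one succeeds.
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


if __name__ == "__main__":
    # Hypothetical run counts; 0.5 is the approximate high-end rate from the chart.
    for attempts in (1, 2, 4, 8):
        print(attempts, round(chance_of_at_least_one_success(0.5, attempts), 4))
```

Under those assumptions, a success that looks like a coin flip on one run becomes close to certain after a handful of runs. That is why the same proof supports very different readings depending on a number the announcement does not give.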

Mistake, drift, or design

None of this proves intent, and I will not pretend it does. There is an honest reading in which every step is good faith. Researchers who have just watched their system do something remarkable are excited, and excitement reaches for the bigger word. Automated and autonomous are near synonyms in ordinary speech, and the slide between them is the kind of thing that happens when smart people write fast about work they are proud of. The paper itself is reasonably forthcoming. It publishes the prompt, the model’s raw output, the success-rate chart, and the phrase “collection of problems.” A company setting out to deceive does not usually hand you the evidence to catch it. On the good-faith reading, the technical document is honest and the announcement simply ran hot.

But you cannot judge the language without the slope it sits on, and the slope is not level. Three things press on every word.

The first is that capability is the product. For a company at OpenAI’s valuation, a demonstration that the model reasons autonomously at the frontier of human knowledge is not a nice result, it is the thesis of the business, the thing the next round and the next enterprise contract are priced against. The word autonomous is worth money in a way the word automated is not.

The second is the race. OpenAI is not announcing into a vacuum. It is announcing against competitors making their own claims about machine reasoning, and “first time an AI has autonomously solved a prominent open problem” is a flag planted to be first. Being first is a linguistic achievement here as much as a technical one, because the superlative lives in the framing, not in the theorem.

The third is the record. This is not OpenAI’s first Erdős announcement. In October 2025 the company said its model had solved ten previously unsolved Erdős problems. It had not. The model had surfaced solutions that already existed in the literature, and the claim was walked back after a mathematician called the framing a serious misrepresentation. To OpenAI’s real credit, that same critic is among those who verified the unit distance proof, which is exactly why correctness this time was nailed down so hard. But look at what got reinforced and what did not. The thing that embarrassed them last time was correctness, so this time correctness was over-built, externally checked, bulletproof. The thing that makes them famous is autonomy, and autonomy was left exactly as unaudited as it was before. You reinforce what you are afraid of. You leave alone what you are selling.

Weaponized miscommunication

Put it together and the useful description is not “they lied.” It is that a vocabulary was placed under load and bent, every time, in the direction the load was pulling. The cautious word went to the careful readers and the strong word went to the crowd. The flattering reading was loaded and the deflating one was left to the small print. The human contribution was deleted by grammar and then quietly restored at the bottom, where it could not hurt the headline. The number that would size the claim was published in a form almost no one can read. No single one of these is a lie. Together they are a machine for producing a belief the evidence does not support, and the belief they produce is, every single time, the one worth the most money.

That is what makes it potentially weaponized rather than merely sloppy. Sloppiness is random, and its errors point every direction. This points one direction. When every ambiguity resolves in favor of the same party, ambiguity is not the explanation. You do not need a liar for this. You need an incentive and a language soft enough to deform under it, and both were present in abundance.

The remedy is the same small thing it has been all along, and the smallness is the tell. Publish the human side: the instruction that generated the prompt, the count of runs, the protocol for choosing which problem and which output to show. One short document would turn “autonomously resolved an open problem” into a claim anyone could check. They released the proof, the prompt, the chain of thought, and a chart, and not the single page that would let you weigh the word the entire milestone rests on.

The math is honest. It earned its certainty the hard way, in front of the harshest readers in the field. The language around it was written to borrow that certainty for a claim that never went through the same door. So watch the words, not just the theorem. The theorem cannot lie to you. The sentences can, and these were built to.

shane berarducci
