For years, we’ve been told that Large Language Models can’t actually reason—they just autocomplete plausible-sounding nonsense using vector math.
That argument just died.
In a stunning development this week, a collaborative AI system autonomously solved Erdős Problem #397, a decades-old number theory conjecture posed by the legendary Paul Erdős. This wasn’t a lucky guess. The proof was generated by GPT-5.2 Pro, formally verified by a specialized agent named Aristotle, and, critically, accepted by Fields Medalist Terence Tao.
This breaks the “hallucination barrier.” We are witnessing the shift from AI that approximates truth to AI that proves it.
The Winning Team: GPT-5.2 + Aristotle

What makes this breakthrough fascinating is the architecture. It wasn’t a single “God Model” that did the work. It was a neuro-symbolic relay race.
| Agent | Role | The “Human” Equivalent |
|---|---|---|
| GPT-5.2 Pro | The Intuition Engine. It proposed a radical new family of counter-examples to the conjecture, refuting the original premise. | The visionary professor who sketches a brilliant idea on a napkin. |
| Aristotle | The Verifier. Built by Harmonic (co-founded by Robinhood’s Vlad Tenev), this agent took the raw intuition and translated it into Lean, a formal proof language. | The meticulous grad student who checks every line of logic for months. |
The Problem Itself (#397)

Without getting too deep into the weeds, Erdős #397 asks whether a specific equation between products of central binomial coefficients has only finitely many solutions. It’s the kind of problem that requires exploring an infinite search space, something human intuition struggles with but which AI, when properly guided, devours.
GPT-5.2 didn’t just find a solution; it found a structural flaw in the conjecture itself. Then Aristotle formalized that flaw in Lean, so the refutation was machine-checked rather than merely plausible.
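For readers who have not met them, the central binomial coefficients are the numbers C(2n, n). The short Python sketch below only illustrates these objects and how quickly they grow; it is not the problem statement and not the AI's construction, and the helper name `central_binomial` is ours.

```python
from math import comb


def central_binomial(n: int) -> int:
    """Return the central binomial coefficient C(2n, n)."""
    return comb(2 * n, n)


# First few values: C(0, 0), C(2, 1), C(4, 2), ...
first_values = [central_binomial(n) for n in range(8)]
print(first_values)  # [1, 2, 6, 20, 70, 252, 924, 3432]

# Sanity check of a standard identity: C(2n, n) is the sum of C(n, k)^2.
for n in range(8):
    assert central_binomial(n) == sum(comb(n, k) ** 2 for k in range(n + 1))
```

The values roughly quadruple at each step, which gives a feel for why brute-force search over combinations of them runs out of road quickly.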
Why Terence Tao Matters
When Terence Tao—arguably the greatest living mathematician—speaks, the industry listens.
Tao confirmed the proof late yesterday, noting that while these problems are “lowest-hanging fruit” in the grand scheme of mathematics, the workflow is the breakthrough.
> “The AI acted as a perfect toolchain: retrieval, creative rewriting, and formal verification. It didn’t ‘understand’ in the philosophical sense, but it produced a result that is undeniably true.” — Terence Tao
This is the validation we’ve been waiting for. It confirms that Formal Verification (using languages like Lean or Coq) is the missing link that turns “AI reasoning” from a marketing buzzword into an engineering reality.
The Hidden Player: Vlad Tenev’s “Harmonic”
You might be surprised to see Vlad Tenev’s name here. The Robinhood CEO quietly co-founded Harmonic with the specific goal of “mathematical superintelligence” (MSI).
While OpenAI and Google chase AGI through massive text transformers, Harmonic bet on Math-First AI. Their agent, Aristotle, is designed to be “hallucination-free” because its final output is not free-form text but Lean code that must pass the proof checker. If a proof attempt fails to check in the Lean theorem prover, the AI knows that attempt is wrong or incomplete and tries again.
This “Self-Correction Loop” is why Aristotle could take GPT-5.2’s messy creative output and refine it into a machine-checked proof.
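The propose-then-verify pattern can be sketched in a few lines of Python. This is a toy under loud assumptions, not Harmonic's or OpenAI's implementation: the "intuition" layer is a hard-coded list of guesses (`propose_candidates`, hypothetical), and the "verifier" is an exact primality check standing in for a proof checker. The toy claim being refuted is that n*n + n + 41 is prime for every n.

```python
from typing import Callable, Iterable, Optional


def is_prime(n: int) -> bool:
    """Exact primality test by trial division (slow but never wrong)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def propose_candidates() -> Iterable[int]:
    """Stand-in for the intuition layer: emits guesses, most of them wrong."""
    # Hypothetical guesses; a real system would sample these from a model.
    return [3, 10, 25, 40, 41]


def verify_counterexample(n: int) -> bool:
    """Stand-in for the verifier: does n refute the toy claim
    'n*n + n + 41 is prime for every n >= 0'?"""
    return not is_prime(n * n + n + 41)


def propose_and_verify(
    candidates: Iterable[int], verifier: Callable[[int], bool]
) -> Optional[int]:
    """Return the first candidate the verifier accepts, or None."""
    for attempt, candidate in enumerate(candidates, start=1):
        if verifier(candidate):
            print(f"attempt {attempt}: {candidate} verified")
            return candidate
        print(f"attempt {attempt}: {candidate} rejected, trying again")
    return None


result = propose_and_verify(propose_candidates(), verify_counterexample)
print(result)  # 40, since 40*40 + 40 + 41 = 1681 = 41 * 41
```

The design point is that the proposer is allowed to be wrong as often as it likes, because nothing leaves the loop until the exact check accepts it.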
What This Means For You
This isn’t just about math. This architecture—Intuition (LLM) + Verification (Symbolic Solver)—is the blueprint for reliable agents in every industry.
1. Software Engineering: Instead of just generating code, future agents will mathematically prove that a function cannot crash before you even run it.
2. Finance/Crypto: Smart contracts will be auto-verified against hacks, not just audited by humans.
3. Science: We are seeing early signs of this in physics too, with Google DeepMind using machine learning to find new families of unstable singularities in fluid equations closely related to Navier-Stokes (whose regularity question is a Millennium Prize problem).
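To make the software engineering point concrete, here is a minimal sketch in Lean 4 syntax of a function that cannot be called in a way that crashes. The name `safeHead` is our own illustration, not something from the Erdős proof or from any agent's output: it takes the first element of a list, but only if the caller also hands over a proof that the list is non-empty.

```lean
-- A head function that demands a proof that the list is non-empty.
-- No branch is written for the empty list: that case contradicts the proof,
-- so Lean accepts the definition as total.
def safeHead : (xs : List Nat) → xs ≠ [] → Nat
  | x :: _, _ => x

-- The caller must supply the proof; `List.cons_ne_nil` provides it here.
example : safeHead [3, 1, 4] (List.cons_ne_nil _ _) = 3 := rfl
```

Calling `safeHead` on an empty list is not a runtime error waiting to happen; it is a program that cannot be written, because no proof of `[] ≠ []` exists.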
The Bottom Line
We have crossed a threshold. AI is no longer limited to regurgitating the internet. It is now contributing new knowledge to the corpus of human understanding.
The barrier to AGI wasn’t creativity; it was reliability. With the marriage of LLMs and Formal Verification, that barrier is crumbling.
FAQ
Did the AI really “solve” it, or just find it in the training data?
It solved it. Erdős Problem #397 was an open conjecture with no known solution in the training set. The AI generated a novel counter-example.
What is “Lean”?
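Lean is a programming language and theorem prover. Unlike Python or C++, code in Lean can state mathematical theorems and their proofs. If the code compiles without placeholders such as `sorry` or extra axioms, Lean’s kernel has checked every step, so the theorem is correct as formally stated.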
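A tiny illustration in Lean 4 syntax, unrelated to the Erdős proof itself (the theorem names are ours):

```lean
-- A statement and its proof; `rfl` asks Lean to check it by computation.
theorem two_add_two : 2 + 2 = 4 := rfl

-- A statement about every natural number, proved from a core library lemma.
theorem zero_add_example (n : Nat) : 0 + n = n := Nat.zero_add n

-- A false statement is rejected by the checker, so this line would not compile:
-- theorem bad : 2 + 2 = 5 := rfl
```

Note that the guarantee covers only what the theorem statement says. Checking that the formal statement matches the intended mathematics is still a human job.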
Is Harmonic owned by OpenAI?
No. Harmonic is an independent startup co-founded by Vlad Tenev (Robinhood) and Tudor Achim. In this case, GPT-5.2 Pro supplied the “intuition” layer, and Harmonic’s Aristotle agent handled the formalization in Lean.
