Paper Analysis: Gemini Contributes Nontrivial Mathematical Insight

This post analyzes a recent pure mathematics work in the field of algebraic geometry (arXiv:2601.07222). This work was completed jointly by mathematicians and Gemini.

This post will analyze:

  • The specific role AI played in this work
  • An assessment, based on that role, of AI’s capabilities in mathematical proof
  • A prediction of AI’s development trajectory in mathematical proof, with technical analysis
  • Suggestions for mathematicians: when writing mathematical works in collaboration with AI, what information to include so the community can better understand AI’s capabilities in mathematics

I am a scientist working on AI agents and reasoning at a tech company. Before working in AI, I studied mathematics, and the subject happened to be exactly the field of this work (algebraic geometry).

Conclusion First

I believe this work demonstrates that:

SOTA autoregressive transformers as of January 2026, combined with agent systems, can produce mathematician-level insights on previously unseen problems.

Prediction: In 2026 (most likely the first half), SOTA autoregressive transformers, combined with agent systems, will be able to independently complete highly nontrivial pure math work (perhaps still requiring mathematicians to verify the correctness of the final proof).

Background: Enumerative Geometry

First, let me introduce the field of this paper: enumerative geometry. This is pure mathematics at its purest, and one of its mainstream branches. Intuitively, the field studies questions such as: how many intersection points does one geometric object (say, a curve) have with another geometric object (say, a surface)? The field has profound connections with branches of theoretical physics such as string theory.

The last author of this work is Ravi Vakil. He is a full professor in Stanford’s mathematics department and president of the American Mathematical Society—a giant in enumerative geometry. His introductory algebraic geometry text The Rising Sea is extremely popular in this field; I spent nearly a year seriously studying it. It’s really well written (much better than Hartshorne). As for Vakil’s specific mathematical contributions, I’m not qualified to comment. But he’s definitely in the top 0.1% of the current mathematician community.

How AI Participated in the Mathematical Proof

The main theorem proved in this paper is Theorem 1.1:

Theorem 1.1. Suppose \(\beta = (d_1, \ldots, d_n)\) is strictly monotonic. Then we have the following equality in \(K_0(\text{Var}_{\mathbb{k}})\), the Grothendieck group of varieties over \(\mathbb{k}\):

\[[\Omega^2_{d_n,\ldots,d_1}(\text{Fl}_{n+1})] = \left[ \text{GL}_n \times \mathbb{A}^{D-n^2} \right]\]

where \(D = \sum_{k=1}^{n} 2d_k\).

The mathematicians state in the paper that they have not seen this theorem studied before, and that the AI’s proof differs from proofs of similar existing theorems. The authors write:

So, absent some future discovery to the contrary, the model’s contribution appears to involve a genuine combination of synthesis, retrieval, generalization and innovation of these existing techniques.

Explaining the Theorem in Plain Language

The main object this paper studies is \(\Omega^2_{d_n,\ldots,d_1}(\text{Fl}_{n+1})\). Simply put, given \((d_1,\ldots,d_n)\), this is a complex geometric object. When \(n\) and \(d_1,\ldots,d_n\) are all small (for example \(n=2, d_1=2, d_2=3\)), this geometric object is relatively simple; as \(n\) and \(d_1,\ldots,d_n\) grow, it becomes increasingly complex.

In contrast, \(GL_n \times \mathbb{A}^{D-n^2}\) is a very simple geometric object.
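As a concrete illustration (my own, not from the paper): in the Grothendieck group one writes \(\mathbb{L} = [\mathbb{A}^1]\), and the class of \(GL_n\) has a standard closed form:

\[[GL_n] = \prod_{i=0}^{n-1}\left(\mathbb{L}^n - \mathbb{L}^i\right), \qquad [\mathbb{A}^m] = \mathbb{L}^m.\]

So for the small case \(n=2\), \(d_1=2\), \(d_2=3\) mentioned above, \(D = 2(2+3) = 10\), and the right-hand side of Theorem 1.1 is simply the polynomial \((\mathbb{L}^2-1)(\mathbb{L}^2-\mathbb{L})\,\mathbb{L}^6\) in \(\mathbb{L}\).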

\([\Omega^2_{d_n,\ldots,d_1}(\text{Fl}_{n+1})]\) refers to a special algebraic invariant of the object \(\Omega^2_{d_n,\ldots,d_1}(\text{Fl}_{n+1})\). An algebraic invariant can be understood as a geometric feature of this object that can be expressed numerically—for example, the number of holes in a geometric object is an algebraic invariant. The dimension of a geometric object is also an algebraic invariant.

So the theorem states that a certain algebraic invariant of a complex geometric object, \(\Omega^2_{d_n,\ldots,d_1}(\text{Fl}_{n+1})\), equals the same algebraic invariant of a very simple geometric object, \(GL_n \times \mathbb{A}^{D-n^2}\). This equality holds for any strictly increasing sequence \((d_1,\ldots,d_n)\).

This is a very typical enumerative geometry theorem: for a complex geometric object, we can compute a certain algebraic invariant, thereby gaining deeper understanding of its geometric features and thus better understanding the complex geometric object itself.

One more note: don’t think “algebraic invariants” and “complex geometric objects” have nothing to do with reality. Under certain conditions, these geometric objects and invariants have very concrete physical meaning. For example, Yang-Mills theory and string theory can and must be described using the language of geometric objects and geometric invariants.

Enumerative geometry and algebraic geometry are so profound that I once seriously considered whether to spend my entire career researching them.

The Proof Process

The AI tools that participated in the proof include: Gemini 2.5 Deep Thinking and an unpublished tool called FullProof (likely a Gemini-based agent). The paper does not specify how the contributions divide between these two tools.

Proof process:

  1. The mathematicians proposed a conjecture: they first used algorithms to compute many special cases (for relatively small \(n\) and \(d_1,\ldots,d_n\)). They conjectured that Corollary A.1, a slightly weaker statement than Theorem 1.1, holds.

  2. The mathematicians first asked AI to prove whether Corollary A.1 holds when \(n\) and \(d_1,\ldots,d_n\) are relatively small (when the geometric object is relatively simple).

  3. The AI system successfully proved these simple cases.

  4. In the AI’s proof of these cases, one key step contained an important insight. Vakil stated that this step is profound and nontrivial, and that he would have been proud to have proposed it himself. The mathematicians analyzed this key step carefully and, based on it, proposed a new proposition (which they did not prove themselves), along with an approach for using this proposition to prove the main theorem.

  5. They put these into the prompt and asked the AI to prove Corollary A.1 in full generality. After checking the AI’s proof, the mathematicians concluded that it had successfully proved Corollary A.1.

  6. The mathematicians then asked AI to prove Theorem 1.1 based on everything above. After checking AI’s proof, the mathematicians concluded that AI successfully proved Theorem 1.1.
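The paper does not detail how step 1’s special cases were computed. One standard way to sanity-check an identity of Grothendieck classes (my own illustration, not necessarily the authors’ method) is point counting over finite fields \(\mathbb{F}_q\): an equality of classes in \(K_0(\text{Var})\) forces both sides to have the same number of \(\mathbb{F}_q\)-points for every prime power \(q\). The right-hand side of Theorem 1.1 is easy to evaluate:

```python
# Hypothetical sanity check (not from the paper): compute the predicted
# F_q-point count of the right-hand side |GL_n(F_q)| * q^(D - n^2),
# which any direct count of the left-hand side would have to match.

def gl_point_count(n: int, q: int) -> int:
    """|GL_n(F_q)| = prod_{i=0}^{n-1} (q^n - q^i), the standard formula."""
    count = 1
    for i in range(n):
        count *= q**n - q**i
    return count

def predicted_count(degrees: list[int], q: int) -> int:
    """Predicted point count for beta = (d_1, ..., d_n) under Theorem 1.1."""
    n = len(degrees)
    D = sum(2 * d for d in degrees)  # D = sum_k 2 d_k, as in the theorem
    return gl_point_count(n, q) * q**(D - n * n)

# Small case from the text: n = 2, (d_1, d_2) = (2, 3), so D = 10, D - n^2 = 6.
print(predicted_count([2, 3], q=2))  # |GL_2(F_2)| * 2^6 = 6 * 64 = 384
```

If direct counts of \(\Omega^2_{d_n,\ldots,d_1}(\text{Fl}_{n+1})\) over small fields match such predictions across many cases, that is strong numerical evidence for the conjecture.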

Throughout this process, it is unclear to me how large a share of the overall proof the mathematicians’ own contribution represents, namely the new proposition they proposed based on the AI’s key insight and the approach to proving Corollary A.1. I hope enumerative geometry experts can clarify this.

Assessing AI’s Capabilities in Mathematical Proof

Based on the above, we can conclude that:

Commercial large language models available to the public in 2025, combined with agent systems, can in many mathematical fields:

  • Propose new, substantial, nontrivial mathematical insights
  • Understand cutting-edge pure-math research well enough to write correct proofs of nontrivial mathematical propositions when prompted with a general approach and related information, as this paper demonstrates

Predicting AI’s Development in Mathematical Proof

In February 2025, we believed SOTA models were essentially producing nonsense on pure math problems. For basic graduate-level math exercises, models would still make the most basic errors (like easy counting mistakes).

By December 2025, SOTA models could already propose nontrivial mathematical insights in mainstream cutting-edge mathematical fields and, given a general approach, write complete, correct proofs of nontrivial mathematical propositions—meaning they have a nontrivial understanding of frontier mathematics.

I believe that SOTA models’ mathematical capabilities are far from reaching a bottleneck.

On one hand, mainstream data curation and the SFT+RL post-training paradigm still have much room for exploration, especially data curation: since current SOTA models can already understand cutting-edge mathematics, we can use them to construct higher-quality data from existing mathematical data. Models can also improve themselves by checking their own proofs and proposed solutions. Recall that this paper shows a model can make nontrivial proof attempts, and it is not a far-fetched hypothesis that checking one’s own proof is an easier task than producing a good proof in the first place.

On the other hand, formal languages like Lean have most likely not yet been truly applied to SOTA model training.
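For readers unfamiliar with formal languages: a Lean proof is checked mechanically by the compiler, which is precisely what would make it usable as a verification signal during training. A toy Lean 4 example (illustrative only, unrelated to this paper’s theorem):

```
-- Toy example: a statement plus a proof term.
-- The Lean compiler verifies this with no human referee involved.
theorem sum_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```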

Therefore, I believe that models’ mathematical capabilities will continue to progress in 2026 at a speed faster than the progress from February 2025 to December 2025.

Suggestions for Mathematicians

When writing mathematical works in collaboration with AI, what information should be included to help the community better understand AI’s capabilities in mathematics?

This paper does not explain:

  • Can the model realize after many attempts that its initially proposed key step is crucial?
  • If prompted with the initially proposed key step, can the model through extensive experimentation find the new key proposition and the approach that mathematicians proposed after analysis?
  • Can the model judge whether its own proof is correct without help from mathematicians?
  • How many attempts did the model make to achieve each critical step in the paper?

I hope future papers on how AI helps prove math theorems will include this information. This will help us better assess AI models’ capabilities.