LONDON — Google DeepMind has introduced AlphaProof and AlphaGeometry 2, two AI systems built to tackle some of the most challenging reasoning problems in mathematics. Developed with aspirations toward artificial general intelligence (AGI), the systems have performed at a level comparable to strong human competitors in one of the most prestigious mathematics competitions.
This year, AlphaProof and AlphaGeometry 2 tackled problems from the International Mathematical Olympiad (IMO), achieving a combined score equivalent to a silver medal, a historic feat for AI. The IMO, known for its rigorous problems in algebra, combinatorics, geometry, and number theory, has been a benchmark of mathematical brilliance since 1959 and is increasingly recognized as a formidable challenge in AI and machine learning.
The solutions produced by these models were evaluated by respected mathematicians, including Professor Sir Timothy Gowers, an IMO gold medalist and Fields Medal laureate, and Dr. Joseph Myers, an esteemed figure in the mathematical community and IMO committee leader. Professor Gowers remarked on the sophistication of the models, commending their capability to generate solutions that push the boundaries of AI's mathematical understanding.
For the competition problems, each question was first formalized into structured mathematical language so that AlphaProof and AlphaGeometry 2 could interpret and solve it. AlphaProof solved two algebra problems and a difficult number theory problem that had stumped many human contestants. AlphaGeometry 2, meanwhile, solved an intricate geometry problem, drawing on its improved speed and knowledge-sharing capabilities and at one point solving a complex problem in under 20 seconds. The silver-medal-level result places the AI close to the highest accolades awarded to human competitors.
AlphaProof combines the formal language Lean with the self-learning AlphaZero algorithm to build formal proofs and verify them. The model is also trained to translate natural language math problems into formal statements, creating a vast library of problems that span a range of difficulty. By iteratively generating proof candidates and verifying them, AlphaProof has honed its ability to address even the most demanding questions.
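To give a sense of what a formal statement and a machine-checked proof look like in Lean, here is a toy illustration. It is not an IMO problem and not output from AlphaProof; the theorem name is made up for this example, and the snippet assumes Lean 4 with only its core library.

```lean
-- Toy example (hypothetical name; not AlphaProof output).
-- Formal statement: for all natural numbers a and b, a + b = b + a.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  -- Close the goal with the core-library lemma Nat.add_comm.
  exact Nat.add_comm a b
```

If the proof were wrong or incomplete, Lean would reject it, which is what makes it possible to check a generated proof candidate automatically.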
On the geometry front, AlphaGeometry 2 is a neuro-symbolic hybrid trained extensively on synthetic data, allowing it to outperform its predecessor in speed and accuracy. Its advanced symbolic engine and adaptive search mechanisms allow AlphaGeometry 2 to solve historical IMO geometry problems with an 83% success rate, a substantial improvement over the previous model.
In an experimental effort, Google DeepMind also tested natural language reasoning capabilities with its Gemini system, revealing promise in AI's potential to interpret and solve mathematical challenges without formal language translation.
With continued efforts in developing AI-driven mathematical reasoning tools, DeepMind envisions a future where AI can assist mathematicians by exploring complex hypotheses, proposing innovative solutions, and automating rigorous proof processes.