AI at the International Mathematical Olympiad: How AlphaProof and AlphaGeometry 2 Achieved Silver-Medal Standard

Mathematical reasoning is central to human cognition and underpins progress in science and technology. As researchers work toward artificial general intelligence that can match human cognition, equipping AI systems with advanced mathematical reasoning becomes essential. Current AI systems can handle basic math problems, but they struggle with the multi-step reasoning that competition-level problems in areas such as algebra and geometry require. Recent work by Google DeepMind shows promising progress on this front, demonstrated on the problems of the International Mathematical Olympiad (IMO) 2024.

The International Mathematical Olympiad, first held in 1959, is the oldest and most prestigious mathematics competition for high school students worldwide, with problems drawn from algebra, combinatorics, geometry, and number theory. Each year, teams of young mathematicians compete to solve six highly challenging problems. In 2024, Google DeepMind applied two AI systems, AlphaProof and AlphaGeometry 2, to that year's IMO problems. Together, the systems solved four of the six problems, performing at the level of a silver medalist.

AlphaProof is an AI system designed to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with AlphaZero, the reinforcement learning algorithm known for mastering chess, shogi, and Go. To produce training data, a Gemini model fine-tuned for the task translates natural language problem statements into formal ones, creating a library of formal problems of varying difficulty. Given a problem, AlphaProof generates candidate solutions and then proves or disproves them by searching over possible proof steps in Lean. Because Lean checks every step, any proof the system finds is formally verified.
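
To make "formal language" concrete, here is a toy Lean 4 statement with its proof. It is only an illustration of what a formalized statement looks like and is not taken from AlphaProof or from the IMO; the theorem name toy_add_comm is made up for this example, while Nat.add_comm is a lemma from Lean's core library.

```lean
-- Toy formal statement: addition of natural numbers is commutative.
-- Lean accepts the theorem only if the proof term actually checks.
theorem toy_add_comm (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

An IMO problem formalized this way is far longer, but the principle is the same: the statement is unambiguous, and a proof either passes the checker or it does not.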

AlphaGeometry 2 is an improved version of AlphaGeometry, designed to solve geometry problems more reliably and efficiently. It is a neuro-symbolic hybrid that pairs a neural language model with a symbolic engine. The language model proposes new geometric constructions, such as auxiliary points or lines, while the symbolic engine applies formal logic to derive a proof from them. In AlphaGeometry 2 the language model is based on Gemini, and the symbolic engine is faster than its predecessor's, so the system can explore alternative solutions quickly. Together, these changes let it handle harder geometry problems than the original AlphaGeometry.
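
The division of labor described above can be sketched in a few lines of Python. This is a toy under loud assumptions, not DeepMind's implementation: facts are plain strings, the "symbolic engine" is simple forward chaining over hand-written rules, and the "language model" is a hypothetical stand-in function (propose_midpoint) that suggests one auxiliary construction. All names here are invented for illustration.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional, Set, Tuple

# A rule says: if all premises are known, the conclusion is known too.
Rule = Tuple[FrozenSet[str], str]
Proposer = Callable[[Set[str], List[str]], Optional[str]]


def deduce(facts: Set[str], rules: Iterable[Rule]) -> Set[str]:
    """Symbolic step: forward-chain until no rule adds a new fact."""
    known = set(facts)
    rule_list = list(rules)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rule_list:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def solve(facts: Set[str], rules: Iterable[Rule], goal: str,
          propose: Proposer, max_constructions: int = 3) -> Tuple[bool, List[str]]:
    """Alternate deduction with proposed auxiliary constructions."""
    rule_list = list(rules)
    known = deduce(facts, rule_list)
    added: List[str] = []
    for _ in range(max_constructions):
        if goal in known:
            break
        suggestion = propose(known, added)  # stand-in for the neural model
        if suggestion is None:
            break
        added.append(suggestion)
        known = deduce(known | {suggestion}, rule_list)
    return goal in known, added


def propose_midpoint(known: Set[str], added: List[str]) -> Optional[str]:
    """Hypothetical proposer: suggests a single fixed construction."""
    candidate = "M is midpoint of BC"
    return None if candidate in known else candidate


# Toy problem: in a triangle with AB = AC, show the base angles are equal.
facts = {"AB = AC"}
rules: List[Rule] = [
    (frozenset({"AB = AC", "M is midpoint of BC"}), "triangle ABM congruent to ACM"),
    (frozenset({"triangle ABM congruent to ACM"}), "angle ABC = angle ACB"),
]
goal = "angle ABC = angle ACB"

solved, constructions = solve(facts, rules, goal, propose_midpoint)
print(solved, constructions)
```

Deduction alone gets stuck on this toy problem because no rule fires from "AB = AC" by itself; once the proposer adds the midpoint, the engine reaches the goal. That is the intuition behind pairing a model that suggests constructions with an engine that only ever draws sound conclusions.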

On the IMO 2024 problems, AlphaProof solved two algebra problems and one number theory problem, while AlphaGeometry 2 solved the geometry problem. The two combinatorics problems remained unsolved. Together, the systems earned 28 of a possible 42 points, with full marks on every problem they solved, which places them at the high end of the silver-medal range. Despite this result, the systems still relied on human experts to translate the competition problems into formal language before they could work on them. Future research aims to develop natural language reasoning systems that can solve such problems without requiring formal language translation.
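
The score arithmetic is easy to check. The short sketch below assumes the standard IMO marking scheme of 7 points per problem, and it uses 29 points as the 2024 gold-medal threshold, a figure taken from DeepMind's announcement rather than from this article; the variable names are illustrative.

```python
POINTS_PER_PROBLEM = 7      # standard IMO marking: each problem is worth 7 points
NUM_PROBLEMS = 6
GOLD_THRESHOLD_2024 = 29    # assumption: as reported in DeepMind's announcement

problems_solved_fully = 4   # two algebra, one number theory, one geometry

score = problems_solved_fully * POINTS_PER_PROBLEM
max_score = NUM_PROBLEMS * POINTS_PER_PROBLEM
gap_to_gold = GOLD_THRESHOLD_2024 - score

print(f"score: {score} / {max_score}, points short of gold: {gap_to_gold}")
```

Four fully solved problems give 28 points, which is why the result sits at the top of the silver range rather than in the gold range.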

The performance of AlphaProof and AlphaGeometry 2 on the IMO 2024 problems marks a significant advance in AI's ability to handle complex mathematical reasoning. The systems still have limitations, most notably their dependence on problems being translated into formal language first. Ongoing research aims to extend their capabilities and to integrate natural language reasoning so that they can tackle a broader range of mathematical challenges.
