
DeepMind and OpenAI claim gold in International Mathematical Olympiad

July 22, 2025

AIs are getting better at maths problems

Andresr/Getty Images

Experimental AI models from Google DeepMind and OpenAI have achieved a gold-level performance in the International Mathematical Olympiad (IMO) for the first time.

The companies are hailing the moment as an important milestone for AIs that might one day solve hard scientific or mathematical problems, but mathematicians are more cautious because details of the models’ results and how they work haven’t been made public.

The IMO, one of the world’s most prestigious competitions for young mathematicians, has long been seen by AI researchers as a litmus test for mathematical reasoning that AI systems tend to struggle with.

After last year’s competition, held in Bath, UK, Google DeepMind announced that AI systems it had developed, called AlphaProof and AlphaGeometry, had together achieved a silver medal-level performance, but its entries weren’t graded by the competition’s official markers.

Before this year’s contest, which was held in Queensland, Australia, companies including Google, Huawei and TikTok-owner ByteDance, as well as academic researchers, approached the organisers to ask whether they could have their AI models’ performance officially graded, says Gregor Dolinar, the IMO’s president. The IMO agreed, with the proviso that the companies waited to announce their results until 28 July, when the IMO’s full closing ceremonies had been completed.

OpenAI also asked if it could participate in the competition, but after it was informed about the official scheme, it didn’t respond or register an entry, says Dolinar.

On 19 July, OpenAI announced that a new AI it had developed had achieved a gold medal score marked by three former IMO medallists separate from the official competition. The AI answered five out of six questions correctly in the same 4.5-hour time limit as the contestants, OpenAI said.

Two days later, Google DeepMind also announced that its AI system, called Gemini Deep Think, had achieved gold with the same score and time limits. Dolinar confirmed that this result was given by the IMO’s official markers.

Unlike Google’s AlphaProof and AlphaGeometry systems, which were crafted especially for the competition and worked with questions and answers written in a computer programming language called Lean, both Google’s and OpenAI’s models this year worked entirely in natural language.

Working in Lean meant the AI’s output could be instantly checked for correctness, but it is harder for non-experts to read. Thang Luong at Google, who worked on Gemini Deep Think, says the natural language approach could produce more understandable answers, as well as being applicable to generally useful AI systems.
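To give a sense of what “instantly checked for correctness” means, here is a minimal sketch in Lean 4. The statements are toy examples chosen for illustration, not IMO problems and not output from AlphaProof; the theorem name is hypothetical. Lean accepts the file only if every proof step is valid, so a reader doesn’t have to follow the argument to trust it.

```lean
-- Toy statement (hypothetical name): addition of natural numbers commutes.
-- Lean's checker accepts this only because `Nat.add_comm` really proves it.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete fact verified by direct computation.
example : 2 + 2 = 4 := rfl
```

The trade-off described above shows even at this scale: the proof is certain to be correct once it compiles, but it reads as code rather than as an explanation.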

Luong says the ability to verify solutions in a large language model has been made possible thanks to progress with reinforcement learning, a training method in which an AI is taught what success looks like and is left to figure out the rules and how to succeed solely through trial and error. This method was key to Google’s previous success with its game-playing AIs, such as AlphaZero.
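The trial-and-error idea can be illustrated with a deliberately small Python sketch. It is not the method used to train Gemini Deep Think or AlphaZero, whose details are not given here; it is a standard epsilon-greedy “bandit” learner, and the function name, success probabilities and parameter values are all assumptions made for the example. The learner is only told whether each attempt succeeded and has to work out for itself which action pays off.

```python
import random


def run_bandit(success_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error on a multi-armed bandit.

    The learner never sees success_probs directly; after each attempt it
    only receives a reward of 1.0 (success) or 0.0 (failure).
    """
    rng = random.Random(seed)
    counts = [0] * len(success_probs)    # attempts made with each action
    values = [0.0] * len(success_probs)  # running mean reward per action
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(success_probs))
        else:
            # Exploit: pick the action that has looked best so far.
            action = max(range(len(values)), key=values.__getitem__)
        reward = 1.0 if rng.random() < success_probs[action] else 0.0
        counts[action] += 1
        # Incremental update of the mean reward for this action.
        values[action] += (reward - values[action]) / counts[action]
    return counts, values


# Hypothetical environment: three actions with hidden success rates.
counts, values = run_bandit([0.2, 0.5, 0.8])
best = max(range(len(values)), key=values.__getitem__)
print("attempts per action:", counts)
print("action with highest estimated reward:", best)
```

Real systems replace the three fixed actions with the enormous space of possible moves or proof steps, but the loop of trying, being scored and adjusting is the same in outline.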

Google’s model also considers multiple solutions at once, in a mode called parallel thinking, as well as being trained on a dataset of maths problems specifically useful for the IMO, says Luong.

OpenAI has released few details on its system, beyond saying that it also uses reinforcement learning and “experimental research methods”.

“The progress is promising, but not performed in a controlled scientific fashion, and so I will not be able to assess it at this stage,” says Terence Tao at the University of California, Los Angeles. “Perhaps once the companies involved release some papers with more data, and hopefully enough access to the model for others to replicate the results, one can say something more definitive, but, for now, we largely have to trust the companies themselves for the claimed results.”

Geordie Williamson at the University of Sydney in Australia agrees. “I think it is remarkable that this is where we’re at. It is frustrating how little detail outsiders are provided with regarding internals,” says Williamson.

While systems working in natural language could be useful for non-mathematicians, they could also present a problem if models produce long proofs that are hard to check, says Joseph Myers, one of the organisers of this year’s IMO. “If AIs are ever to produce solutions to significant unsolved problems that might plausibly be correct but might also have a few subtle but fatal errors hidden accidentally, or potentially deliberately from a misaligned AI, having those AIs also generate a formal proof is key to having confidence in the correctness of a long AI output before attempting to read it.”

Both companies say that, in the coming months, they will offer these systems for testing to mathematicians at first, before releasing them to the wider public. The models could soon help with harder scientific research problems, says Junehyuk Jung at Google, who worked on Gemini Deep Think. “There are going to be many, many unsolved problems within reach,” he says.


Source link : https://www.newscientist.com/article/2489248-deepmind-and-openai-claim-gold-in-international-mathematical-olympiad/?utm_campaign=RSS%7CNSNS&utm_source=NSNS&utm_medium=RSS&utm_content=home

Publish date : 2025-07-22 17:05:00

Copyright for syndicated content belongs to the linked Source.

© 2022 NewsHealth.