Google evaluated Gemini Deep Research’s capabilities on two benchmarks, HLE and DeepSearchQA. According to the company, Gemini Deep Research achieved record performance on both tests.
Nous Research's open-source Nomos 1 AI model scored 87/120 on the notoriously difficult Putnam math competition, ranking ...
When OpenAI announced its new shopping search capabilities, I took the news with a grain of salt (perhaps the whole shaker). For the past decade, we have watched the slow evolution of traditional ...
Homelessness is not a market-incentive problem. It is a challenge produced by intersecting failures in housing, affordability ...
Elon Musk has confirmed that the next major AI model from xAI, Grok 4.20, will launch within the next 3–4 weeks. The model ...
The AI giant has released GPT-5.2, a new frontier model for professional knowledge work, long-running agents, and complex ...
An engineer for New York Times Games has been trying to teach artificial intelligence to understand wordplay more like a human.
As AI takes center stage, the real win is making sure we can actually trust its decisions — and that’s why verifiable AI is ...
According to the study, generative AI produces a moderately strong positive effect on higher-order thinking overall. Across ...
The new framework from Tongyi Lab enables agents to create their own training data by exploring and interacting with new software environments.
Amazon Web Services Inc. envisions a world in which billions of AI agents will be working together. That will take a significant advance in frontier model reasoning, and the company made several major ...
DeepSeek version 3.2 packs 671B parameters with 37B active at inference, giving you faster tool use and lower run costs on ...