In January 2026, Google DeepMind CEO Demis Hassabis delivered one of the most direct critiques yet of OpenAI’s Artificial General Intelligence (AGI) strategy, and the tech world took notice. Industry analysts labeled his comments a “bombshell,” as they openly challenged Sam Altman’s belief that scaling large language models (LLMs) like GPT-5 will inevitably lead to superintelligent systems. Hassabis argues instead that this approach has already hit its limits.
Why DeepMind Thinks Scaling LLMs Won’t Deliver AGI
Speaking on CNBC’s The Tech Download podcast, Hassabis described today’s LLMs as “statistical engines” that excel at predicting text but lack a true understanding of how the world works. According to him, models like ChatGPT do not possess causal reasoning: the ability to understand why events happen, not just describe them. This limitation makes them unsuitable for complex scientific tasks such as discovering new medicines or modeling physical systems.
Hassabis believes that true AGI requires AI systems to develop internal “world models”: simulations of reality that allow machines to reason about physics, cause and effect, and long-term consequences. Without these capabilities, he says, scaling transformers with more data and compute is a technological dead end.
DeepMind’s Shift Toward World Models
To support this vision, DeepMind unveiled two major systems in late 2025. Genie 3 generates interactive 3D environments from text prompts, teaching AI agents how physical worlds behave. SIMA 2, trained inside these environments, reportedly outperforms traditional LLM-based agents by 20–30% on complex reasoning benchmarks.
Hassabis claims that AGI will require at least two “AlphaGo-level breakthroughs” over the next decade, breakthroughs that won’t emerge from simply making today’s language models larger.
Challenging OpenAI’s “PhD-Level” Claims
Hassabis also dismissed Sam Altman’s description of GPT-5 as having “PhD-level intelligence,” calling the claim “nonsense.” He pointed out that current chatbots still struggle with tasks like basic arithmetic or counting when prompts are phrased differently, failures that contradict any notion of true expert-level reasoning.
Why This AI Rivalry Matters
The timing is crucial. Gemini 3.0’s strong performance reportedly triggered a “Code Red” inside OpenAI, marking the first time a competitor has not only matched OpenAI’s capabilities but also publicly challenged its scientific roadmap.
As the race toward AGI intensifies, the divide between DeepMind’s world-model approach and OpenAI’s scaling-first strategy may shape the future of artificial intelligence. Whether AGI emerges through bigger models or fundamentally new architectures remains one of the defining questions of the decade.