Research & Development World

xAI releases Grok 4, claiming Ph.D.-level smarts across all fields

By Brian Buntz | July 10, 2025

Benchmarks shared in the xAI demo

Two years ago, Elon Musk signed an open letter urging a six-month pause on training AI systems more powerful than OpenAI’s GPT-4, citing “profound risks to society and humanity.” Now, in a move some might call ironic, his AI venture, xAI, has released Grok 4, a model he claims could invent new technologies by year’s end and discover new physics within two years.

Musk acknowledged the swift pace of progress but said he was committed to witnessing the outcome, for better or worse. “I think it’ll be good, most likely it’ll be good,” he mused during Wednesday’s livestreamed unveiling. “But even if it wasn’t going to be good, I’d at least like to be alive to see it happen.”

The rollout follows backlash over Grok 3, which posted antisemitic content on X praising Adolf Hitler and calling itself “MechaHitler,” leading xAI to scrub the messages and issue a statement. The company stated it was “aware” of the “inappropriate posts” made by Grok and was “actively working to remove” them.

Musk says Grok 4 can ace many academic tests

But the focus of Wednesday’s presentation was squarely on Grok 4’s smarts. Musk claimed that the model is already superhuman in academia. “With respect to academic questions, Grok 4 is better than Ph.D. level in every subject, no exceptions,” he stated.

Grok 4 is smarter than almost all graduate students in all disciplines simultaneously. —Musk

Hype often eclipses reality in new AI model unveilings, and Musk’s predictions warrant context. While his companies are undoubtedly influential, he is also known for ambitious timelines that often slip, from full self-driving cars to Mars colonies. His Grok 4 claims also echo the broader AI industry’s tendency toward hyperbole, with company after company hailing its latest model as market-leading. Meanwhile, genAI systems continue to grapple with limitations. Current models, including those from OpenAI, Anthropic, and Google, still struggle with persistent memory across conversations and are prone to hallucinations, confidently stating incorrect information as fact. xAI hasn’t released data on Grok 4’s accuracy or hallucination rates, and such hurdles remain unsolved across all frontier models to date, casting doubt on claims of reliable “superhuman” academic performance from a bot.

Musk, however, noted that Grok 4’s capabilities extend beyond tests, predicting it will soon tackle real-world challenges. “I think it may discover new technologies as soon as later this year, and I would be shocked if it is not done so next year,” he said. “It might discover new physics next year, and within two years, I’d say almost certainly.”

Increased RL focus

xAI research scientist Tony Wu highlighted the model’s training advances, noting a shift from pre-training to a heavy emphasis on reasoning and reinforcement learning. “From Grok 3 to Grok 4, we’re putting a lot of compute into reasoning and RL,” Wu said. He added that with tools and multi-agent systems in Grok 4 Heavy, the model solved over 50% of text-based problems on the tough Humanity’s Last Exam benchmark, a notable leap from single-digit accuracy for earlier models.

Musk attributed the leap to massive compute scaling, stating that xAI increased training by an order of magnitude from Grok 2 to Grok 3, and by another order of magnitude from Grok 3 to Grok 4. “It’s 100 times more training than Grok 2, and that’s only going to increase,” Musk said. “In some ways, it’s a little terrifying, but the growth of intelligence here is remarkable.”

xAI co-founder Jimmy Ba echoed the scale-up, crediting the company’s Colossus supercomputer, expanded to 200,000 GPUs, for enabling 10 times more compute in reinforcement learning than any rival model. “This is literally the fastest-moving field,” Ba noted.

Demos range from black hole models to video games

Demos showcased practical applications. One showed Grok 4 excelling at Andon Labs’ Vending-Bench, an AI business simulation in which the model managed inventory and contracts and finished with double the net worth of rival models. Musk reacted with characteristic humor: “It’s great to see that we’ve now got a way to pay for all those GPUs,” he joked. “We just need a million vending machines and make $4.7 billion a year. Let’s go!”

Voice mode also received significant upgrades. After a demo showed snappier, more natural conversation than competitors’ voice modes, Jimmy Ba explained the team’s philosophy: “We were shooting for something more calm, smooth, more natural, versus something that’s more poppy or artificial.”

The roadmap targets key R&D pain points. A specialized coding model is expected “in a few weeks.” The forthcoming Version 7 foundation model will boost multimodal understanding, leading to powerful video generation. Musk set ambitious creative timelines: “I would expect the first really good AI video game to be next year,” he predicted, “and probably the first watchable AI movie next year.”

Access requires a SuperGrok Heavy subscription, while the API is live for developers. But as models outpace human-designed tests, Musk argued that a new benchmark is needed. “The one thing that is an excellent judge of things is reality,” he concluded. “Because physics is a law, ultimately everything else is a recommendation… The ultimate test for an AI is reality.”

Copyright © 2026 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media