Research & Development World


Meet Reactor Mk.1, an LLM developed under $1M

By Brian Buntz | January 28, 2025

The AI landscape has long been dominated by a “bigger is better” mantra. The basic thinking is that bigger and better GPUs, more sophisticated algorithms, and more data translate into better performance. And while that is generally true, the assumption can be simplistic. “DeepSeek is really, really good,” wrote one anonymous Google employee on the site Blind. “I can’t believe that it is open source and free.”

But while DeepSeek is perhaps the most famous entrant to demonstrate that novel approaches, such as a reinforcement-learning-first pipeline, can train a model on a budget markedly smaller than those of OpenAI, Google, or Anthropic, it is not the only one. ARC’s Reactor Mk.1 was trained on eight NVIDIA L4 and four NVIDIA A100 GPUs for under $1 million, and it reportedly outperforms offerings from OpenAI, Anthropic, Meta, and Google on key benchmarks.

[From https://reactor.arc.ai/chat]

A spokesperson for ARC noted, “ARC trained the model with just eight NVIDIA L4 and four NVIDIA A100 GPUs and under $1m—compared to DeepSeek’s reported $6m training costs.” He continued: “ARC’s overall MMLU score was 92.9% compared to 88.5% for DeepSeek V3. MMLU, which stands for Massive Multitask Language Understanding, is a comprehensive benchmark designed to evaluate the performance of AI language models across a wide range of subjects and tasks.”

In quick, unscientific testing by the author, it passed the infamous “strawberry” question (correctly counting three r’s in the word) and concluded that 8.9 is larger than 8.11, a comparison earlier LLMs sometimes got wrong. When asked to draft a Python function to optimize energy use in a data center where server workloads fluctuate hourly, with the inclusion of edge cases for power grid failures and renewable energy variability, it provided five basic steps with an explanation for each and outlined a Python function with placeholders. While the model effectively accounted for edge cases like grid failures and renewable variability in its scripted outline, it stumbled on a minor syntax detail by omitting a closing brace. By contrast, Claude 3.5 Sonnet handled the same task without that syntax slip. DeepSeek R1 was not available for testing at the time of writing.
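For reference, the kind of function that prompt asks for might be sketched as follows. This is the author’s own illustrative sketch, not ARC’s output; every name and number in it is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HourlyConditions:
    workload: float            # fraction of peak server demand, 0.0-1.0
    renewable_supply_kw: float # renewable power available this hour
    grid_available: bool       # False during a grid failure

def plan_energy_use(conditions: HourlyConditions,
                    server_capacity_kw: float,
                    battery_reserve_kw: float) -> dict:
    """Decide how much load to serve and where the power comes from."""
    demand_kw = conditions.workload * server_capacity_kw

    if not conditions.grid_available:
        # Grid failure: fall back to renewables plus battery reserve,
        # shedding load if combined supply cannot cover demand.
        available = conditions.renewable_supply_kw + battery_reserve_kw
        served = min(demand_kw, available)
        return {"served_kw": served,
                "shed_kw": demand_kw - served,
                "source": "renewable+battery"}

    # Normal operation: prefer renewables, top up from the grid.
    from_renewable = min(demand_kw, conditions.renewable_supply_kw)
    return {"served_kw": demand_kw,
            "shed_kw": 0.0,
            "from_grid_kw": demand_kw - from_renewable,
            "source": "renewable+grid"}
```

A real controller would also need forecasting and battery state-of-charge tracking, but the branch structure above covers the two edge cases the prompt names.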

For the sake of comparison, the MMLU score for DeepSeek V3 (the model released before R1) was 88.5%. The breakdown of various players’ scores is as follows:

BENCHMARK PERFORMANCE SCORES OF REACTOR MK.1 AND OTHER MODELS ON MMLU, HUMANEVAL, AND BBH (reproduced from the arXiv preprint)

Vendor     Model          MMLU     HumanEval   BBH
ARC        Reactor Mk.1   92.9%    91%         88%
OpenAI     GPT-4o         88.7%    90.2%       83.1%
Anthropic  Claude         86.8%    84.9%       —
Meta       Llama 3        86.1%    84.1%       —
Google     Gemini         81.9%    71.9%       83.6%
OpenAI     GPT-3.5        70%      48.1%       66.6%
Mistral    8×22B          77.75%   —           —
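The MMLU figures above are accuracies: the share of multiple-choice questions a model answers correctly, commonly macro-averaged across the benchmark’s 57 subjects. A minimal sketch of that scoring, with invented data:

```python
def mmlu_score(results_by_subject):
    """Macro-average accuracy: mean of per-subject accuracies.

    results_by_subject maps subject name -> list of booleans
    (True = question answered correctly).
    """
    per_subject = [sum(r) / len(r) for r in results_by_subject.values()]
    return sum(per_subject) / len(per_subject)

sample = {
    "physics": [True, True, False, True],  # 75% correct
    "history": [True, False],              # 50% correct
}
score = mmlu_score(sample)  # (0.75 + 0.50) / 2 = 0.625
```

Note that macro-averaging weights every subject equally regardless of how many questions it contains; a micro-average over all questions would give 4/6 ≈ 0.667 here instead.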

Reactor Mk.1’s competitive edge appears to stem from its efficient architecture, built on the Lychee AI engine described in the arXiv paper. The paper notes that the performance was “accomplished with a handful of GPUs,” suggesting a potential breakthrough in training efficiency. It adds: “The benchmark scores indicate that the ARC Reactor Mk. 1 not only outperforms in understanding and generating code but also demonstrates huge performance in reasoning and handling challenging language tasks. These results position the ARC Reactor Mk. 1 as a leading model in the current state of the art of AI technology.”

