
A line on page 3 of OpenAI’s latest “Preparedness Framework” (Version 2, updated April 15, 2025) signals a potential paradigm shift for the R&D ecosystem: AI is quickly moving from eager, if not always accurate, intern to potential colleague, or even lead researcher.
Looking ahead, the framework grapples with the potential for AI to become “recursively self improving.” It warns that the resulting “major acceleration in the rate of AI R&D” could rapidly introduce new capabilities and risks, potentially outpacing current safety measures and rendering oversight “insufficient.” The framework explicitly flags the challenge of “maintaining human control” over the AI system itself.
Speaking at a Goldman Sachs event just weeks earlier, on March 5 (the video was posted to YouTube on April 11), OpenAI CFO Sarah Friar reinforced this view, stating that models are already “coming up with novel things in their field” and moving beyond merely reflecting existing knowledge to “extend that.” Friar further noted the rapid approach toward Artificial General Intelligence (AGI), suggesting, “We may be there.”
While acknowledging the ongoing debate (some experts balk at even the term AGI, let alone its feasibility, at least with large language models), Friar cited CEO Sam Altman’s view that AGI, meaning AI that can handle most valuable human work, could be “imminent.” This suggests the transition from AI as a tool for researchers to AI as a researcher may be closer than many realize, with early examples potentially emerging in fields like software development.
Leading R&D institutions are actively building ‘autonomous research’ capabilities. For example, national laboratories like Argonne and Oak Ridge are developing ‘self-driving labs’ specifically designed for materials science and chemistry. Los Alamos is also working with OpenAI, testing the company’s reasoning models in energy and national security applications on the lab’s Venado supercomputer.
In general, national labs are exploring the use of AI to take on core research tasks: generating hypotheses (often via optimization strategies), designing multi-step experiments, controlling robotic execution, analyzing results in real time, and iterating toward discovery goals with significantly reduced human intervention within specific operational domains. While still requiring human oversight for validation and strategic direction, functioning perhaps at a ‘Level 3’ or emerging ‘Level 4’ of research autonomy (see the framework below), such initiatives demonstrate AI moving beyond passive data analysis to participate directly in the scientific discovery process.

This push extends beyond building integrated systems; it involves directly empowering researchers, as seen in the recent DOE ‘1,000 Scientist AI Jam.’ This large-scale collaboration brought together some 1,500 scientists across multiple national labs, including Argonne, to test advanced AI reasoning models from companies like OpenAI and Anthropic on real-world scientific problems. Researchers specifically explored the models’ potential to enhance tasks like hypothesis generation and experiment automation.
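To make that loop concrete, here is a minimal sketch of a closed-loop driver in the spirit of a ‘self-driving lab.’ It is an illustration under stated assumptions: the function names (propose_candidate, run_experiment, discovery_loop) and the toy objective are invented for this example and do not reflect Argonne’s, Oak Ridge’s, or Los Alamos’s actual software.

```python
import random

# A toy closed-loop "discovery" driver illustrating the cycle described
# above: propose a hypothesis, run the experiment, analyze the result,
# iterate. Every name and the toy objective are hypothetical
# illustrations, not any lab's actual API.

def propose_candidate(history):
    """Hypothesis generation: pick the next experimental parameters.
    Real systems use Bayesian optimization or learned models; this
    sketch just perturbs the best parameters seen so far."""
    if not history:
        return {"temperature": 300.0, "concentration": 0.5}
    best_params, _ = max(history, key=lambda h: h[1])
    return {k: v + random.gauss(0.0, 0.05 * abs(v) + 0.01)
            for k, v in best_params.items()}

def run_experiment(params):
    """Stand-in for robotic execution: a noisy toy objective. In a
    self-driving lab this call would dispatch to real hardware."""
    score = (-((params["temperature"] - 315.0) ** 2) / 100.0
             - (params["concentration"] - 0.7) ** 2)
    return score + random.gauss(0.0, 0.05)

def discovery_loop(budget=50, goal=-0.1):
    """Iterate toward the discovery goal with reduced human
    intervention, stopping when the goal is met so a human can
    validate the hit."""
    history = []
    for _ in range(budget):
        params = propose_candidate(history)   # hypothesis generation
        result = run_experiment(params)       # robotic execution
        history.append((params, result))      # real-time analysis
        if result >= goal:                    # goal reached: hand off
            break                             # to human validation
    return max(history, key=lambda h: h[1])

if __name__ == "__main__":
    best, score = discovery_loop()
    print(f"Best candidate {best} scored {score:.3f}")
```

Real systems replace the random perturbation with serious optimizers and the toy objective with instrument control, but the propose-run-analyze-iterate skeleton is the same.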
A similar transition is already afoot in software development, although developers currently have mixed views about the potential of genAI-enabled tools. Today’s AI often serves as an assistant, but the technology is quickly upping its software game, especially for common languages ranging from JavaScript to Python. OpenAI’s models are demonstrating significant progress, “approaching human level” on key benchmarks, with Friar noting one is already “literally the best coder in the world.” This underpins the potential Friar described for an “agentic software engineer,” an AI that “can go out and do work independently for you,” including building, testing, and documenting applications. Such a shift toward more autonomous capabilities could reshape the field entirely.
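As a rough sketch of what ‘agentic’ could mean in practice, the loop below shows one common pattern: propose a patch, apply it, run the tests, repeat until they pass. It is a hedged mock-up; the llm_propose_patch placeholder and the overall scaffold are invented for illustration and are not OpenAI’s tooling.

```python
import subprocess

# An illustrative plan -> edit -> test loop for a hypothetical "agentic
# software engineer." llm_propose_patch is a placeholder for a model
# call; nothing below reflects OpenAI's actual products or APIs.

def llm_propose_patch(task, test_output):
    """Placeholder: ask a model for a unified diff that addresses the
    failing tests. A real agent would call an LLM API here."""
    raise NotImplementedError("wire this up to your model of choice")

def apply_patch(diff_text):
    """Apply a unified diff to the working tree using git."""
    subprocess.run(["git", "apply", "-"], input=diff_text,
                   text=True, check=True)

def run_tests():
    """Run the project's test suite; return (passed, combined output)."""
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def agent_loop(task, max_iterations=5):
    """Build-test-fix until the suite passes or the budget runs out,
    leaving the resulting diff for a human to review and merge."""
    passed, output = run_tests()
    for _ in range(max_iterations):
        if passed:
            break
        patch = llm_propose_patch(task, output)  # plan and edit
        apply_patch(patch)                       # build
        passed, output = run_tests()             # test
    return passed
```

The key design point is the human hand-off at the end: the agent iterates autonomously, but a person reviews the final diff before it merges.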
OpenAI’s 5-level AI maturity framework
OpenAI reportedly uses an internal five-level framework to benchmark its progress toward AGI. This structure, discussed within the company in mid-2024 and later reported by outlets like Bloomberg, outlines distinct stages of AI capability (a simple code encoding follows the list):
- Level 1: Chatbots / Conversational AI: Systems adept at natural language, like ChatGPT.
- Level 2: Reasoners: AI capable of basic problem-solving comparable to a highly educated human; at this level, models can also demonstrate emerging reasoning skills without external tools.
- Level 3: Agents: Autonomous AI systems that can manage complex tasks and make decisions over extended periods on behalf of users.
- Level 4: Innovators: AI contributing significantly to creativity and discovery by generating novel ideas, assisting invention, or driving breakthroughs.
- Level 5: Organizations: The apex stage, where AI can manage and operate the complex functions of an entire organization, potentially exceeding human efficiency.
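For readers who want to refer to these stages in code, one simple encoding is an ordered enum. The names and comments below merely transcribe the reported ladder; this is not an official OpenAI artifact.

```python
from enum import IntEnum

class AGIStage(IntEnum):
    """The reported five-level ladder, encoded so stages can be
    compared and ordered. Unofficial; paraphrases the list above."""
    CHATBOTS = 1       # conversational AI adept at natural language
    REASONERS = 2      # basic problem-solving at an educated-human level
    AGENTS = 3         # autonomous multi-step task execution
    INNOVATORS = 4     # novel ideas, inventions, breakthroughs
    ORGANIZATIONS = 5  # running an entire organization's functions

# IntEnum preserves ordering, so capability comparisons read naturally,
# e.g. the 'research autonomy' discussed earlier sits between these two:
assert AGIStage.AGENTS < AGIStage.INNOVATORS
```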