
Letta
The original promise of the AI “agent,” a concept that took shape in the 1980s and 1990s, was one of stateful intelligence: a system that could learn from its environment because it maintained an internal memory of it. This core idea, however, has become muddled following the rise of “agents” built on large language models. Examples range from “Deep Research” tools that can summarize hundreds of articles on a given topic to genAI-based command-line tools that can modify files on your computer.
Among the first research efforts to tackle the problem of transformer-based AI amnesia was a foundational 2023 preprint titled “MemGPT: Towards LLMs as Operating Systems.” Authored by then-UC Berkeley Ph.D. students Charles Packer and Sarah Wooders along with their colleagues, the paper proposed a novel architecture. Drawing inspiration from the virtual memory systems of traditional computers, MemGPT gives the LLM itself the ability to manage a memory hierarchy, intelligently shuttling information between its limited, active context window (analogous to RAM) and external storage. This gives the agent a form of persistent, long-term memory, enabling it to learn from interactions over time.
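To make the operating-system analogy concrete, the following minimal Python sketch illustrates the core idea; the class and method names are invented for this example and are not Letta's actual API. The context window plays the role of scarce RAM, and the agent pages facts to and from an unbounded archival store through self-directed tool calls.

```python
# Minimal illustration of a MemGPT-style memory hierarchy. The names
# here are hypothetical for this sketch, not Letta's actual API.

class MemoryHierarchy:
    """Pages facts between a small 'main context' (the LLM's context
    window, analogous to RAM) and unbounded external storage
    (analogous to disk)."""

    def __init__(self, context_limit: int = 8):
        self.context_limit = context_limit
        self.main_context: list[str] = []  # what the LLM actually sees
        self.archival: list[str] = []      # external, persistent store

    def remember(self, fact: str) -> None:
        """Exposed to the LLM as a tool call for persisting a fact."""
        self.main_context.append(fact)
        if len(self.main_context) > self.context_limit:
            # Evict the oldest fact to external storage instead of
            # silently truncating it, so nothing is ever lost.
            self.archival.append(self.main_context.pop(0))

    def recall(self, query: str) -> list[str]:
        """Exposed to the LLM as a tool call for paging facts back in.
        A real system would use embedding search, not substring match."""
        return [f for f in self.archival if query.lower() in f.lower()]


memory = MemoryHierarchy(context_limit=2)
memory.remember("User is allergic to peanuts.")
memory.remember("User prefers vegetarian recipes.")
memory.remember("User is cooking for four.")  # evicts the allergy fact

# Even in a fresh conversation, the evicted fact is still retrievable:
print(memory.recall("peanuts"))  # ['User is allergic to peanuts.']
```

The key design choice is that eviction is explicit rather than silent: information pushed out of the context window lands in persistent storage instead of simply falling off the end of the prompt.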

Charles Packer, Ph.D.
As Packer, now the co-founder and CEO of the startup Letta, says of the MemGPT paper: “It was really the first research paper and open-source code that was addressing the memory issue with language models.” Packer goes on to highlight the paradox of language models that can emulate human-like intelligence yet fundamentally lack memory. If a user tells a model in one conversation, “I’m allergic to peanuts,” and then starts a fresh conversation asking for recipe suggestions, that health information will be absent. “So these models have no way to remember or learn other than just putting more and more into the context window,” Packer said.
To take the MemGPT concept from academic research to an enterprise-grade solution, Packer and Wooders co-founded Letta, whose open-source framework is available on GitHub. Their singular focus on solving the AI memory problem has attracted prominent AI figures as angel investors, including Jeff Dean, head of Google AI, and Clem Delangue, CEO and co-founder of Hugging Face.
Because the models are getting really, really good. So I think a lot of these hallucination issues are actually not model issues, they’re context window issues. —Sarah Wooders
An ongoing problem with R&D ramifications
This ad-hoc approach of simply “stuffing” the context window is a primary reason why today’s agents can feel unreliable. Without a structured memory, an agent can abruptly go off the rails during a long task or have its effective memory wiped clean if the program crashes. More pervasively, this lack of stateful memory is a key driver of hallucinations, also known as confabulations in an AI context. “Having memory, and explicit memory, grounds the model and helps prevent hallucinations,” Packer elaborated. “Because if you don’t have memory, what you’re doing is you’re basically just stuffing more and more into the context window, and it gets dirtier and dirtier and more confusing… of course it’s going to be confused, of course it’s going to hallucinate.”
Even the memory systems built into today’s state-of-the-art products are not foolproof. Companies ranging from OpenAI to Sesame AI have invested tidy sums to give their models the ability to remember user preferences and facts, yet these systems continue to lag far behind human memory, which is itself far from infallible. According to Sarah Wooders, Letta’s co-founder and a lead author on the MemGPT paper, when these advanced systems fail, the root cause is often not a flaw in the model’s reasoning, but a corruption of the data it’s actively working with. “I feel like whenever I’ve seen models or agents hallucinate, oftentimes you can actually tie it back to some issue in the context window,” Wooders said. The problems can take various forms: “Maybe contradictory information in the context window, or maybe information that you thought was in the context is actually not there.”

Sarah Wooders, Ph.D.
The resolve to address this memory problem is underscored by a surge in AI’s adoption within research and development. An analysis of 154 recent international patents for AI-powered research tools shows that patent filings in the category jumped by nearly 70% in 2024 compared to the previous year. A significant portion of these patents (more than 20%, or 32 separate inventions) covered lab automation, an area where stateless “throwaway scripts” are common but where a persistent, learning agent could be transformative.
This is the kind of high-stakes environment Packer envisions for stateful agents. While they may not invent new science from scratch, their ability to remember and cross-reference information makes them ideal for tackling one of R&D’s biggest challenges: reproducibility. “One of the powers you have… is that they can actually be very good at maybe not inventing new science, but they can be very good at verifying existing work,” Packer said. This capability is key in fields like biology where results can be hard to reproduce. An agent with persistent memory could track experimental parameters, compare results across multiple runs, and flag inconsistencies. In essence, it could act as a tireless digital lab assistant.
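As a toy illustration of that role (the record format, values, and threshold here are invented for this example, not drawn from Letta), an agent whose persistent memory accumulates structured records of each run could flag results that drift from the rest:

```python
# Toy sketch of a memory-backed reproducibility check. The record
# format, values, and threshold are invented for illustration.
from statistics import mean, stdev

# Records an agent with persistent memory might accumulate across sessions:
run_memory = [
    {"experiment": "assay-42", "yield_pct": 71.2},
    {"experiment": "assay-42", "yield_pct": 70.8},
    {"experiment": "assay-42", "yield_pct": 71.5},
    {"experiment": "assay-42", "yield_pct": 70.9},
    {"experiment": "assay-42", "yield_pct": 54.1},  # suspicious outlier
]

def flag_inconsistencies(runs, z_threshold=1.5):
    """Flag runs whose result deviates strongly from the rest.
    A production system would use a more robust statistical test."""
    values = [r["yield_pct"] for r in runs]
    mu, sigma = mean(values), stdev(values)
    return [r for r in runs
            if sigma > 0 and abs(r["yield_pct"] - mu) / sigma > z_threshold]

for run in flag_inconsistencies(run_memory):
    print(f"Possible reproducibility issue: {run}")
```

The point is not the statistics, which a real deployment would do far more carefully, but that the comparison is only possible because the agent remembers every prior run instead of starting each session from scratch.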
The implications extend beyond the lab to the core of how businesses operate. Packer points to Letta’s deployment with the company Bilt Rewards as a “sneak peek into what the future is.” There, agents with MemGPT-style memory watch user transactions to build rich profiles, replacing a traditional black-box recommender system. He argues that this represents a fundamental shift in enterprise data. “In the future, when a company wants to understand what’s going on with their customers, the canonical information is not in Salesforce. It’s in an agent’s memory.”
‘Sleeptime compute,’ or when models use subconscious agents to think 24/7
This potential of agents that can remember has led Packer and Wooders to push their own R&D beyond simple memory retrieval into a new frontier they call “sleeptime compute.” The concept is an evolution of the industry’s focus on “test-time compute,” where models think longer to get a better answer. With sleeptime compute, the agent thinks all the time. “Why doesn’t it reflect on everything we’ve ever talked about? Why doesn’t it attempt to solve my problems in advance?” Packer mused. “That’s really one of the main angles behind sleeptime compute. You have these agents think all the time, not just when the user is waiting for them.”
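A minimal sketch of what such a loop might look like is shown below; the function names, reflection logic, and interval are all hypothetical stand-ins, with a real system delegating the reflection step to a background LLM agent.

```python
# Hypothetical sketch of a "sleeptime compute" loop; the function
# names, reflection logic, and interval are invented stand-ins.
import threading
import time

IDLE_REFLECT_INTERVAL = 60  # seconds between background reflections

def reflect(history: list[str]) -> str:
    """Stand-in for a background LLM call that consolidates raw
    conversation history into compact, well-organized memory."""
    return f"Consolidated {len(history)} message(s) into memory."

def sleeptime_loop(history: list[str], memory: list[str],
                   stop: threading.Event) -> None:
    """Runs whether or not the user is present: the agent 'thinks'
    during idle time, so the next interaction starts from digested
    memory rather than a raw transcript."""
    while not stop.is_set():
        if history:
            memory.append(reflect(history))
        stop.wait(IDLE_REFLECT_INTERVAL)

history, memory = ["I'm allergic to peanuts."], []
stop = threading.Event()
threading.Thread(target=sleeptime_loop,
                 args=(history, memory, stop), daemon=True).start()
# ... the foreground agent serves requests as usual, while the
# background agent keeps refining memory ...
time.sleep(1)
stop.set()
print(memory)  # ['Consolidated 1 message(s) into memory.']
```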
Beyond enabling futuristic capabilities like “sleeptime compute,” this structured memory approach offers a more immediate advantage for any organization deploying AI today: interpretability. When an AI operates within a massive, undifferentiated context window, its failures become opaque mysteries. By creating explicit, well-managed memories, the MemGPT architecture instead acts as a flight recorder, allowing developers and scientists to understand exactly why an agent made a mistake.
Packer frames this as a fundamental difference in design philosophy: “Let’s say you might not ever really drive hallucinations to zero. I mean, humans still hallucinate, right? But what happens when a hallucination does actually occur?”
Another very good thing about explicit memory and well-written memories is that it actually makes everything a lot more interpretable. —Packer
But if you have “a very well-processed memory and your context window is very organized,” Packer said, “then you can just have a human look inside and quickly skim and say, ‘Oh, actually the reason it hallucinated is because the memory was wrong,’ or ‘the reason it hallucinated is because the tool that’s in here is wrong.'” That kind of diagnosis is far harder when all a reviewer has to work with is the raw input and output of a large language model.
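A simplified example of why that skimming is possible (the memory schema below is invented for illustration, not Letta's actual data model): when memory is split into labeled, human-readable blocks, each one can be audited independently.

```python
# Illustrative only: a memory organized into labeled blocks (this is
# not Letta's actual schema) that a human can audit block by block.
agent_memory = {
    "persona": "You are a helpful cooking assistant.",
    "user_facts": [
        "User is allergic to peanuts.",
        "User is allergic to shellfish.",  # stale entry: easy to spot here
    ],
    "tools": ["search_recipes", "list_pantry"],
}

# After a bad output, a developer skims each block directly and can
# pinpoint whether a memory or a tool was at fault, instead of
# scrolling through one monolithic transcript.
for block, contents in agent_memory.items():
    print(f"[{block}] {contents}")
```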
When memory becomes the core business asset
Looking ahead, Packer envisions a transformation in how businesses operate and understand their customers. “Right now, every single business that has customer data stores the customer data in a very normalized fashion in a CRM or database. It’s all partitioned, and any software you’re writing that’s going to do analysis on your customers is doing analysis on this very structured database or CRM,” he explains.
But this paradigm is about to shift. “I think what we’re going to see as everything becomes more and more driven by AI is that actually the canonical form of business data about your customer will be memories of your customer.”