
A shot of the graph view of my LLM wiki
In April 2026, OpenAI cofounder Andrej Karpathy, now a Member of Technical Staff at Anthropic, posted an architectural overview on GitHub. The project itself, which he called an “LLM wiki,” has generated significant buzz, though nothing approaching OpenClaw, the agent framework that became one of the most successful projects in GitHub history. The tweet that preceded it cleared more than 21 million views at the time of writing. The gist itself has thousands of stars and a comment thread that reads like a small open-source conference: secure-llm-wiki, AutoSci, Dense-Mem, LLM-Wiki-v3, memwiki and a dozen other forks.
The basic gist of the gist is relatively simple. Instead of pointing a model at a folder of raw documents and retrieving chunks on every query, which is essentially how RAG works, you have the model maintain a persistent, interlinked set of markdown pages that sit between you and the sources. Add a new source and the model reads it, updates the relevant entity and concept pages, flags where it contradicts what is already there, and files a log entry. The knowledge gets compiled once and kept current rather than rediscovered from scratch each time you ask.
If you can faintly recall a fact from months ago and bothered to add it to the wiki, an LLM can retrieve it in moments.
What I built with coding agents
My raw layer in my private LLM wiki is a mix of articles, interview transcripts, press releases, PDFs, system cards and more. The wiki layer is built on Quartz, an open-source, customizable static-site generator. My site now has roughly 760 pages. (The video here shows an outdated count of about 200.) It is organized into companies, topics, products and platforms, sources, people and institutions.
About 430 of those have been reviewed enough to appear on the public site; the rest are drafts and skeletons the build keeps hidden.
How do you prevent it from turning into a tangled mess? Glad you asked. A schema document lays out the conventions for the agent, spelling out which section headings are allowed. It also dictates that every factual sentence needs an inline citation, that a claim without a source is a defect rather than a stylistic choice. A bank of lint scripts enforces discipline, so a page that breaks the rules fails a gate instead of slipping through on the agent’s good intentions.
Because I brainstorm with agents to help organize my thoughts, I wired the whole thing to an MCP server. This means the model I am talking to (now Claude on the web) can search the wiki, read pages, pull in new sources, and propose edits through a guarded intake path instead of writing files directly. Maybe this was superfluous, since you can do much the same with coding agents directly in the repository. But it is still a bit useful to bring the functionality to the web.
Can it be somewhat automated to amass more knowledge over time? Yes, but I haven’t fully wired it yet. I do have a monitoring loop that now runs on its own, appending dated, sourced facts to pages that have already been reviewed, and every automated edit writes a ledger entry I can audit after the fact. Anything heavier, such as a new page or a restructure, still goes through the semi-manual path.
So is it worth it?
At one month, the time I spend maintaining the wiki and the time it saves me are roughly a wash. I find it helps me synthesize information on technical subjects where it has good coverage. For instance, one thing that used to bug me was asking an LLM to synthesize data on other LLMs. If you ask a frontier model which models are the most powerful, you often get a stale or partial answer. It will tell you about the newest model from one lab but might miss another, or fail to mention a relatively strong entry from, say, a Chinese company. After I added primary system cards and third-party coverage, I can get a pretty good overview, though sites like Artificial Analysis already do this and go deeper than my wiki does.
The other caveat is that the wiki doesn’t update itself by default. I have to keep updating it, but that is a pretty light lift and easy to fold into my normal workflows.
If you are doing research that is already well covered on the open web, I don’t think the effort is likely worth it unless you are really going deep on a given subject.
One month in, the LLM wiki is moderately helpful and promising for narrow cases. In the long term, it could either continue to get better or grow so much that its scale demands a new architecture. Because I am not breaking new ground and there are a lot of people out there building their own LLM wikis, the risk is low that the whole project dissolves into a pile of rubble.
The LLM wiki also doesn’t remove the need for manual fact-checking, but so far, the results have been pretty solid because I am asking the model to basically summarize from other documents. That doesn’t mean there is zero risk of hallucination, but there are ways to reduce it, ranging from manual review to adversarial review by agents that draw on the source material.




Tell Us What You Think!
You must be logged in to post a comment.