The Problem with Asking AI the Same Question Twice
Someone on Hacker News asked the question that stops most people cold. "Isn't this just kicking the can down the road?" They were responding to Andrej Karpathy's LLM Wiki, a pattern that proposes we stop using RAG and start letting AI build persistent knowledge bases instead. The critique was sharp: unless the wiki stays fully in context, the LLM just re-reads the wiki instead of the source files. And won't errors accumulate as we start regurgitating second-order information?
Karpathy posted his idea file on April 3, 2026. It went viral immediately. The OpenAI co-founder and former Tesla AI director described something that sounds simple: dump raw documents into a folder, have an LLM compile them into an interlinked markdown wiki, and let the AI maintain it. No vector databases. No chunking. No retrieval pipelines. Just structured markdown files that grow smarter over time.
His own research wiki now has about 100 articles and 400,000 words. He rarely touches it directly.
That's the part that breaks people's brains.
We're so used to AI as a tool we operate. Karpathy is treating it as a collaborator that owns the boring parts. The filing. The cross-referencing. The tedious maintenance that kills every human wiki project after the initial enthusiasm wears off.
Why RAG Feels Like Starting Over Every Time
I used to think RAG was the answer. You upload documents, the system chunks them, converts them to embeddings, and retrieves relevant bits when you ask questions. It works. But here's what actually happens.
Every query is a fresh start. The LLM rediscovers knowledge from scratch. Ask something subtle that requires synthesizing five documents, and it has to find and piece together fragments every single time. Nothing accumulates. No cross-references get built. No evolving synthesis emerges from repeated interaction.
NotebookLM works this way. ChatGPT file uploads work this way. Most RAG systems work this way. And they're fine for one-off questions. But they're terrible for deep research where you're reading papers, articles, and reports over weeks or months. The context keeps slipping away. The connections between ideas never get documented.
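To make that "fresh start" concrete, here is roughly what the per-query loop boils down to. This is a toy sketch, not any particular library's API: the word-overlap scoring stands in for real embeddings and vector search so the example runs on its own.

```python
# Toy version of the per-query RAG loop. Every question starts from zero:
# chunk, score, retrieve, answer, discard. The word-overlap "score" is a
# stand-in for real embeddings so this sketch runs with no vector database.

def chunk(text: str, size: int = 40) -> list[str]:
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(query: str, passage: str) -> float:
    """Crude relevance: fraction of query words that appear in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def retrieve(query: str, documents: list[str], k: int = 3) -> list[str]:
    """Rediscover relevant fragments from scratch for this one query."""
    chunks = [c for doc in documents for c in chunk(doc)]
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

docs = [
    "RAG systems chunk documents, embed the chunks, and retrieve the top "
    "matches for each query before handing them to the model.",
]
context = retrieve("how does retrieval work for a query", docs)
# `context` is passed to the LLM, used once, and thrown away.
# Nothing is written back; the next question repeats all of this.
print(context)
```

Notice what is missing: there is no step where anything learned from this query gets saved for the next one.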
Here's a question people always ask: why not just use a regular wiki? Write the notes yourself, organize them manually, build the connections over time. And the answer is simple. Humans abandon wikis because the maintenance burden grows faster than the value.
The tedious part isn't reading or thinking. It's the bookkeeping. Updating entity pages. Revising topic summaries. Noting contradictions between sources. Strengthening cross-references. This is exactly the kind of work AI is good at. The mechanical, structured, consistent stuff that requires patience humans don't have.
Karpathy's insight was to flip the architecture. Instead of retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki. When you add a source, the AI reads it, extracts key information, and integrates it into existing pages. Updates entity descriptions. Revises summaries. Flags contradictions. The knowledge is compiled once and kept current.
The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read. The wiki keeps getting richer with every source you add.
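Stripped of tooling, that compile step is a function from (new source, current wiki) to an updated wiki, and answering questions reads the compiled wiki rather than the raw documents. Here is a minimal sketch under that assumption, where `call_llm` and `parse_pages` are placeholders for whatever model API and response format you settle on:

```python
# Sketch of compile-once knowledge: integrate a source when it arrives,
# then answer questions from the compiled wiki instead of the raw documents.
# `call_llm` and `parse_pages` are placeholders, not a specific library's API.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in the model of your choice")

def integrate_source(source_text: str, wiki: dict[str, str]) -> dict[str, str]:
    """Fold a new source into the wiki: update entities, revise summaries,
    flag contradictions, strengthen cross-links."""
    pages = "\n\n".join(f"== {name} ==\n{body}" for name, body in wiki.items())
    revised = call_llm(
        "Integrate this new source into the wiki pages below. Update entity "
        "pages, revise summaries, flag contradictions, and add cross-links. "
        f"Return every page you changed.\n\nSOURCE:\n{source_text}\n\nWIKI:\n{pages}"
    )
    return parse_pages(revised, current=wiki)

def parse_pages(response: str, current: dict[str, str]) -> dict[str, str]:
    """Placeholder: a real parser splits the response back into .md pages."""
    return current

def ask(question: str, wiki: dict[str, str]) -> str:
    """Queries read the already-compiled wiki, not the raw sources."""
    pages = "\n\n".join(wiki.values())
    return call_llm(f"Using this wiki as working memory:\n{pages}\n\nQuestion: {question}")
```

The expensive synthesis happens once, at ingestion; queries become cheap reads against work that has already been done.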
What the Three-Layer Architecture Actually Looks Like
Most tutorials tell you to start with the tools. Install Obsidian. Set up Claude Code. Configure the folder structure. But what actually matters is understanding the three layers.
Raw sources are your curated collection of documents. Articles, papers, images, data files. These are immutable. The LLM reads from them but never modifies them. This is your source of truth.
The wiki is a directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely. It creates pages, updates them when sources change, and maintains all the links between them.
The conversation is where you interact with the system. You ask questions, explore connections, request updates. The LLM uses the wiki as its working memory while talking to you.
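On disk, the separation can be as simple as two folders side by side; the conversation layer lives in your chat session and leaves nothing behind. A minimal sketch of one possible layout, where the folder names are my assumption rather than a prescribed convention:

```python
from pathlib import Path

# One possible shape for the first two layers; the third layer is the chat
# session itself and has no files. Folder names are an assumption.
ROOT = Path("research-project")
SOURCES = ROOT / "sources"   # layer 1: immutable raw documents; read, never modified
WIKI = ROOT / "wiki"         # layer 2: markdown pages the LLM creates and maintains

for folder in (SOURCES, WIKI / "entities", WIKI / "concepts"):
    folder.mkdir(parents=True, exist_ok=True)

# e.g. sources/attention-paper.pdf stays untouched forever, while
# wiki/overview.md, wiki/entities/*.md, and wiki/concepts/*.md keep changing
# as the LLM folds new sources in.
```

The hard boundary between the first two folders is the whole contract: the LLM may read anything, but it only ever writes inside the wiki.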
In practice, Karpathy keeps his LLM agent open on one side and Obsidian on the other. The LLM makes edits based on their conversation. He browses results in real time, following links, checking the graph view, reading updated pages. Obsidian is the IDE. The LLM is the programmer. The wiki is the codebase.
The pattern works across domains. Personal tracking of goals and health. Research projects spanning months. Book reading with character and theme pages. Business wikis fed by Slack threads and meeting transcripts. Competitive analysis. Trip planning. Anything where knowledge accumulates over time.
The Community Response: Excitement and Skepticism
My coworker tried to explain why this changes everything. He said RAG is interpreted knowledge work. The LLM Wiki is compiled knowledge work. And that distinction matters more than people realize.
The response to Karpathy's idea file was immediate and intense. Multiple open-source implementations appeared within days. llmwiki.app launched as a free tool. GitHub repos with names like "karpathy-wiki" and "Karpathy-LLM-Wiki-Stack" started collecting stars. One developer built a Chrome extension that bookmarks tweets, articles, and YouTube videos into the wiki pipeline.
But the Hacker News thread also surfaced real concerns. The "kicking the can down the road" critique wasn't the only one. People worried about error accumulation. If the LLM summarizes a source, then summarizes its own summary, do errors compound? What about hallucinations getting baked into the wiki as facts?
Others noted that this only works at a certain scale. Below a few thousand documents, RAG might be simpler. Above a few million, you probably need vector databases anyway. The LLM Wiki sits in a middle zone. Rich enough to benefit from structure. Small enough to fit in context.
Some commenters pointed out that current generation models still have context limits. If your wiki grows beyond what fits in a single conversation, you're back to retrieval problems. Just retrieving from your wiki instead of raw sources.
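That ceiling is at least easy to watch for. Here is a back-of-the-envelope check using the rough four-characters-per-token heuristic; the `wiki/` path and the 200,000-token budget are assumptions, not fixed numbers:

```python
from pathlib import Path

# Rough check of whether the whole wiki still fits in one context window.
# ~4 characters per token is a coarse heuristic; the folder name and the
# token budget are assumptions, not fixed numbers.
WIKI = Path("wiki")
CONTEXT_BUDGET = 200_000

chars = sum(len(p.read_text(encoding="utf-8")) for p in WIKI.rglob("*.md"))
approx_tokens = chars // 4
print(f"~{approx_tokens:,} tokens across the wiki")
if approx_tokens > CONTEXT_BUDGET:
    print("Too big for a single pass; time to retrieve wiki pages selectively.")
```

Once that check starts failing, you are choosing between pruning the wiki and retrieving from it, which is exactly the critics' point.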
Why Naming Your Wiki After a Star Wars Planet Helps
I need to tell you about my friend who names every project after obscure sci-fi references. His knowledge base is called "Terminus." From Asimov's Foundation series. The remote planet where all human knowledge gets preserved.
He says it helps him remember what the thing is for. Not just a folder of notes. A living archive. A second brain that persists beyond any single session.
This is the part most technical discussions miss. The psychology of working with AI-maintained knowledge. When you name the thing, when you watch it grow, when you see connections form in the graph view, it becomes real. You start treating it with respect. You curate sources more carefully. You ask better questions.
Karpathy mentioned reading a book and building out pages for characters, themes, plot threads. By the end you have a rich companion wiki. Think of Tolkien Gateway, he said. Thousands of interlinked pages built by volunteers over years. You could build something like that personally as you read.
But here's what I think about. The volunteers who built Tolkien Gateway cared deeply about the material. They didn't just dump text and auto-generate summaries. They made judgments about what mattered. They wrote with voice and perspective.
An LLM Wiki without human curation is just a more organized dump. The magic happens when you engage with it. Challenge the summaries. Request different angles. Ask the LLM to explore connections you hadn't considered.
Most People Don't Need This
Let's be blunt. If you're doing casual research, RAG is fine. If you need answers from a few documents, just upload them and ask. The LLM Wiki pattern adds complexity that only pays off at scale.
This is overkill for small projects. If you're reading one book, take normal notes. If you're following a single research thread for a week, use whatever system you already have. The maintenance overhead of setting up the wiki pipeline only makes sense when you're accumulating knowledge across months or years.
And there's a bigger misconception. People think the LLM Wiki replaces human thinking. It doesn't. It replaces human filing. The tedious organization that makes most personal knowledge management systems fail. You still need to curate sources. You still need to ask good questions. You still need to verify the AI's work.
Karpathy's own wiki has 400,000 words he rarely touches directly. But he touches it constantly through conversation. The AI does the grunt work. He does the thinking.
Where This Goes
I still think about that Hacker News comment. "Isn't this just kicking the can down the road?"
The honest answer is maybe. If context windows keep growing, the difference between RAG and compiled knowledge might blur. If multimodal models get better at reasoning over raw documents, the wiki layer might become unnecessary.
But here's what I think. The insight isn't about technology. It's about division of labor. Humans are good at judgment, curiosity, asking the right questions. AI is good at patience, consistency, mechanical organization. The LLM Wiki pattern respects both.
Karpathy said something in his idea file that stuck with me. "The tedious part of maintaining a knowledge base is not the reading or the thinking. It's the bookkeeping."
That's the whole thing in one sentence. We're not building AI to replace our minds. We're building it to handle the parts of knowledge work that make us quit. The filing. The updating. The endless maintenance that turns enthusiasm into obligation.
The LLM Wiki won't kill RAG. They're different tools for different scales. But it might change how we think about AI collaboration. Less like a search engine. More like a research partner who never gets tired of organizing the notes.
Ask yourself this. What would you learn if you weren't worried about forgetting it?
