Recursive Language Models
RAG is a filing cabinet. An RLM is a reader. Where RAG pre-fetches and dumps context, a Recursive Language Model loops — reading, deciding what else to read, then reading again — until it has enough to answer.
The Problem with RAG
Retrieval-Augmented Generation has become the default pattern for giving LLMs access to external knowledge. You chunk documents, embed them, store vectors, retrieve the top-k at query time, and stuff them into the context window. It works. But it has a fundamental flaw: the model has no agency over what it reads.
The RAG bottleneck: All retrieval decisions happen before the model reasons. The model can't ask follow-up questions of the knowledge base mid-thought.
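To make the bottleneck concrete, here is a minimal sketch of the one-shot pipeline described above. It is a toy under stated assumptions: `embed` is a hypothetical word-count stand-in for a real embedding model, and `buildIndex` and `buildPrompt` are illustrative names, not part of any library.

```typescript
// Toy one-shot RAG pipeline. embed() is a hypothetical stand-in for a real
// embedding model: it counts words into a sparse vector.
type Vec = Map<string, number>;
interface Chunk { id: string; text: string; vec: Vec }

function embed(text: string): Vec {
  const vec: Vec = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vec.set(word, (vec.get(word) ?? 0) + 1);
  }
  return vec;
}

function cosine(a: Vec, b: Vec): number {
  let dot = 0;
  a.forEach((weight, word) => { dot += weight * (b.get(word) ?? 0); });
  const norm = (v: Vec) =>
    Math.sqrt(Array.from(v.values()).reduce((sum, x) => sum + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Index time: embed each chunk once and keep the vectors.
function buildIndex(docs: Record<string, string>): Chunk[] {
  return Object.keys(docs).map((id) => ({ id, text: docs[id], vec: embed(docs[id]) }));
}

// Query time: pick the top-k chunks and build the prompt. Everything the
// model will see is fixed here, before it has reasoned about anything.
function buildPrompt(index: Chunk[], query: string, k: number): string {
  const q = embed(query);
  const topK = [...index]
    .sort((a, b) => cosine(q, b.vec) - cosine(q, a.vec))
    .slice(0, k);
  const context = topK.map((c) => `[${c.id}] ${c.text}`).join("\n");
  return `Context:\n${context}\n\nQuestion: ${query}`;
}

const index = buildIndex({
  a: "Durable Objects provide stateful sessions at the edge",
  b: "Vector stores hold embeddings for retrieval",
  c: "HTML links form a navigable graph",
});
const ragPrompt = buildPrompt(index, "how are embeddings stored for retrieval", 1);
```

Note that `buildPrompt` runs to completion before any model call happens. If chunk `b` turns out to point at something in chunk `a`, the model has no way to go and get it.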
What is an RLM?
A Recursive Language Model interleaves reading and reasoning in a loop. It takes a query, reasons about what it needs, reads, evaluates whether it has enough, and then either loops again or answers.
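The loop can be sketched in a few lines. This is an illustration, not a definitive implementation: `Policy` is a hypothetical stand-in for the LLM call, and the scripted policy below only exists to make the example runnable without a model.

```typescript
// Sketch of the loop. `Policy` is a hypothetical stand-in for the LLM call:
// given the query and the notes read so far, it either asks for another
// chunk or returns an answer.
type Step =
  | { kind: "read"; chunkId: string }
  | { kind: "answer"; text: string };
type Policy = (query: string, notes: string[]) => Step;

function runLoop(
  query: string,
  chunks: Record<string, string>,
  decide: Policy,
  maxDepth: number,
): { answer: string; steps: number } {
  const notes: string[] = [];
  for (let step = 0; step < maxDepth; step++) {
    const next = decide(query, notes); // reason and evaluate
    if (next.kind === "answer") {
      return { answer: next.text, steps: step }; // enough context: answer
    }
    notes.push(chunks[next.chunkId] ?? "(missing chunk)"); // read, then loop
  }
  // Depth budget exhausted: report what was gathered.
  return { answer: `Partial answer from ${notes.length} chunks`, steps: maxDepth };
}

// Scripted policy for demonstration: read the intro, follow the pointer it
// contains, then answer. A real policy would be a model call.
const chunks: Record<string, string> = {
  intro: "Pricing is described in chunk 'pricing'.",
  pricing: "The plan costs 5 dollars per month.",
};
const scripted: Policy = (_query, notes) => {
  if (notes.length === 0) return { kind: "read", chunkId: "intro" };
  if (notes.length === 1) return { kind: "read", chunkId: "pricing" };
  return { kind: "answer", text: notes[notes.length - 1] };
};

const result = runLoop("What does the plan cost?", chunks, scripted, 5);
```

The second read is the step a one-shot retriever cannot take: it only becomes necessary once the first chunk has been read. The `maxDepth` cap keeps the loop from running forever when the policy never decides it has enough.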
RLMs on Cloudflare Durable Objects
A Durable Object is the REPL environment. Each DO instance is a stateful session with its own storage, its own LLM loop, and its own inverted keyword index. No external vector store needed.
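import { DurableObject } from 'cloudflare:workers';
import { generateText, tool, type LanguageModel } from 'ai';
import { z } from 'zod';
// Assumes AI SDK 4.x (tool(), maxSteps) and string chunk ids; keywordSearch, readChunk, config live elsewhere in the class.
class RecursiveMemory extends DurableObject {
  async runLoop(query: string, model: LanguageModel): Promise<string> {
    // Tools the model can call mid-reasoning to fetch more context.
    const tools = {
      search: tool({
        description: 'Keyword search over the inverted index',
        parameters: z.object({ q: z.string() }),
        execute: async ({ q }) => this.keywordSearch(q),
      }),
      read: tool({
        description: 'Read one chunk by id',
        parameters: z.object({ id: z.string() }),
        execute: async ({ id }) => this.readChunk(id),
      }),
    };
    const { text } = await generateText({
      model, prompt: query, tools,
      maxSteps: this.config.maxDepth, // cap on tool-call rounds
    });
    return text;
  }
}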
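The `keywordSearch` and `readChunk` methods are not shown in the snippet above. One way they might work, as a sketch only: the `KeywordIndex` class below is a hypothetical in-memory stand-in, and a real Durable Object would presumably keep the same structures in its own storage so they survive between requests.

```typescript
// Hypothetical in-memory inverted keyword index. Each term maps to the set
// of chunk ids that contain it.
class KeywordIndex {
  private postings = new Map<string, Set<string>>(); // term -> chunk ids
  private chunks = new Map<string, string>();        // chunk id -> text

  private tokenize(text: string): string[] {
    return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  }

  add(id: string, text: string): void {
    this.chunks.set(id, text);
    for (const term of this.tokenize(text)) {
      let ids = this.postings.get(term);
      if (!ids) {
        ids = new Set<string>();
        this.postings.set(term, ids);
      }
      ids.add(id);
    }
  }

  // Rank chunk ids by how many distinct query terms they contain;
  // ties are broken by id so the order is deterministic.
  keywordSearch(query: string): string[] {
    const hits = new Map<string, number>();
    const terms = Array.from(new Set(this.tokenize(query)));
    for (const term of terms) {
      const ids = this.postings.get(term);
      if (!ids) continue;
      ids.forEach((id) => hits.set(id, (hits.get(id) ?? 0) + 1));
    }
    return Array.from(hits.entries())
      .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
      .map(([id]) => id);
  }

  readChunk(id: string): string | undefined {
    return this.chunks.get(id);
  }
}

const idx = new KeywordIndex();
idx.add("c1", "Durable Objects keep state next to compute");
idx.add("c2", "Inverted indexes map terms to documents");
idx.add("c3", "Durable storage and inverted indexes at the edge");
const ranked = idx.keywordSearch("durable inverted indexes");
```

Term-overlap ranking is crude next to embeddings, but the loop can compensate: if the first search misses, the model can rephrase and search again.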
RLM vs RAG
| Dimension | RAG | RLM |
|---|---|---|
| Retrieval timing | Pre-query, one-shot | Mid-reasoning, iterative |
| Model agency | None | Full — model decides what to read |
| Latency | Low | Higher (multiple steps) |
| Best for | Simple Q&A | Complex reasoning, multi-hop |
The Hypermedia Connection
There's a reason html.surf is built the way it is. HTML pages linked by <a href> tags are the natural substrate for RLM navigation. An agent reading this article can follow the link to Free The Web, decide it needs more context, and follow another link — exactly the way a human researcher would. The link graph is the retrieval index.
This is the thesis of Free The Web: HTML is readable by humans and machines alike. An RLM navigating html.surf pages is indistinguishable from a human doing research. That's the goal.
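To illustrate the link graph acting as the retrieval index, here is a small sketch. Everything in it is assumed for the example: the `site` map stands in for fetched pages, the paths are hypothetical, and the `wanted` predicate stands in for the model's own judgment about whether a page is enough.

```typescript
// Extract href targets from anchor tags. A regex is enough for this sketch;
// a real crawler should use an HTML parser.
function extractLinks(html: string): string[] {
  const links: string[] = [];
  const anchor = /<a\s[^>]*href="([^"]*)"/gi;
  let match: RegExpExecArray | null;
  while ((match = anchor.exec(html)) !== null) links.push(match[1]);
  return links;
}

// Hypothetical in-memory site standing in for fetched pages.
const site: Record<string, string> = {
  "/rlm": '<p>See <a href="/free-the-web">Free The Web</a>.</p>',
  "/free-the-web":
    '<p>Then <a href="/why-html-wins">Why HTML Wins</a> and <a href="/rlm">back</a>.</p>',
  "/why-html-wins": "<p>Hypermedia is the engine.</p>",
};

// Follow links breadth-first until `wanted` accepts a page or the page
// budget runs out. Returns the trail of pages read, in order.
function navigate(
  start: string,
  wanted: (html: string) => boolean,
  maxPages: number,
): string[] {
  const visited: string[] = [];
  const queue: string[] = [start];
  while (queue.length > 0 && visited.length < maxPages) {
    const path = queue.shift() as string;
    if (visited.indexOf(path) !== -1 || !(path in site)) continue;
    visited.push(path);
    if (wanted(site[path])) break;
    queue.push(...extractLinks(site[path]));
  }
  return visited;
}

const trail = navigate("/rlm", (html) => html.includes("Hypermedia"), 10);
```

No embeddings or index were built here. The only retrieval structure is the set of `<a href>` tags already present in the pages, and the back-link to `/rlm` is skipped because that page was already read.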
Further Reading
- Free The Web — five principles for the next web
- Why HTML Wins — the technical case for hypermedia
- The Surfable Web — knowledge graphs as websites
- Edge-First Architecture — where this all runs