LLM Wiki - Building Persistent Knowledge Bases with LLMs
Originally from gist.github.com
Summary
Karpathy describes a pattern where an LLM incrementally builds and maintains a persistent, interlinked markdown wiki from raw sources - rather than re-deriving knowledge from scratch at every query (RAG style). The wiki is a compounding artifact: each new source, query, and lint pass makes it richer and more interconnected. The human curates sources and asks questions; the LLM does all the bookkeeping.
Key Insights
- Core distinction from RAG: In RAG, the LLM rediscovers knowledge every query. In the wiki pattern, knowledge is compiled once and kept current - cross-references already exist, contradictions already flagged, synthesis already reflects everything ingested.
- Three layers: Raw sources (immutable), the wiki (LLM-owned markdown files), and the schema (CLAUDE.md / AGENTS.md that governs how the LLM maintains the wiki). The schema is the key config - it makes the LLM a disciplined maintainer, not a generic chatbot.
- Three operations:
- Ingest: a single source can touch 10-15 wiki pages (summaries, entity pages, concept pages, index, log)
- Query: answers can be filed back as new wiki pages - explorations compound just like ingested sources
- Lint: periodic health check for contradictions, stale claims, orphan pages, missing cross-references
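The lint operation is mechanical enough to sketch in code. A minimal Python slice that catches two of the listed problems, orphan pages and broken cross-references, assuming Obsidian-style `[[wikilink]]` syntax (the file names and layout below are hypothetical, not from the original):

```python
import re
from pathlib import Path

WIKILINK = re.compile(r"\[\[([^\]|#]+)")  # Obsidian-style [[Page]] links

def lint(wiki_dir):
    """One slice of a lint pass: find orphan pages and broken links."""
    pages = {p.stem: p.read_text() for p in Path(wiki_dir).glob("*.md")}
    linked = {m.strip() for text in pages.values() for m in WIKILINK.findall(text)}
    # index/log are structural, so they are exempt from the orphan check
    orphans = sorted(p for p in pages if p not in linked and p not in ("index", "log"))
    broken = sorted(l for l in linked if l not in pages)
    return orphans, broken

# demo on a throwaway wiki
import tempfile
d = Path(tempfile.mkdtemp())
(d / "index.md").write_text("Catalog: [[scaling-laws]]")
(d / "scaling-laws.md").write_text("See also [[chinchilla]]")  # chinchilla.md missing
(d / "stray-note.md").write_text("Nobody links here")
orphans, broken = lint(d)
print(orphans)  # ['stray-note']
print(broken)   # ['chinchilla']
```

Contradiction and staleness checks would need the LLM itself; the point of the sketch is that the structural half of lint is cheap and scriptable.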
- index.md vs log.md: index.md is content-oriented (a catalog of all pages with summaries); log.md is an append-only chronological record, parseable with simple unix tools thanks to consistent prefixes like `## [2026-04-02] ingest | Title`
- Scaling the index: at ~100 sources / hundreds of pages, the index file works without embedding-based RAG. For larger wikis, qmd (local BM25/vector hybrid search with an MCP server) is recommended.
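The "parseable with simple unix tools" claim is easy to demonstrate. A sketch, assuming the `## [date] op | Title` prefix format above (log contents here are invented for illustration):

```shell
# build a throwaway log.md in the described format
cat > log.md <<'EOF'
## [2026-04-02] ingest | Scaling Laws Paper
## [2026-04-03] query | What limits context length?
## [2026-04-05] ingest | Chinchilla Follow-up
## [2026-04-06] lint | Full pass, 2 fixes
EOF

# count operations by type (3rd whitespace-separated field is the op)
awk '{print $3}' log.md | sort | uniq -c

# list everything ingested, titles only
grep '] ingest |' log.md | cut -d'|' -f2
```

Consistent prefixes are what make this work: no parser, no schema migration, just `awk`, `grep`, and `cut`.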
- Why it works: Humans abandon wikis because maintenance cost grows faster than value. LLMs don’t get bored, don’t forget cross-references, and can update 15 files in one pass.
- Real-world adaptation (Vibe Sensei): the pattern was implemented in a trading terminal with per-symbol wiki pages, dual compilation (LLM + template fallback), incremental compilation via `.compile-state.json`, and a compounding loop where query results are filed back as new wiki articles. Guardian alerts inject ~400 chars of per-symbol wiki context into every alert.
- Tooling stack mentioned: Obsidian (IDE for the wiki), Obsidian Web Clipper, Marp (slide decks from the wiki), Dataview (frontmatter queries), qmd (search engine), git (version history for free).
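Incremental compilation via a state file is a standard content-hashing trick. A minimal Python sketch of the idea, assuming sources sit in a flat directory of markdown files (the function name and file layout are my own, not Vibe Sensei's):

```python
import hashlib, json
from pathlib import Path

STATE_FILE = ".compile-state.json"

def stale_sources(src_dir):
    """Return sources whose content changed since the last compile pass."""
    src_dir = Path(src_dir)
    state_path = src_dir / STATE_FILE
    old = json.loads(state_path.read_text()) if state_path.exists() else {}
    new, stale = {}, []
    for src in sorted(src_dir.glob("*.md")):
        digest = hashlib.sha256(src.read_bytes()).hexdigest()
        new[src.name] = digest
        if old.get(src.name) != digest:
            stale.append(src.name)  # only these get re-ingested into the wiki
    state_path.write_text(json.dumps(new, indent=2))
    return stale

# demo: second pass sees nothing stale until a source changes
import tempfile
d = Path(tempfile.mkdtemp())
(d / "AAPL.md").write_text("earnings notes")
(d / "TSLA.md").write_text("delivery notes")
print(stale_sources(d))  # first pass: everything is new
print(stale_sources(d))  # nothing changed: []
(d / "AAPL.md").write_text("updated earnings notes")
print(stale_sources(d))  # only the changed symbol is recompiled
```

Hashing content rather than comparing mtimes makes the state robust to file copies and clock skew, at the cost of reading every source each pass.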