How We Built a Virtual Filesystem for Our Assistant
Originally from mintlify.com
My notes
Summary
Mintlify replaced real sandbox containers for their docs AI assistant with a virtual filesystem (ChromaFs) that intercepts UNIX shell commands and translates them into queries against an existing Chroma vector database. This dropped session creation from ~46 seconds p90 to ~100 milliseconds and eliminated per-conversation compute costs entirely. The key insight is that agents only need the illusion of a filesystem, not a real one.
Key Insight
- RAG limitation: top-K chunk retrieval breaks when answers span multiple pages or require exact syntax. Filesystem exploration (grep, cat, ls, find) is a better interface for agents navigating structured docs.
- Cost of real sandboxes: at 850k conversations/month, even minimal micro-VMs (1 vCPU, 2 GiB, 5-min sessions) would cost >$70k/year. Not viable for a frontend assistant where users see a loading spinner.
- ChromaFs design: built on Vercel Labs’ `just-bash` (TypeScript bash reimplementation with pluggable `IFileSystem` interface). ChromaFs implements that interface by translating every syscall into a Chroma query against the existing docs database.
- Boot time: ~46s (sandbox) vs ~100ms (ChromaFs). Marginal cost per conversation: ~$0.0137 (sandbox) vs ~$0 (reuses existing DB).
- Directory tree: stored as gzipped JSON in Chroma itself (`__path_tree__`). Deserialized into memory on init; `ls`/`cd`/`find` resolve with zero network calls. Tree is cached across sessions for the same site.
- Access control: `isPublic` and `groups` fields in the path tree allow per-user RBAC with a few lines of filtering - no Linux user groups or per-tenant container images needed.
- Grep optimization: intercepts `grep -r`, parses flags with `yargs-parser`, translates to Chroma `$contains`/`$regex` coarse filter, bulk-prefetches matching chunks into Redis, then hands back to just-bash for in-memory fine filtering. Result: large recursive greps complete in milliseconds.
- Lazy file pointers: large files (e.g. OpenAPI specs from S3) appear in `ls` but content only fetches on `cat`. Keeps memory overhead low.
- Read-only by design: all writes throw `EROFS`. Stateless, no session cleanup, no cross-agent corruption risk.
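To make the directory-tree and access-control points concrete, here is a minimal sketch of an in-memory path tree filtered per user. Only the `isPublic` and `groups` field names come from the notes; the flat path-to-node layout, the `visibleTree` and `ls` helpers, and the sample paths are assumptions for illustration, and the gzip step is left out so the example stays self-contained.

```typescript
// Hypothetical shape of one path-tree entry. Only `isPublic` and `groups`
// are named in the notes; the flat path -> node layout is an assumption.
interface PathNode {
  isPublic: boolean;
  groups: string[];
}

type PathTree = Record<string, PathNode>;

// Stand-in for the JSON that would be decompressed and parsed once at init.
const treeJson = JSON.stringify({
  "/docs/quickstart.mdx": { isPublic: true, groups: [] },
  "/docs/api/auth.mdx": { isPublic: true, groups: [] },
  "/docs/internal/runbook.mdx": { isPublic: false, groups: ["staff"] },
});

const tree: PathTree = JSON.parse(treeJson);

// Keep only the paths this user may see: public ones, plus any whose
// `groups` list overlaps the user's groups.
function visibleTree(full: PathTree, userGroups: string[]): PathTree {
  const out: PathTree = {};
  for (const path of Object.keys(full)) {
    const node = full[path];
    const allowed =
      node.isPublic || node.groups.some((g) => userGroups.includes(g));
    if (allowed) out[path] = node;
  }
  return out;
}

// List the immediate children of a directory from the in-memory tree only,
// so no network call is involved.
function ls(t: PathTree, dir: string): string[] {
  const prefix = dir.endsWith("/") ? dir : dir + "/";
  const names = new Set<string>();
  for (const path of Object.keys(t)) {
    if (!path.startsWith(prefix)) continue;
    const rest = path.slice(prefix.length);
    const slash = rest.indexOf("/");
    names.add(slash === -1 ? rest : rest.slice(0, slash) + "/");
  }
  return Array.from(names).sort();
}

const anonymousView = ls(visibleTree(tree, []), "/docs");
const staffView = ls(visibleTree(tree, ["staff"]), "/docs");
```

Because the filter runs before listing, a user outside `staff` never sees `internal/` at all, which is the effect the notes attribute to the path-tree approach.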
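The grep optimization is a two-stage filter, which can be sketched roughly as follows. This is not ChromaFs code: an in-memory array stands in for Chroma, a `Map` stands in for Redis, a tiny hand-written parser replaces `yargs-parser`, and all function names (`parseGrep`, `coarseFilter`, `fineFilter`) are hypothetical.

```typescript
interface Chunk {
  path: string;
  text: string;
}

// In-memory stand-ins: `store` plays the role of the Chroma collection,
// `cache` the role of the Redis prefetch cache.
const store: Chunk[] = [
  { path: "/docs/auth.mdx", text: "Set the API_KEY header.\nKeys rotate every 90 days." },
  { path: "/docs/quickstart.mdx", text: "Install the CLI.\nRun the init command." },
  { path: "/docs/errors.mdx", text: "A missing api_key returns 401.\nRetry after a backoff." },
];
const cache = new Map<string, string>();

interface GrepArgs {
  pattern: string;
  ignoreCase: boolean;
}

// Minimal parser for `-r [-i] PATTERN [PATH]`; only the -i flag changes behavior.
function parseGrep(argv: string[]): GrepArgs {
  let ignoreCase = false;
  const positional: string[] = [];
  for (const arg of argv) {
    if (arg.startsWith("-")) {
      if (/^-[a-zA-Z]*i/.test(arg)) ignoreCase = true;
    } else {
      positional.push(arg);
    }
  }
  return { pattern: positional[0] ?? "", ignoreCase };
}

// Coarse stage: select chunks whose full text matches (the role a
// database-side filter would play) and prefetch them into the cache.
function coarseFilter(a: GrepArgs): string[] {
  const re = new RegExp(a.pattern, a.ignoreCase ? "i" : "");
  const hits = store.filter((c) => re.test(c.text));
  for (const c of hits) cache.set(c.path, c.text);
  return hits.map((c) => c.path);
}

// Fine stage: line-by-line matching over cached text only.
function fineFilter(paths: string[], a: GrepArgs): string[] {
  const re = new RegExp(a.pattern, a.ignoreCase ? "i" : "");
  const out: string[] = [];
  for (const path of paths) {
    const text = cache.get(path) ?? "";
    for (const line of text.split("\n")) {
      if (re.test(line)) out.push(`${path}:${line}`);
    }
  }
  return out;
}

const grepArgs = parseGrep(["-r", "-i", "api_key", "/docs"]);
const candidates = coarseFilter(grepArgs);
const matches = fineFilter(candidates, grepArgs);
```

The coarse stage only has to return a superset of the true matches. Exact line-level semantics are left to the fine stage, which never touches chunks the coarse filter ruled out.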
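Lazy file pointers and the read-only guarantee can be illustrated together. The sketch below assumes a hypothetical `LazyReadOnlyFs` class with synchronous loaders. A real fetch from S3 would be asynchronous, and the actual `IFileSystem` interface in just-bash may look quite different.

```typescript
// Error carrying a POSIX-style code such as EROFS or ENOENT.
class FsError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(`${code}: ${message}`);
    this.code = code;
  }
}

// A loader produces file content on demand. It is synchronous here for
// brevity; a remote fetch would return a Promise instead.
type Loader = () => string;

class LazyReadOnlyFs {
  private loaders = new Map<string, Loader>();
  private loaded = new Map<string, string>();
  fetchCount = 0;

  // Register a file by path; its content is not fetched yet.
  register(path: string, loader: Loader): void {
    this.loaders.set(path, loader);
  }

  // Listing needs only the registered paths, never the content.
  ls(): string[] {
    return Array.from(this.loaders.keys()).sort();
  }

  // Content is fetched on first read and reused afterwards.
  cat(path: string): string {
    const cached = this.loaded.get(path);
    if (cached !== undefined) return cached;
    const loader = this.loaders.get(path);
    if (loader === undefined) throw new FsError("ENOENT", `no such file: ${path}`);
    this.fetchCount += 1;
    const content = loader();
    this.loaded.set(path, content);
    return content;
  }

  // Every write is rejected, so sessions cannot leave state behind.
  writeFile(path: string, _data: string): never {
    throw new FsError("EROFS", `read-only file system: ${path}`);
  }
}

const vfs = new LazyReadOnlyFs();
vfs.register("/specs/openapi.json", () => '{"openapi":"3.1.0"}');

const listing = vfs.ls();
const fetchesAfterLs = vfs.fetchCount; // listing alone triggers no fetch

const body = vfs.cat("/specs/openapi.json");
vfs.cat("/specs/openapi.json"); // second read is served from memory
const fetchesAfterCat = vfs.fetchCount;

let writeCode = "";
try {
  vfs.writeFile("/specs/openapi.json", "x");
} catch (err) {
  writeCode = (err as FsError).code;
}
```

Rejecting writes at the filesystem layer is what makes the design stateless: with nothing to mutate, there is nothing to clean up when a session ends.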