How We Built a Virtual Filesystem for Our Assistant

Curated April 5, 2026 · 1 min read

llm-agents, rag, vector-databases, chroma, filesystem-abstraction, documentation, self-hosting, infrastructure-cost

Summary

Mintlify replaced real sandbox containers for their docs AI assistant with a virtual filesystem (ChromaFs) that intercepts UNIX shell commands and translates them into queries against an existing Chroma vector database. This cut session creation from ~46 seconds (p90) to ~100 milliseconds and brought the marginal compute cost per conversation to roughly zero. The key insight is that agents only need the illusion of a filesystem, not a real one.

Key Insights

RAG limitation: top-K chunk retrieval breaks when answers span multiple pages or require exact syntax. Filesystem exploration (grep, cat, ls, find) is a better interface for agents navigating structured docs.
Cost of real sandboxes: at 850k conversations/month, even minimal micro-VMs (1 vCPU, 2 GiB, 5-min sessions) would cost >$70k/year. The slow boot is a separate problem: a sandbox is not viable for a frontend assistant where users are watching a loading spinner.
ChromaFs design: built on Vercel Labs’ just-bash (a TypeScript reimplementation of bash with a pluggable IFileSystem interface). ChromaFs implements that interface by translating filesystem calls into Chroma queries against the existing docs database.
Boot time: ~46s (sandbox) vs ~100ms (ChromaFs). Marginal cost per conversation: ~$0.0137 (sandbox) vs ~$0 (reuses existing DB).
Directory tree: stored as gzipped JSON in Chroma itself (__path_tree__). Deserialized into memory on init; ls/cd/find resolve with zero network calls. Tree is cached across sessions for the same site.
Access control: isPublic and groups fields in the path tree allow per-user RBAC with a few lines of filtering - no Linux user groups or per-tenant container images needed.
Grep optimization: intercepts grep -r, parses flags with yargs-parser, translates to Chroma $contains/$regex coarse filter, bulk-prefetches matching chunks into Redis, then hands back to just-bash for in-memory fine filtering. Result: large recursive greps complete in milliseconds.
Lazy file pointers: large files (e.g. OpenAPI specs from S3) appear in ls but content only fetches on cat. Keeps memory overhead low.
Read-only by design: all writes throw EROFS. Stateless, no session cleanup, no cross-agent corruption risk.
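
The directory tree and access control notes above can be illustrated with a minimal sketch. It assumes the deserialized path tree is a flat array of entries; only the isPublic and groups field names come from the notes, while PathEntry, User, visibleEntries, and ls are hypothetical names chosen for this example and are not ChromaFs or just-bash APIs.

```typescript
// Hypothetical shape of one path-tree entry. Only `isPublic` and `groups`
// are named in the notes; everything else here is an assumption.
interface PathEntry {
  path: string; // absolute path, e.g. "/guides/auth.mdx"
  isPublic: boolean;
  groups: string[]; // groups allowed to see a non-public entry
}

interface User {
  groups: string[];
}

// Access control as a plain filter: keep public entries plus entries that
// share at least one group with the user.
function visibleEntries(tree: PathEntry[], user: User): PathEntry[] {
  return tree.filter(
    (e) => e.isPublic || e.groups.some((g) => user.groups.includes(g)),
  );
}

// List the immediate children of `dir` using only the in-memory tree,
// so no network call is needed. Subdirectories get a trailing "/".
function ls(tree: PathEntry[], dir: string): string[] {
  const prefix = dir.endsWith("/") ? dir : dir + "/";
  const names = new Set<string>();
  for (const e of tree) {
    if (!e.path.startsWith(prefix)) continue;
    const rest = e.path.slice(prefix.length);
    const slash = rest.indexOf("/");
    names.add(slash === -1 ? rest : rest.slice(0, slash + 1));
  }
  return Array.from(names).sort();
}
```

Filtering the tree once per user and then running ls against the filtered copy means a hidden page never shows up in any listing, which is one plausible reading of "a few lines of filtering".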
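
The grep optimization above is a two-stage filter: a cheap coarse pass narrows the candidate chunks, then an exact pass runs over that small set. The sketch below shows only that shape. The coarse stage is a plain in-memory substring scan standing in for the database-side filter, there is no Chroma or Redis call in it, and Chunk, coarseFilter, fineGrep, and grepRecursive are hypothetical names. It also assumes the caller supplies a literal substring that every regex match must contain.

```typescript
// A stored chunk of a docs page. Field names are assumptions.
interface Chunk {
  path: string;
  text: string;
}

// Stage 1 (coarse): stand-in for a database-side substring filter.
// It may return false positives but must not drop a chunk that matches.
function coarseFilter(store: Chunk[], literal: string): Chunk[] {
  return store.filter((c) => c.text.includes(literal));
}

// Stage 2 (fine): exact line-level regex matching over the candidates only.
// Output lines follow the `path:lineNumber:line` shape of `grep -rn`.
function fineGrep(chunks: Chunk[], pattern: RegExp): string[] {
  // Drop the g/y flags so RegExp.test() carries no state between lines.
  const re = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  const hits: string[] = [];
  for (const c of chunks) {
    c.text.split("\n").forEach((line, i) => {
      if (re.test(line)) hits.push(`${c.path}:${i + 1}:${line}`);
    });
  }
  return hits;
}

// Coarse pass first, exact pass second.
function grepRecursive(
  store: Chunk[],
  literal: string,
  pattern: RegExp,
): string[] {
  return fineGrep(coarseFilter(store, literal), pattern);
}
```

The design point is that the expensive exact matching only ever sees chunks that already passed the coarse filter, so its cost scales with the number of candidates rather than the size of the docs set.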
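
The last two notes, lazy file pointers and read-only semantics, can be sketched together. This is an illustration under stated assumptions, not the ChromaFs implementation: FsError, LazyFile, and ReadOnlyFs are hypothetical names, the real IFileSystem interface in just-bash is not reproduced here, and the loader is synchronous to keep the example short where a real one (for instance an S3 fetch) would be async.

```typescript
// Error carrying a POSIX-style code such as "EROFS" or "ENOENT".
class FsError extends Error {
  code: string;
  constructor(code: string, message: string) {
    super(`${code}: ${message}`);
    this.code = code;
  }
}

// A file known by metadata only; `load` is a hypothetical content loader.
interface LazyFile {
  size: number;
  load: () => string;
}

class ReadOnlyFs {
  private files: Map<string, LazyFile>;
  private cache = new Map<string, string>();

  constructor(files: Map<string, LazyFile>) {
    this.files = files;
  }

  // Listing touches metadata only, so no content is loaded.
  list(): string[] {
    return Array.from(this.files.keys()).sort();
  }

  // Content is loaded on first read and cached for later reads.
  readFile(path: string): string {
    const cached = this.cache.get(path);
    if (cached !== undefined) return cached;
    const file = this.files.get(path);
    if (!file) throw new FsError("ENOENT", `no such file: ${path}`);
    const content = file.load();
    this.cache.set(path, content);
    return content;
  }

  // Every write is rejected, so there is no session state to clean up.
  writeFile(path: string, _data: string): never {
    throw new FsError("EROFS", `read-only file system: ${path}`);
  }
}
```

Because writes can never succeed, an instance holds nothing but cached reads, which is what makes it safe to share or discard without cleanup.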