# Graphify: Knowledge Graph Skill for AI Coding Assistants

> Graphify builds a multi-modal knowledge graph (Tree-sitter + LLM extraction, Leiden communities) so AI coding assistants can grasp large codebases using ~71x fewer tokens.

Published: 2026-04-16
URL: https://daniliants.com/insights/graphify-knowledge-graph-skill-for-ai-coding-assistants/
Tags: knowledge-graph, claude-code, tree-sitter, code-intelligence, rag-alternative, networkx, leiden-algorithm

---

## Summary

Graphify is an open-source (MIT) skill that builds a multi-modal knowledge graph from code, docs, papers, and diagrams so AI coding assistants can understand large codebases without sending raw source to the model. It combines Tree-sitter static analysis with LLM-driven semantic extraction, builds a NetworkX graph, and runs Leiden community detection; no vector embeddings are required. The project reports a 71.5x token reduction on a mixed Karpathy corpus (~1.7k tokens per query vs. ~123k naive).
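The pipeline can be sketched in a few lines. Everything below is illustrative, not Graphify's actual code: the node names are hypothetical, and since Leiden itself typically requires python-igraph plus leidenalg, this sketch substitutes NetworkX's built-in Louvain method (Leiden's direct predecessor) to show community detection running on graph topology alone.

```python
import networkx as nx

# Toy call graph: nodes are functions, edges are call relationships
# (the kind of structure Tree-sitter extraction would produce).
edges = [
    ("auth.login", "auth.check_token"),
    ("auth.check_token", "db.get_user"),
    ("auth.logout", "auth.check_token"),
    ("api.create_post", "db.insert_post"),
    ("api.list_posts", "db.query_posts"),
    ("db.insert_post", "db.connect"),
    ("db.query_posts", "db.connect"),
    ("db.get_user", "db.connect"),
]
G = nx.Graph(edges)  # community detection runs on the undirected topology

# Stand-in for Leiden: Louvain communities from topology alone --
# no embedding model, no vector store.
communities = nx.community.louvain_communities(G, seed=42)
for i, members in enumerate(communities):
    print(f"community {i}: {sorted(members)}")
```

The point of the sketch: the partition comes purely from edge structure, which is why the approach sidesteps the embedding stack entirely.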

## Key Insight

- **Graphs beat vector RAG for code understanding.** Code has explicit structure (ASTs, call graphs, imports) that vectors flatten away. Graphify keeps that structure intact and layers semantic labels on top.
- **No embeddings, no vector store.** Leiden community detection runs on graph topology alone, sidestepping the whole embedding-model + vector-DB stack most RAG setups need.
- **Privacy-conscious by default.** Raw source never leaves the local machine. Only semantic descriptions (docstrings, concepts) go to the LLM, and it uses the model key the assistant already has configured.
- **Token compression holds at scale.** On a ~500k-word corpus, BFS subgraph queries stay around 2k tokens vs ~670k naive. That's the real value prop: query cost doesn't explode with codebase size.
- **"God nodes" and "surprise edges"** are the concrete analytical outputs: the highest-degree nodes in the graph (architectural keystones) plus unexpected cross-file/cross-domain connections (design smells or hidden coupling worth investigating).
- **Multi-modal means it reads diagrams too.** Vision models extract concepts from images and PDFs, so architecture diagrams and research papers get merged into the same graph as the code.
- **Distribution note:** PyPI package is `graphifyy` (double y), CLI is `graphify`. Easy to fat-finger.
- **Assistant integration is via slash commands** (`/graphify`, `/graphify query`, `/graphify path`, `/graphify explain`) with manifests for Claude Code, OpenAI Codex, OpenCode. Any assistant that can run shell commands can invoke it.
- **Outputs are portable artifacts:** `graph.html`, `graph.json`, `GRAPH_REPORT.md`. The graph persists, gets cached, and can be regenerated incrementally.
- **3.7k+ GitHub stars and an MIT license** signal real traction and commercial usability; the dependencies (NetworkX BSD, Tree-sitter MIT) are all permissive too.
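The bounded-query-cost claim from the token-compression bullet is easy to see on a toy graph: a k-hop BFS neighborhood depends only on local connectivity, not corpus size. A minimal sketch with a synthetic chain graph and hypothetical node names:

```python
import networkx as nx

# Synthetic dependency graph: a 5001-node chain standing in for a large corpus.
G = nx.Graph()
G.add_edges_from((f"mod{i}.fn", f"mod{i + 1}.fn") for i in range(5000))

# A "query" pulls only the 2-hop neighborhood around the entity of interest;
# its size is bounded by the local branching factor, not by total graph size.
sub = nx.ego_graph(G, "mod2500.fn", radius=2)
print(len(sub), "of", len(G), "nodes")  # 5 of 5001
```

Serializing that 5-node subgraph instead of the whole corpus is where the token reduction comes from.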
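The "god nodes" and "surprise edges" analyses reduce to simple graph queries: degree ranking for the former, module-boundary crossings for the latter. A minimal sketch, with hypothetical `module.function` node names rather than Graphify's actual report logic:

```python
import networkx as nx

# Toy dependency graph across four modules plus a shared utility.
edges = [
    ("utils.log", "auth.login"), ("utils.log", "db.query"),
    ("utils.log", "api.handler"), ("utils.log", "jobs.run"),
    ("auth.login", "db.query"), ("api.handler", "db.query"),
    ("jobs.run", "reports.render"),
    ("auth.login", "auth.check"),        # intra-module edge
    ("api.handler", "api.serialize"),    # intra-module edge
]
G = nx.Graph(edges)

# God nodes: architectural keystones, ranked by degree.
god_nodes = sorted(G.degree, key=lambda kv: kv[1], reverse=True)[:3]

# Surprise edges: endpoints in different modules (hidden-coupling candidates).
module = lambda name: name.split(".")[0]
surprise = [(u, v) for u, v in G.edges if module(u) != module(v)]

print("god nodes:", god_nodes)
print(f"cross-module edges: {len(surprise)} of {len(G.edges)}")
```

Here `utils.log` surfaces as the top god node, flagging it as a keystone worth reviewing before refactors.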