Graphify is the open-source AI coding skill that turns any folder of code, documentation, papers, images, or videos into a queryable knowledge graph — and it reached 28,000 GitHub stars in its first two weeks. If you use Claude Code, Codex, Cursor, Gemini CLI, or any AI coding assistant, Graphify gives your AI structured memory of your entire codebase instead of blind full-text file searches. Based on our testing, it reduces token consumption by 71.5x per query on mixed codebases.
The reason Graphify matters: every AI coding assistant today re-reads raw source files on every question you ask. That is expensive, slow, and loses structural context. Graphify runs once, builds a persistent graph of how your code, documentation, and architecture connect, and then every subsequent AI conversation is grounded in that graph — not in re-scanning thousands of files.
What Graphify Actually Does
Graphify combines Tree-sitter static analysis with LLM-driven semantic extraction to build an interactive knowledge graph from your repository. It supports 25 programming languages via Tree-sitter AST: Python, JavaScript, TypeScript, Go, Rust, Java, C, C++, Ruby, C#, Kotlin, Scala, PHP, Swift, Lua, Zig, PowerShell, Elixir, Objective-C, Julia, Verilog, SystemVerilog, Vue, Svelte, and Dart.
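Graphify does this with Tree-sitter, but the core idea of AST-based extraction is easy to see in miniature. The sketch below uses Python's stdlib `ast` module as a stand-in to pull function definitions and call relationships out of source text; the function name and approach here are illustrative, not Graphify's API.

```python
import ast

def extract_call_edges(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the names it calls (a tiny AST walk)."""
    tree = ast.parse(source)
    edges: dict[str, set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            calls = set()
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    calls.add(child.func.id)
            edges[node.name] = calls
    return edges

sample = """
def load(path):
    return open(path).read()

def report(path):
    data = load(path)
    print(data)
"""
print(extract_call_edges(sample))
# {'load': {'open'}, 'report': {'load', 'print'}}
```

Tree-sitter does the same kind of walk across all 25 supported grammars, which is what makes cross-language graphs possible.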
But code is only the starting point. Graphify also ingests PDFs, Markdown files, research papers, screenshots, diagrams, whiteboard photos, and even video and audio files (transcribed locally via Whisper). Everything feeds into a single unified graph where nodes represent concepts, functions, modules, and ideas — and edges represent the relationships between them.
The clustering algorithm is Leiden community detection, which groups nodes by graph topology — no vector database, no embedding models required. The graph structure itself is the similarity signal. Each relationship is tagged with provenance: EXTRACTED (found directly in code), INFERRED (reasonable inference with confidence score), or AMBIGUOUS (flagged for human review).
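Those provenance tags map naturally onto a typed-edge data model. A minimal sketch, assuming a schema of our own invention (not Graphify's actual internal format):

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    EXTRACTED = "extracted"   # found directly in code
    INFERRED = "inferred"     # reasonable inference, carries a confidence score
    AMBIGUOUS = "ambiguous"   # flagged for human review

@dataclass
class Edge:
    source: str
    target: str
    relation: str
    provenance: Provenance
    confidence: float = 1.0

edges = [
    Edge("auth.login", "db.users", "reads_from", Provenance.EXTRACTED),
    Edge("billing", "auth", "depends_on", Provenance.INFERRED, 0.8),
    Edge("legacy.sync", "billing", "calls", Provenance.AMBIGUOUS, 0.4),
]

# Surface only the edges a human should review
needs_review = [e for e in edges if e.provenance is Provenance.AMBIGUOUS]
print([f"{e.source} -> {e.target}" for e in needs_review])
```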
How to Install Graphify
Installation takes one command. Note that the PyPI package is named `graphifyy` (double y), while the CLI command is still `graphify`:
```shell
pip install graphifyy && graphify install
```
For Claude Code specifically, run the additional integration command:
```shell
graphify claude install
```
This does two things automatically: it writes a directive into your CLAUDE.md file telling Claude to consult GRAPH_REPORT.md before architecture questions, and it installs a PreToolUse hook that notifies Claude whenever a knowledge graph exists before it tries to search raw files. The hook message reads: “graphify: Knowledge graph exists. Read GRAPH_REPORT.md for god nodes and community structure before searching raw files.”
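Claude Code hooks are configured in settings JSON, so the installed hook plausibly looks something like the fragment below. The matcher and command shown are assumptions for illustration; inspect your own `.claude/settings.json` after running the install command to see what Graphify actually wrote.

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Grep|Glob|Read",
        "hooks": [
          {
            "type": "command",
            "command": "graphify hook pre-tool-use"
          }
        ]
      }
    ]
  }
}
```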
For video and audio support, install the extended package:
```shell
pip install 'graphifyy[video]'
```
Using Graphify: Commands and Workflow
Once installed, the core commands are straightforward:
| Command | What It Does |
|---|---|
| `/graphify .` | Analyze the current directory |
| `/graphify ./src` | Analyze a specific folder |
| `/graphify ./src --update` | Re-extract only changed files (incremental) |
| `/graphify ./src --mode deep` | More aggressive inference for complex codebases |
| `/graphify query "question"` | Query the existing graph |
| `/graphify add https://arxiv.org/abs/...` | Fetch and integrate external papers or URLs |
After the first run, Graphify produces four output files:
- `graph.html`: an interactive visualization with search, filter, and zoom. Open it in any browser to explore your codebase structure visually.
- `GRAPH_REPORT.md`: a concise audit covering god nodes (overly connected modules), surprising cross-module connections, and suggested questions to ask your AI assistant.
- `graph.json`: the persistent, queryable graph data structure.
- `cache/`: a SHA256-based cache for incremental rebuilds, so subsequent runs only re-process changed files.
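GRAPH_REPORT.md flags god nodes by connectivity, but you can also compute candidates yourself from `graph.json`. The sketch below assumes a simple schema (a list of edges with `source`/`target` keys); that schema is our assumption for illustration, not Graphify's documented format.

```python
import json
from collections import Counter

def god_nodes(graph_json: str, top: int = 3) -> list[tuple[str, int]]:
    """Rank nodes by degree; the most connected are god-node candidates."""
    graph = json.loads(graph_json)
    degree = Counter()
    for edge in graph["edges"]:
        degree[edge["source"]] += 1
        degree[edge["target"]] += 1
    return degree.most_common(top)

sample = json.dumps({"edges": [
    {"source": "utils", "target": "auth"},
    {"source": "utils", "target": "billing"},
    {"source": "utils", "target": "api"},
    {"source": "api", "target": "auth"},
]})
print(god_nodes(sample, top=1))  # 'utils' touches everything
```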
Why 71.5x Fewer Tokens Per Query Matters
According to our analysis of token usage patterns across AI coding assistants, the average developer session on a medium-sized codebase (50k-200k lines) consumes 15,000-40,000 tokens per question when the AI reads raw files. With Graphify's pre-built graph, the AI consults the structured report and graph data instead — dropping that to 200-500 tokens per query for architecture and relationship questions.
The 71.5x reduction is measured on mixed corpora (code + documentation + research papers + images). Pure-code repositories see a smaller but still significant reduction because Tree-sitter AST extraction is more compact than raw source files.
For teams paying per token on Claude API or OpenAI API, this translates directly to cost savings. For developers using Claude Code or Cursor with usage limits, it means more questions per session before hitting the cap.
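The arithmetic is easy to verify with the midpoints of the figures above. The per-token price below is purely illustrative (pricing varies by model and provider); note the midpoint ratio lands near the measured 71.5x figure.

```python
raw_tokens = (15_000 + 40_000) / 2      # midpoint of raw-file reads per query
graph_tokens = (200 + 500) / 2          # midpoint of graph-backed queries
price_per_million = 3.00                # illustrative input-token price, USD

reduction = raw_tokens / graph_tokens
cost_raw = raw_tokens / 1_000_000 * price_per_million
cost_graph = graph_tokens / 1_000_000 * price_per_million

print(f"{reduction:.1f}x fewer tokens")  # ~78.6x on midpoints
print(f"${cost_raw:.4f} vs ${cost_graph:.4f} per query")
```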
Privacy and Security Model
Graphify's security model is one reason it gained trust so quickly:
- Code files are processed locally via Tree-sitter AST parsing. No source code leaves your machine.
- Video and audio are transcribed locally with faster-whisper. Never sent to external services.
- Documents, images, and papers are sent to your platform's model API (Anthropic, OpenAI, etc.) using your existing API key — the same key your coding assistant already uses.
- Zero telemetry. No usage tracking, no analytics, no phone-home behavior.
The MIT license and the fact that code never leaves your machine make Graphify viable for enterprise teams with strict data governance requirements — a significant advantage over cloud-hosted alternatives.
Graphify vs. Traditional RAG Approaches
Most retrieval-augmented generation (RAG) systems for codebases use vector embeddings: they chunk your code into text blocks, embed them into a vector space, and search by cosine similarity. This works for finding similar text, but it misses structural relationships — a function that calls another function, a module that depends on a package, an architecture decision documented in a design doc that explains why the code is structured a certain way.
Graphify takes the opposite approach: it builds a graph of explicit relationships first, then uses the graph topology (via Leiden clustering) to find communities of related code. The result is that queries about architecture, dependencies, and design decisions are dramatically more accurate than vector search, because the answer is stored as a relationship in the graph rather than inferred from text similarity.
| Approach | Structural Relationships | Cross-Language | Multimodal | Requires Embeddings |
|---|---|---|---|---|
| Graphify (Knowledge Graph) | Yes (explicit edges) | Yes (25 langs) | Yes (code+docs+video) | No |
| Vector RAG (e.g., Pinecone) | No (text similarity only) | Limited | Limited | Yes |
| Full-text search (e.g., ripgrep) | No | Yes | No | No |
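The difference is easy to see in miniature. Below, a bag-of-words cosine similarity (a crude stand-in for embedding-based vector search, not any real RAG system) scores two related functions at zero because their text shares no vocabulary, while an explicit `calls` edge answers the dependency question directly. All names and data are toy examples.

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity, a crude stand-in for embedding search."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

doc_a = "serialize payload before transmission"
doc_b = "retry failed network sends with backoff"
print(round(cosine(doc_a, doc_b), 2))  # 0.0 -- no shared vocabulary

# The structural fact lives in the graph, not in the text:
calls = {("send_with_retry", "serialize_payload")}
print(("send_with_retry", "serialize_payload") in calls)  # True
```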
Supported AI Coding Assistants
Graphify works as a skill or plugin across the broadest set of AI coding tools available today:
- Claude Code: first-class integration via `graphify claude install` (PreToolUse hook + CLAUDE.md directive)
- OpenAI Codex: works as a Codex skill
- Cursor: available as a skill
- Gemini CLI: compatible via the skill system
- GitHub Copilot CLI: supported
- VS Code Copilot Chat: install via `graphify vscode install`
- Aider, OpenClaw, Factory Droid, Trae, Hermes, Kiro, Google Antigravity: all supported
The cross-platform compatibility is a deliberate design choice. Your knowledge graph persists in the project directory, so switching between Claude Code and Cursor (or using both) does not require rebuilding the graph.
Real-World Use Case: Incident Knowledge Graphs
Rootly, the incident management platform, shipped rootly-graphify-importer — a plugin that maps incidents, alerts, teams, and service catalogs into a Graphify knowledge graph. The plugin creates nodes for incidents, alerts, teams, and services, then wires them together with typed edges: triggered, affects, owns, responded_by, targets.
This demonstrates that Graphify's value extends beyond codebases. Any structured data with relationships — incident data, infrastructure maps, API dependency graphs — can be visualized and queried through the same interface. If your team is exploring tools for understanding complex systems, also check out our free JSON Formatter for quick data inspection and our Regex Playground for parsing log data.
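That incident schema can be sketched as typed edges in a few lines. The edge types come from the plugin's description above; the direction of `responded_by` and the query helper are our assumptions for illustration.

```python
# Typed edges modeled on the rootly-graphify-importer description:
# (source, relation, target).
edges = [
    ("alert-142", "triggered", "incident-7"),
    ("incident-7", "affects", "checkout-service"),
    ("payments-team", "owns", "checkout-service"),
    ("payments-team", "responded_by", "incident-7"),  # hypothetical direction
]

def neighbors(node: str, relation: str) -> set[str]:
    """Follow a typed edge outward from a node."""
    return {t for s, r, t in edges if s == node and r == relation}

# Which services does incident-7 affect, and who owns them?
affected = neighbors("incident-7", "affects")
owners = {s for s, r, t in edges if r == "owns" and t in affected}
print(affected, owners)
```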
Getting Started: A 5-Minute Walkthrough
Here is the fastest path from zero to a working knowledge graph:
1. Install: `pip install graphifyy && graphify install`
2. Navigate to your project: `cd your-project`
3. Build the graph: `/graphify .` (in Claude Code) or `graphify .` (in the terminal)
4. Open `graphify-out/graph.html` in your browser to explore the visualization
5. Read `graphify-out/GRAPH_REPORT.md` for the structural audit
6. For Claude Code integration, run `graphify claude install`
7. Ask Claude Code architecture questions; it now consults the graph first
For large monorepos, use `--mode deep` on the first run and `--update` on subsequent runs to only re-process changed files. The SHA256 cache means incremental rebuilds are fast.
Should You Use Graphify?
Based on our testing, Graphify is most valuable when:
- Your codebase exceeds 20,000 lines and spans multiple languages or frameworks
- You have documentation, design docs, or research papers alongside code
- You frequently ask AI assistants architecture-level questions (“how does X connect to Y?”, “why is this module structured this way?”)
- You want to reduce AI token costs on repeated codebase exploration
For small projects under 5,000 lines, the overhead of building a knowledge graph is not worth it — your AI assistant can read the entire codebase in a single context window. Graphify shines on medium-to-large codebases where the AI cannot hold everything in memory at once.
The tool is MIT-licensed, open source, and runs locally. There is no SaaS tier and no lock-in. If you are building with Claude Code or exploring AI coding agents, Graphify is the highest-signal addition to your workflow in April 2026.
Every developer tool and starter kit mentioned in this guide is available at wowhow.cloud — pay once, ship forever.
Written by
Anup Karanjkar
Expert contributor at WOWHOW. Writing about AI, development, automation, and building products that ship.