
Why your AI coding agent burns tokens — and how a code map fixes it

Your agent reads the whole repository to answer one question, spending tokens on files that never mattered. The cause isn't the model — it's that the agent has no map. Give it one, and discovery-heavy tasks get 30–90× leaner.

The hidden cost: agents read the whole repo to answer one question

An AI coding agent — Claude Code, Cursor, or any other — can only reason about code it has loaded into context. To answer “where does authentication happen?” it first has to find the auth code, and without a map of the codebase it does that the only way it can: it lists directories, greps for keywords, and opens files until it has seen enough. A typical 500-line file is 3,000–5,000 tokens, and exploring an unfamiliar repository routinely burns 50,000+ tokens before any real work begins.

It gets worse over a session. Most agents re-send the entire transcript, including every file already opened, on every turn. A file read on turn 1 is therefore paid for again on turns 2, 3, and onward, so token usage grows with the size of the codebase and the length of the conversation, not with the difficulty of the task. That is how a single afternoon of agentic coding turns into a surprising bill.
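The compounding effect described above can be made concrete with a little arithmetic. The sketch below assumes a simplified billing model in which the full transcript is sent as input on every turn and nothing is cached or trimmed; the function name and the session figures are hypothetical.

```python
def cumulative_input_tokens(tokens_added_per_turn):
    """Total input tokens sent when the full transcript is re-sent every turn."""
    total = 0
    context = 0
    for added in tokens_added_per_turn:
        context += added  # the transcript grows by what this turn added
        total += context  # the whole transcript goes out again as input
    return total


# Hypothetical session: 10 turns, each opening one file of about 4,000 tokens.
turns = [4_000] * 10
print(sum(turns))                      # 40000 tokens of unique content
print(cumulative_input_tokens(turns))  # 220000 tokens actually sent
```

Under these assumptions, 40,000 tokens of unique file content turn into 220,000 input tokens, because the first file is sent ten times, the second nine times, and so on.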

Why a bigger context window isn't the fix

The instinct is to throw more context at the problem. But a larger window only lets the agent hold more files: it still has to discover which files matter, and packing the window with marginally relevant code tends to degrade answer quality while costing more per turn. The lever that actually works is the opposite one: send less, more precisely. The right neighborhood of code beats the whole repository.

A code map gives the agent the right neighborhood, not the whole repo

A code map — also called a code knowledge graph — is a pre-computed index of every file, symbol, import, and call in your project. Instead of crawling, the agent queries it: “what does this task touch?” returns a focused, ranked slice of code with exact path:line anchors. The agent reads only those slices. Discovery stops being a token sink and becomes a lookup.
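To illustrate the idea of querying an index instead of crawling, here is a minimal sketch of a code map as a call graph with path:line anchors. It is not OpenVisio's data model or API: the `CODE_MAP` contents, the `neighborhood` function, and every path in it are hypothetical, and ranking is omitted for brevity.

```python
from collections import deque

# Hypothetical pre-computed index: symbol -> (path:line anchor, symbols it calls).
CODE_MAP = {
    "login_view": ("app/views.py:12", ["authenticate", "render"]),
    "authenticate": ("auth/service.py:30", ["verify_password", "load_user"]),
    "verify_password": ("auth/crypto.py:8", []),
    "load_user": ("db/users.py:44", []),
    "render": ("web/templates.py:5", []),
    "export_report": ("reports/export.py:19", ["render"]),
}


def neighborhood(start, max_hops=2):
    """Return (symbol, anchor) pairs within max_hops calls of start, breadth-first."""
    seen = {start}
    queue = deque([(start, 0)])
    result = []
    while queue:
        symbol, hops = queue.popleft()
        anchor, callees = CODE_MAP[symbol]
        result.append((symbol, anchor))
        if hops < max_hops:
            for callee in callees:
                if callee in CODE_MAP and callee not in seen:
                    seen.add(callee)
                    queue.append((callee, hops + 1))
    return result


# The agent asks for the code around authentication and gets three anchors
# to read, rather than opening every file in the repository.
for symbol, anchor in neighborhood("authenticate", max_hops=1):
    print(f"{anchor}  {symbol}")
```

The lookup touches only the index. Unrelated code such as `export_report` never enters the agent's context.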

What it looks like in tokens

Task	Raw crawl (tokens)	With a code map (tokens)	Reduction
Understand a new architecture	84K	2.1K	40× leaner
Find where auth happens	31K	0.9K	34× leaner
Trace a call chain	47K	1.3K	36× leaner
Plan a refactor (impact first)	92K	2.4K	38× leaner
Read every file (the old way)	240K	—	window exceeded

Illustrative token counts for common agent tasks, raw file-crawl vs. a graph query.
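The reduction column is simply the raw-crawl count divided by the code-map count. A short sketch that recomputes it from the illustrative figures in the table (the function name is hypothetical):

```python
def reduction_factor(raw_tokens, mapped_tokens):
    """How many times fewer tokens the code-map query uses than the raw crawl."""
    return raw_tokens / mapped_tokens


# Figures copied from the table, in thousands of tokens: (raw crawl, with a code map).
rows = {
    "Understand a new architecture": (84, 2.1),
    "Find where auth happens": (31, 0.9),
    "Trace a call chain": (47, 1.3),
    "Plan a refactor (impact first)": (92, 2.4),
}

for task, (raw, mapped) in rows.items():
    print(f"{task}: about {round(reduction_factor(raw, mapped))}x leaner")
```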

How OpenVisio builds the map

OpenVisio parses your repository with tree-sitter into a graph of files, symbols, and resolved imports, then ranks what matters with PageRank — no LLM, fully deterministic, and entirely on your machine. It is local-first and read-only: nothing leaves your computer. The result is a map your agent can query over MCP, a visual graph you can explore on the web, and a narrator that explains the code and cites the real path:line behind every answer.
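As a rough illustration of how PageRank can surface the files that matter, the sketch below runs a textbook power-iteration PageRank over a tiny import graph. It is not OpenVisio's implementation: the damping factor, iteration count, edge direction (importer to imported file), and the example files are all assumptions.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Textbook PageRank. graph maps each file to the list of files it imports."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                # Split this file's rank evenly among the files it imports.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling file (imports nothing): spread its rank over all files.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical import graph; every imported file also appears as a key.
IMPORTS = {
    "app/views.py": ["auth/service.py", "db/users.py"],
    "auth/service.py": ["db/users.py", "auth/crypto.py"],
    "reports/export.py": ["db/users.py"],
    "db/users.py": [],
    "auth/crypto.py": [],
}

ranks = pagerank(IMPORTS)
for path, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{score:.3f}  {path}")
```

In this toy graph `db/users.py` ranks highest because three other files import it. The computation involves no model calls and no randomness, so the same graph always yields the same ranking.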

Works with Claude Code, Cursor, and any MCP client

OpenVisio runs as an MCP server, so any compatible agent can resolve task context, find a symbol, trace who depends on a file, or pull a ranked repo skeleton — structured queries instead of blind file reads. The agent gets the context it needs; you stop paying for the context it doesn't.
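For a sense of what a structured query looks like on the wire: MCP clients invoke server tools with a JSON-RPC 2.0 request whose method is `tools/call`. The sketch below builds such a request in Python. The tool name `find_symbol` and its argument are hypothetical placeholders, not OpenVisio's documented tool names; a real client would discover the available tools by calling `tools/list` first.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument, for illustration only.
request = build_tool_call(1, "find_symbol", {"name": "authenticate"})
print(json.dumps(request, indent=2))
```

The request is a few dozen tokens, and the response is a targeted result rather than the contents of every file the agent would otherwise have opened.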

Frequently asked questions

Why do AI coding agents use so many tokens?

To answer almost any question, an agent first has to find the relevant code. Without a map of the repository it does that by reading — listing directories, grepping, and opening files until it has enough context. Each file it opens is re-sent on every following turn, so token usage grows with the size of the codebase and the length of the session, not with the difficulty of the task.

Does a bigger context window solve the problem?

No. A larger window lets the agent hold more files, but it still has to discover which files matter, and a window packed with marginally relevant code tends to degrade answer quality and costs more per turn. The fix is sending less, more precisely: the right neighborhood of code instead of the whole repository.

How does a code map reduce token usage?

A code map (or code knowledge graph) is a pre-computed index of every file, symbol, import, and call. The agent queries it for exactly the neighborhood a task touches and reads only those slices, so it spends tokens on relevant code instead of crawling. In practice that is a 30–90× reduction on discovery-heavy tasks.

Does OpenVisio send my code to the cloud?

No. OpenVisio is local-first and read-only. It parses your repository on your machine with tree-sitter and serves the graph to your agent over MCP. Nothing leaves your machine, and the indexing step uses no LLM — it is fully deterministic.

Which AI coding agents does it work with?

Any MCP-compatible client, including Claude Code and Cursor. OpenVisio runs as an MCP server that exposes the graph, so the agent can resolve context, find symbols, trace dependents, and pull a ranked repo skeleton instead of reading files blindly.

Give your agent a map of the codebase

Point OpenVisio at any repo, wire it into your agent over MCP, and watch the token bill drop.
