oubakiou

Posted on Jun 19

Your LLM reads the whole file. It doesn't have to.

#ai #llm #markdown #opensource

Coding agents read specs, design docs, and long READMEs every day. Most of the time, they only need a few sections. Yet they load the entire file into context.

The hidden cost of "just read the file"

Here's a scenario that plays out constantly. You ask your agent to check the error handling section of a 5,000-line API spec. The agent opens the file, reads all 5,000 lines into its context window, finds the 80 lines it needs, and answers your question.

The result is correct. But the agent also consumed a large number of tokens on the 4,920 lines it didn't need. Repeat this for every file read in a session, and the waste compounds fast.

The cost isn't just tokens. A context window stuffed with irrelevant content can also make the agent's answers worse, because the lines that matter have to compete for attention with thousands that don't.

What humans do differently

When a human picks up a 300-page technical book, they don't read cover to cover to find the chapter on authentication. They flip to the table of contents, scan the chapter titles, and jump to page 47. LLMs can do the same thing.

Markdown is already structured

Markdown documents have a built-in structure: headings. A # Title followed by ## Section A followed by ### Subsection A.1 creates a hierarchy that mirrors a book's table of contents.

Split a Markdown file at heading boundaries, and you get a natural "table of contents + sections" structure. Each heading starts a new section, the heading text becomes the index entry, and the section number becomes the address.
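To make the idea concrete, here is a minimal Python sketch of splitting at heading boundaries. It is not md2idx's actual implementation: the function name `split_at_headings` is made up for illustration, it only recognizes `#`-style headings, it ignores code fences, and dropping any text before the first heading is an assumption of this sketch.

```python
import re

# ATX heading: 1-6 '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def split_at_headings(markdown: str) -> tuple[list[str], list[str]]:
    """Return (index_lines, sections) for a Markdown string."""
    index: list[str] = []
    sections: list[list[str]] = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            hashes, title = match.groups()
            # The serial number is the section's position in the list.
            index.append(f"{hashes} {len(sections)}. {title}")
            sections.append([line])
        elif sections:
            sections[-1].append(line)
        # Lines before the first heading are dropped in this sketch.
    return index, ["\n".join(body) for body in sections]


doc = "# API\nintro\n## Auth\nuse tokens\n## Errors\nJSON error object"
index, sections = split_at_headings(doc)
print("\n".join(index))
print(sections[2])
```

Because the serial in each index line is just the section's list position, the index entry doubles as the address of its section.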

This is the idea behind md2idx, a CLI tool.

How to use md2idx

md2idx converts a Markdown file into JSON with two fields:

index: a numbered table of contents where each line is <# markers for depth> <serial>. <heading text>
sections: a flat array of raw Markdown strings, one per heading

$ npx md2idx spec.md | jq -r '.index'
# 0. API Specification
## 1. Authentication
## 2. Endpoints
### 3. GET /users
### 4. POST /users
## 5. Error Handling
## 6. Rate Limiting

The serial numbers match the array indices. To read the Error Handling section:

$ npx md2idx spec.md | jq -r '.sections[5]'
## Error Handling

When a request fails, the API returns a JSON error object with...
(just the content of that one section)

To read a heading and all its children together:

$ npx md2idx spec.md | jq -r '.sections[2:5][]'
# Endpoints, GET /users, and POST /users — all 3 sections
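The slice bounds above can be derived from the index itself. The following Python sketch assumes the index is a newline-separated string in the `<# markers> <serial>. <heading text>` format shown earlier; `subtree_slice` is a hypothetical helper, not part of md2idx.

```python
def subtree_slice(index: str, serial: int) -> tuple[int, int]:
    """Return (start, end) so that sections[start:end] is a heading plus its children."""
    # Depth is the number of leading '#' markers on each index line.
    depths = [len(line.split(" ", 1)[0]) for line in index.splitlines()]
    end = serial + 1
    # Children are the entries that follow and are strictly deeper.
    while end < len(depths) and depths[end] > depths[serial]:
        end += 1
    return serial, end


index = """# 0. API Specification
## 1. Authentication
## 2. Endpoints
### 3. GET /users
### 4. POST /users
## 5. Error Handling
## 6. Rate Limiting"""

start, end = subtree_slice(index, 2)
print(f".sections[{start}:{end}][]")  # .sections[2:5][]
```

The scan stops at the first later heading that is at the same depth or shallower, which is why "Error Handling" is excluded.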

For a 5,000-line spec where the agent needs 2 sections, context usage goes from ~5,000 lines to ~100 lines (20-line index + 80 lines of content). Depending on the document and which sections are needed, the reduction is typically 80–98%.
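The arithmetic behind that figure, as a small Python sketch. Line counts are used as a rough proxy for tokens, and `context_reduction` is just an illustrative name.

```python
def context_reduction(total_lines: int, index_lines: int, section_lines: int) -> float:
    """Fraction of the file that no longer has to enter the context window."""
    loaded = index_lines + section_lines
    return 1 - loaded / total_lines


# The example from the text: 20-line index + 80 lines of content out of 5,000.
print(f"{context_reduction(5000, 20, 80):.0%}")  # 98%
```

The lower end of the range corresponds to cases where more of the document is needed: loading 1,000 of the 5,000 lines, index included, is an 80% reduction.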

The output is designed to work with jq. By default md2idx prints single-line JSON, which is pipe-friendly; pass --pretty for formatted output. It reads from a file argument or from stdin.

Why not just grep for headings?

grep -nE '^#{1,6} ' spec.md gives you a list of headings with their line numbers. For simple cases, that works. But md2idx handles cases that grep can't:

Section body retrieval: grep only returns heading lines. To get the body, you need to calculate the line range and use Read with offset/limit. With md2idx, jq '.sections[N]' is all it takes
Setext headings (=== / ---): invisible to grep's # pattern
# inside code fences: grep returns false positives. md2idx skips fenced blocks
Inline markup: grep includes [link](url) etc. as-is. md2idx strips markup in the index while preserving it in section content
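The last three points can be illustrated with a short Python sketch of a line scanner next to a grep-style regex. This is an approximation written for this comparison, not md2idx's code: the names are invented, fence handling is simplified (any fence line toggles the state), and "inline markup" is reduced to `[text](url)` links.

```python
import re

ATX = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")      # opening or closing code fence
SETEXT = re.compile(r"^(=+|-+)\s*$")          # underline of a setext heading
LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")   # [text](url) -> text


def find_headings(markdown: str) -> list[tuple[int, str]]:
    """Return (depth, plain title) pairs, ignoring anything inside code fences."""
    headings = []
    in_fence = False
    previous = ""
    for line in markdown.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
            previous = ""
            continue
        if in_fence:
            continue
        atx = ATX.match(line)
        if atx:
            headings.append((len(atx.group(1)), LINK.sub(r"\1", atx.group(2))))
            previous = ""
        elif SETEXT.match(line):
            # Only an underline directly below a text line makes a heading.
            if previous.strip():
                depth = 1 if line.startswith("=") else 2
                headings.append((depth, LINK.sub(r"\1", previous.strip())))
            previous = ""
        else:
            previous = line
    return headings


fence = "`" * 3
sample = "\n".join([
    "Guide",
    "=====",
    "## See [the docs](https://example.com)",
    fence + "sh",
    "# not a heading, just a shell comment",
    fence,
    "Notes",
    "-----",
])

grep_like = re.findall(r"(?m)^#{1,6} .*$", sample)
print(grep_like)
print(find_headings(sample))
```

On this sample the grep-style pattern reports the shell comment as a heading and misses both setext headings, while the scanner returns the three real headings with link markup removed.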

Automate with a Claude Code skill


With the md2idx-read skill, the agent autonomously handles everything from fetching the index to selecting sections.

1. The agent checks the file size and, if large, calls md2idx to fetch the index
2. It reads the index and identifies which sections are relevant to the current task
3. It retrieves only those sections via jq slicing
4. If more sections are needed, it goes back to step 3 (the index is already in context)
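As a rough Python sketch of that loop outside an agent: the size threshold, the function names, and the keyword matching are all assumptions made for illustration. In the real skill the agent itself decides which index entries are relevant, and the index is assumed here to be a newline-separated string, as the `jq -r '.index'` output suggests.

```python
import json
import os
import subprocess

# Hypothetical cutoff; the skill's actual definition of "large" is not given here.
LARGE_FILE_BYTES = 20_000


def load_index(path: str) -> dict:
    """Run `npx md2idx <path>` and parse its JSON output."""
    result = subprocess.run(
        ["npx", "md2idx", path], capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


def pick_sections(doc: dict, keywords: list[str]) -> list[int]:
    """Stand-in for the agent's judgment: serials whose index line mentions a keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        serial
        for serial, line in enumerate(doc["index"].splitlines())
        if any(keyword in line.lower() for keyword in lowered)
    ]


def read_selectively(path: str, keywords: list[str], loader=load_index) -> str:
    """Size check, index fetch, then only the chosen sections."""
    if os.path.getsize(path) < LARGE_FILE_BYTES:
        with open(path, encoding="utf-8") as handle:
            return handle.read()  # small file: reading it whole is fine
    doc = loader(path)
    return "\n\n".join(doc["sections"][i] for i in pick_sections(doc, keywords))


example = {
    "index": "# 0. API Specification\n## 1. Authentication\n## 2. Error Handling",
    "sections": ["# API Specification", "## Authentication\n...", "## Error Handling\n..."],
}
print(pick_sections(example, ["error"]))  # [2]
```

Keeping the parsed output around means a follow-up request for more sections needs no second index fetch.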

# install the skill
gh skill install oubakiou/md2idx md2idx-read --agent claude-code --scope project

# or with npx
npx skills add oubakiou/md2idx --skill md2idx-read --agent claude-code --yes

Once installed, the agent uses the skill proactively whenever it encounters a large Markdown file. No manual invocation needed — it reads the index first, picks sections, and skips the rest.

A fallback is included. If md2idx isn't available (network-restricted environments, permission issues), the skill falls back to grep for headings and the Read tool with offset/limit. Less accurate, but functional.
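The fallback comes down to turning heading line numbers into line ranges. Here is a minimal Python sketch, assuming offset is a 1-based starting line and limit is a line count; `heading_ranges` is a hypothetical helper, and it has the same blind spots as grep (setext headings, `#` inside code fences).

```python
import re


def heading_ranges(markdown: str) -> list[tuple[str, int, int]]:
    """Return (heading, offset, limit) triples with 1-based offsets, grep -n style."""
    lines = markdown.splitlines()
    starts = [
        number
        for number, line in enumerate(lines, start=1)
        if re.match(r"#{1,6} ", line)
    ]
    ranges = []
    for position, start in enumerate(starts):
        # A section runs up to the line before the next heading, or to end of file.
        if position + 1 < len(starts):
            next_start = starts[position + 1]
        else:
            next_start = len(lines) + 1
        ranges.append((lines[start - 1], start, next_start - start))
    return ranges


doc = "# API\nintro\n## Auth\nuse tokens\nrotate keys\n## Errors\nJSON error object"
for heading, offset, limit in heading_ranges(doc):
    print(f"{heading!r}: offset={offset}, limit={limit}")
```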

Try it now

# read the index of any Markdown file
npx md2idx README.md | jq -r '.index'

# grab a specific section by number
npx md2idx README.md | jq -r '.sections[2]'

# search within a section
npx md2idx data.md | jq -r '.sections[4]' | grep Tokyo

# global install
npm install -g md2idx

md2idx has zero external dependencies — a self-contained line scanner, not a Markdown AST parser. It handles ATX headings (# style), setext headings (=== / --- underlines), code fence skipping, and inline markup stripping.

md2idx is MIT-licensed and fully open source. If your LLM agents are reading entire large Markdown files, give it a try:

GitHub repo →
npm → npx md2idx your-file.md | jq -r '.index'
Claude Code skill → gh skill install oubakiou/md2idx md2idx-read

If you've tried it in your agent workflow, I'd love to hear how it went — drop a comment below or open an issue on GitHub.

DEV Community