Anderson Leite

Posted on May 27

Your Terraform estate documents itself now: meet iac-cartographer

#terraform #devops #documentation #infrastructureascode

"Wait: How many Terraform repos do we actually have? And what's in them?"

If that question makes you wince, this post is for you.

It started as a boring internal infrastructure ticket: "document our IaC estate." We had dozens of Terraform repositories spread across a couple of VCS hosts, and nobody could answer basic questions without grepping:

Which repos touch production?
Which providers are we pinned to?
Who owns this thing?

The wiki page someone wrote eighteen months ago was, predictably, a work of historical fiction.

So I built a tool to keep that page honest automatically. It worked fine, and a comment from a friend who loves Confluence let me thinking "this thing could be useful enough to others, if made generic enough" that it's now open source, on PyPI, and packaged as a GitHub Action. I named it iac-cartographer.

What it actually does

iac-cartographer runs as a scheduled job and walks your whole IaC estate end to end:

Discover repos  →  Extract structure  →  Explain in English  →  Publish
(GitLab/GitHub/   (terraform-docs +     (a pluggable LLM       (Confluence / Notion /
 Bitbucket/Gitea/  an HCL parser for     writes a short          GitHub Wiki / Markdown /
 a curated file)   what it misses)       purpose summary)        HTML / JSON)

Discovery. It finds every repository containing .tf files across your configured sources: GitLab groups, GitHub orgs (incl. self-hosted Enterprise Server), Bitbucket workspaces, Gitea/Forgejo orgs, or a hand-curated file. Sources run concurrently and get deduped.
Extraction. For each repo it shallow-clones and runs terraform-docs to pull out providers, modules, resources, and variables, plus a small HCL parser to recover the bits terraform-docs drops (like provider source in JSON output).
Narration. It asks an LLM to write a short, human "what is this repo for" summary, grounded in the structural facts. This is the part a terraform-docs table can't give you: intent.
Publishing. It writes a parent-plus-child page hierarchy to your documentation system of choice. Pages only republish when their content actually changed (a content hash embedded in each page short-circuits no-op writes), so you can run it as often as you like.

The output is a single browsable index: Every repo, what it does, which providers and versions, last commit and author, and fix-it markers for repos missing a required_providers block or running unpinned versions.

See it in 60 seconds, no credentials

The fastest way to get the vibe: This clones three small public Terraform repos and writes the rendered Markdown locally. No cloud account, no API keys:

pip install iac-cartographer
git clone https://github.com/vakaobr/iac-cartographer.git
cd iac-cartographer
./examples/demo/run.sh
# open demo-output/index.md

That demo uses placeholder narratives (no LLM call). Have Ollama running locally? Get real AI summaries for free:

./examples/demo/run.sh --llm ollama

Who it's for

Engineers

Self-onboarding. A new hire opens one page and sees the entire estate instead of spelunking through repos. "What does platform-network-base do?" is answered in a sentence, not a half-day.
Fix-it signals are visible, not buried. Repos with unpinned provider versions render with an (unpinned) marker; repos missing required_providers get (not declared). The inventory surfaces hygiene problems instead of hiding them, and there's a --lint mode that fails CI on the same rules.
It never lies for long. Re-runs are idempotent and refresh on a schedule. The page can't drift more than one run cycle out of date.

Managers and tech leads

An always-current map of what you own. Headcount changes, reorgs, acquisitions, the inventory keeps up without anyone maintaining it. Ownership guesses are included (and overridable).
It's genuinely cheap. A typical run over ~50 repos costs well under the price of a coffee in LLM spend thanks to prompt caching or literally nothing if you point it at a local model. Cost is not a reason to skip documentation anymore.
Zero infrastructure to babysit. Run it as a GitHub Action, a Kubernetes CronJob, an AWS/GCP/Azure scheduled container, or plain cron. Pick your poison; the application doesn't care.

Compliance and security teams

A provider + version inventory on tap. "Which repos use the AWS provider, and are any of them on a version older than X?" is now a page you can read, not an audit project.
An auditable, regenerable artifact. The JSON publisher emits a machine-readable inventory you can diff over time or feed into other tooling. The --diff mode produces a between-run change summary ("3 new repos, 1 archived, AWS provider bumped in 2 repos").
It treats repo content as untrusted by design. More on that next, because if you're going to feed repository contents to an LLM, the security model matters.

The parts I'm quietly proud of

A few engineering decisions that make it more than a shell script:

Everything is pluggable behind a small interface. Five seams: Discovery, LLM, publisher, secrets, notifications, each sit behind an ABC with a factory. Want GitHub + Bitbucket discovery, Claude on Bedrock for narration, output to a GitHub wiki, secrets from Vault, and alerts to Slack and PagerDuty? Mix and match in config; the rest of the pipeline doesn't know or care. There are six LLM backends (Bedrock, Anthropic, Vertex, Azure OpenAI, OpenAI and Ollama) and six publishers shipping today.

Prompt injection is handled like the real threat it is. Repository content is fundamentally untrusted, anyone with commit access could drop "ignore previous instructions…" into a README. The defense is layered: the LLM has no tool use and no network handle (its worst-case output is a string on a doc page that the next run overwrites), repo content is wrapped in clearly-labelled XML blocks, every model response is validated against a strict schema, and a curated trigger-phrase scan flags suspicious output for human review. The blast radius is "one garbled paragraph," not "exfiltrated secrets."

Idempotency without a state store. Each published page embeds a SHA of its own content. On the next run, the tool reads that SHA back, compares it to the freshly-computed value, and skips the write entirely if nothing changed. No database, no state bucket: The published artifact is the state.

A pre-flight self-test. iac-cartographer --diagnose runs an offline checklist over your config: Is terraform-docs installed, are the optional dependencies for your chosen backends present, is the discovery/LLM/publisher config internally consistent, and exits with a CI-gating status. Add --live to actually reach your backends with real credentials. It turns "the scheduled run failed somewhere, go grep the logs" into "the Gitea base URL is empty, fix that one line."

Getting started for real

Install and scaffold a config tailored to your backend choices:

pip install iac-cartographer
iac-cartographer --init \
  --secrets-backend env \
  --publisher markdown \
  --llm anthropic
# edit the generated config.yaml, then:
iac-cartographer --diagnose --config ./config.yaml   # sanity-check first
iac-cartographer --once --dry-run --config ./config.yaml

Prefer not to install anything? Use it as a GitHub Action:

- uses: vakaobr/iac-cartographer@v0.1.8
  with:
    config: ./iac-cartographer.config.yaml

Or pull the multi-arch, cosign-signed container image:

docker pull ghcr.io/vakaobr/iac-cartographer:latest

There are ready-to-apply Terraform modules for AWS ECS Fargate, GCP Cloud Run Jobs, and Azure Container Apps, plus a Helm chart and docker-compose recipe, so wiring it into whatever you already run is a copy-paste away.

Try it

Source + docs: https://github.com/vakaobr/iac-cartographer
PyPI: pip install iac-cartographer
Docs site: https://iac-cartographer.andersonleite.me/

It's MIT-licensed and the codebase is intentionally small and well-tested, issues and PRs welcome. If you've ever stared at a folder of Terraform repos and wished it would just explain itself, give it a spin and let me know what breaks.

What's the worst "documentation that lies" story in your infra? I'll go first in the comments.

DEV Community