How infrawise Catches the DynamoDB Scan You Didn't Know You Were Making

#opensource #typescript #dynamodb #mcp

Your Orders table has 50 million items. Claude Code wrote a listAllOrders() function that calls .scan() with no filter. It compiled. Tests passed. Friday morning, your DynamoDB bill had a new line item.

The problem isn't the AI. The AI simply had no way to know how large the table is or how it is keyed. infrawise solves this by building a deterministic model of your actual infrastructure and exposing it through MCP before any code gets written. This post covers how the scan detection works under the hood.

Step 1: Scanning the Repository with ts-morph

When you run infrawise analyze, the first pass is a TypeScript/JavaScript AST scan using ts-morph. infrawise walks every source file looking for database client call expressions — DynamoDB DocumentClient.scan, .query, .get; PostgreSQL pg.query; Mongoose model methods.

For each call site it finds, it records three things: the containing function name, the target table or collection, and the operation type. A .scan() call becomes an edge with type scan. A .query() call becomes a query edge. These edges are the raw material for the graph.

The limitation is real and documented: only TypeScript and JavaScript are supported. Dynamically constructed queries — where the table name or operation is assembled at runtime from a variable — may not resolve. infrawise handles what static analysis can handle and flags the rest.
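
As a rough sketch of that distinction, the snippet below turns already-extracted call sites into typed edges and sets aside the ones whose table name is not a string literal. The DbCallSite, CallEdge, and UnresolvedCall shapes and the toEdges helper are hypothetical names chosen for illustration, not infrawise's actual types, and the ts-morph walk that would produce the call sites is left out.

```typescript
// Hypothetical shapes for illustration; infrawise's internal types are not shown in this post.
type DbOperation = "scan" | "query" | "get";

// What an AST pass might extract for one call such as client.scan({ TableName: ... }).
interface DbCallSite {
  functionName: string; // enclosing function
  operation: DbOperation; // client method that was called
  tableArg:
    | { kind: "literal"; value: string } // TableName: "Orders"
    | { kind: "dynamic"; source: string }; // TableName: someVariable
}

interface CallEdge {
  from: string; // function name
  to: string; // table name
  type: DbOperation;
}

interface UnresolvedCall {
  functionName: string;
  operation: DbOperation;
  reason: string;
}

// Literal table names become edges; anything built at runtime is flagged instead of guessed.
function toEdges(sites: DbCallSite[]): { edges: CallEdge[]; unresolved: UnresolvedCall[] } {
  const edges: CallEdge[] = [];
  const unresolved: UnresolvedCall[] = [];
  for (const site of sites) {
    const arg = site.tableArg;
    if (arg.kind === "literal") {
      edges.push({ from: site.functionName, to: arg.value, type: site.operation });
    } else {
      unresolved.push({
        functionName: site.functionName,
        operation: site.operation,
        reason: `table name comes from runtime expression "${arg.source}"`,
      });
    }
  }
  return { edges, unresolved };
}

const step1 = toEdges([
  { functionName: "listAllOrders", operation: "scan", tableArg: { kind: "literal", value: "Orders" } },
  { functionName: "loadTenantData", operation: "query", tableArg: { kind: "dynamic", source: "process.env.TABLE" } },
]);
console.log(step1.edges, step1.unresolved);
```

Keeping unresolved calls in their own list, rather than dropping them, is one way to make the gap in static analysis visible to whoever reads the report.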

Step 2: Infrastructure Introspection

In parallel, infrawise calls your AWS APIs directly. For DynamoDB it reads every table's actual schema: partition key, sort key, every GSI with its projection type and key schema, item count, billing mode. For Lambda it reads function configurations, memory, timeouts, and event source mappings. SQS queues, SNS topics, SSM parameters, Secrets Manager secrets, RDS instances, and CloudWatch log groups are all pulled the same way — deterministic API calls, no inference.

This is what separates it from passing your Terraform files to an AI. Reading a .tf file tells you what should exist. Calling dynamodb.describeTable tells you what does exist, right now.
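
To make the introspection step concrete, here is a minimal sketch that normalizes a DescribeTable-style response into the fields listed above. The response interface is a hand-typed subset of the AWS shape so the example runs without the SDK or credentials. TableNode and toTableNode are hypothetical names, and defaulting a missing BillingModeSummary to PROVISIONED is an assumption of this sketch rather than documented infrawise behavior.

```typescript
// Hand-typed subset of the DynamoDB DescribeTable response.
interface KeySchemaElement {
  AttributeName: string;
  KeyType: "HASH" | "RANGE";
}

interface DescribeTableOutput {
  Table?: {
    TableName?: string;
    KeySchema?: KeySchemaElement[];
    ItemCount?: number;
    BillingModeSummary?: { BillingMode?: string };
    GlobalSecondaryIndexes?: {
      IndexName?: string;
      KeySchema?: KeySchemaElement[];
      Projection?: { ProjectionType?: string };
    }[];
  };
}

// Hypothetical normalized node; not infrawise's actual type.
interface TableNode {
  tableName: string;
  partitionKey: string;
  sortKey: string | undefined;
  itemCount: number;
  billingMode: string;
  gsis: { indexName: string; partitionKey: string; sortKey: string | undefined; projection: string }[];
}

function keyOf(schema: KeySchemaElement[] | undefined, type: "HASH" | "RANGE"): string | undefined {
  const match = (schema ?? []).find((k) => k.KeyType === type);
  return match ? match.AttributeName : undefined;
}

function toTableNode(out: DescribeTableOutput): TableNode {
  const t = out.Table;
  const pk = keyOf(t ? t.KeySchema : undefined, "HASH");
  if (!t || !t.TableName || !pk) throw new Error("incomplete DescribeTable response");
  return {
    tableName: t.TableName,
    partitionKey: pk,
    sortKey: keyOf(t.KeySchema, "RANGE"),
    itemCount: t.ItemCount ?? 0,
    // Assumption: treat an absent billing summary as provisioned capacity.
    billingMode: t.BillingModeSummary?.BillingMode ?? "PROVISIONED",
    gsis: (t.GlobalSecondaryIndexes ?? []).map((g) => ({
      indexName: g.IndexName ?? "(unnamed)",
      partitionKey: keyOf(g.KeySchema, "HASH") ?? "(unknown)",
      sortKey: keyOf(g.KeySchema, "RANGE"),
      projection: g.Projection?.ProjectionType ?? "ALL",
    })),
  };
}

// Sample response written by hand for the example.
const ordersNode = toTableNode({
  Table: {
    TableName: "Orders",
    KeySchema: [
      { AttributeName: "customerId", KeyType: "HASH" },
      { AttributeName: "orderId", KeyType: "RANGE" },
    ],
    ItemCount: 50000000,
    BillingModeSummary: { BillingMode: "PAY_PER_REQUEST" },
    GlobalSecondaryIndexes: [
      {
        IndexName: "byStatus",
        KeySchema: [{ AttributeName: "status", KeyType: "HASH" }],
        Projection: { ProjectionType: "KEYS_ONLY" },
      },
    ],
  },
});
console.log(ordersNode);
```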

Step 3: Building the Graph

The graph engine connects the AST output to the infrastructure metadata. Each DynamoDB table, Lambda function, SQS queue, and RDS instance becomes a node. The call sites from the AST scan become typed edges between function nodes and table nodes: scan, query, get, publishes_to, uses_index.

The result is a queryable graph. You can ask: which function nodes have scan edges pointing to the Orders table node? That's exactly the query the FullTableScanAnalyzer runs.
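
A minimal sketch of such a graph and that exact question might look like the following. InfraGraph, its method names, and the "fn:" and "table:" id prefixes are illustrative assumptions, not infrawise's real graph engine.

```typescript
type EdgeType = "scan" | "query" | "get" | "publishes_to" | "uses_index";

interface GraphNode {
  id: string; // prefixed so function and table names cannot collide
  kind: "function" | "dynamodb_table" | "sqs_queue" | "rds_instance";
}

interface GraphEdge {
  from: string;
  to: string;
  type: EdgeType;
}

class InfraGraph {
  private nodes = new Map<string, GraphNode>();
  private edges: GraphEdge[] = [];

  addNode(node: GraphNode): void {
    this.nodes.set(node.id, node);
  }

  addEdge(edge: GraphEdge): void {
    // Reject edges to nodes that introspection or the AST pass never produced.
    if (!this.nodes.has(edge.from) || !this.nodes.has(edge.to)) {
      throw new Error(`unknown node in edge ${edge.from} -> ${edge.to}`);
    }
    this.edges.push(edge);
  }

  // Which function nodes have an edge of the given type pointing at this node?
  callers(targetId: string, type: EdgeType): string[] {
    return this.edges
      .filter((e) => e.to === targetId && e.type === type)
      .filter((e) => this.nodes.get(e.from)?.kind === "function")
      .map((e) => e.from);
  }
}

const graph = new InfraGraph();
graph.addNode({ id: "table:Orders", kind: "dynamodb_table" });
graph.addNode({ id: "fn:listAllOrders", kind: "function" });
graph.addNode({ id: "fn:getOrder", kind: "function" });
graph.addEdge({ from: "fn:listAllOrders", to: "table:Orders", type: "scan" });
graph.addEdge({ from: "fn:getOrder", to: "table:Orders", type: "get" });

const scanCallers = graph.callers("table:Orders", "scan");
console.log(scanCallers); // ["fn:listAllOrders"]
```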

Step 4: The 24 Analyzers

infrawise ships 24 rule-based analyzers. Each one is a graph traversal or a schema comparison — no model, no inference.

FullTableScanAnalyzer calls getScanEdges, which filters all graph edges where type === 'scan'. For each edge that points to a DynamoDB table node, it records the table and the calling function, then emits a HIGH severity finding. No threshold, no heuristic — any .scan() on a DynamoDB table is flagged:

  1.  HIGH   Full table scan detected on DynamoDB table "Orders"
             The table "Orders" is being scanned without any filter,
             which reads every item. This is expensive and slow for
             large tables. Called from: listAllOrders
             → Replace Scan with a Query operation using a partition
               key or GSI. If filtering is required on non-key
               attributes, add a Global Secondary Index (GSI).
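
The rule described above is small enough to sketch in a few lines. getScanEdges and the type === 'scan' filter come from the description; the ScanEdge and Finding shapes, the fullTableScanFindings name, and the choice to group callers into one finding per table are assumptions made for this example.

```typescript
interface ScanEdge {
  from: string; // calling function
  to: string; // target table
  type: string; // "scan", "query", "get", ...
}

interface Finding {
  severity: "HIGH" | "MEDIUM" | "LOW";
  title: string;
  calledFrom: string[];
}

// Every edge whose operation type is "scan", regardless of target.
function getScanEdges(edges: ScanEdge[]): ScanEdge[] {
  return edges.filter((e) => e.type === "scan");
}

// No threshold: any scan edge that lands on a DynamoDB table yields a HIGH finding.
function fullTableScanFindings(edges: ScanEdge[], dynamoTables: Set<string>): Finding[] {
  const callersByTable = new Map<string, string[]>();
  for (const edge of getScanEdges(edges)) {
    if (!dynamoTables.has(edge.to)) continue; // skip non-DynamoDB targets
    const callers = callersByTable.get(edge.to) ?? [];
    callers.push(edge.from);
    callersByTable.set(edge.to, callers);
  }
  return Array.from(callersByTable.entries()).map(([table, calledFrom]) => ({
    severity: "HIGH" as const,
    title: `Full table scan detected on DynamoDB table "${table}"`,
    calledFrom,
  }));
}

const findings = fullTableScanFindings(
  [
    { from: "listAllOrders", to: "Orders", type: "scan" },
    { from: "getOrder", to: "Orders", type: "get" },
    { from: "exportUsers", to: "pg_users", type: "scan" }, // not a DynamoDB table
  ],
  new Set(["Orders"]),
);
console.log(findings);
```

Because the rule is a plain filter over edges, the same input always produces the same findings, which is the property the analyzers rely on.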

The other analyzers follow the same pattern. MissingGSIAnalyzer finds tables that have query edges but no uses_index edges — tables being queried with no GSI coverage. HotPartitionAnalyzer counts distinct function nodes with edges to the same table; at five or more, it fires MEDIUM. MissingIndexAnalyzer compares PostgreSQL query predicates against the introspected pg_indexes view. NplusOneAnalyzer looks for repeated query edges from the same function in a loop pattern. Every rule is structural.
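
Two of these rules can be sketched directly from their descriptions. The function names below are hypothetical, and the sketch assumes uses_index edges point from a function to the table whose index it uses, which the post implies but does not spell out.

```typescript
interface RuleEdge {
  from: string; // calling function
  to: string; // target table
  type: string;
}

// MissingGSIAnalyzer-style rule: tables with query edges but no uses_index edges.
function tablesMissingGsiCoverage(edges: RuleEdge[]): string[] {
  const queried = new Set(edges.filter((e) => e.type === "query").map((e) => e.to));
  const indexed = new Set(edges.filter((e) => e.type === "uses_index").map((e) => e.to));
  return Array.from(queried).filter((table) => !indexed.has(table)).sort();
}

// HotPartitionAnalyzer-style rule: tables reached by `threshold` or more distinct functions.
function hotTables(edges: RuleEdge[], threshold = 5): string[] {
  const callersByTable = new Map<string, Set<string>>();
  for (const edge of edges) {
    const callers = callersByTable.get(edge.to) ?? new Set<string>();
    callers.add(edge.from); // a Set counts each function once, however many edges it has
    callersByTable.set(edge.to, callers);
  }
  return Array.from(callersByTable.entries())
    .filter(([, callers]) => callers.size >= threshold)
    .map(([table]) => table)
    .sort();
}

const ruleEdges: RuleEdge[] = [
  // Five distinct functions query Orders, none through an index.
  { from: "fnA", to: "Orders", type: "query" },
  { from: "fnB", to: "Orders", type: "query" },
  { from: "fnC", to: "Orders", type: "query" },
  { from: "fnD", to: "Orders", type: "query" },
  { from: "fnE", to: "Orders", type: "query" },
  // One function queries Users through an index.
  { from: "fnA", to: "Users", type: "query" },
  { from: "fnA", to: "Users", type: "uses_index" },
];

const missingGsi = tablesMissingGsiCoverage(ruleEdges);
const hot = hotTables(ruleEdges);
console.log(missingGsi, hot); // ["Orders"] ["Orders"]
```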

How This Reaches Your AI Assistant

Running infrawise dev starts a Fastify MCP server on Streamable HTTP. Claude Code connects to it and can call 13 tools, including get_infra_overview, analyze_function, suggest_gsi, and postgres_index_suggestions.

When Claude Code is about to write a query against Orders, it calls analyze_function first. The response includes the table schema, any existing GSIs, and the scan finding if one was detected. The AI writes a query with the correct partition key instead of a scan — not because it's smarter, but because it now has the same information a senior engineer would check before touching the table.
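
On the wire, an MCP tool invocation is a JSON-RPC 2.0 request with the method tools/call. The sketch below only builds that message for analyze_function; the functionName argument is a hypothetical parameter name, since the tool's input schema is not shown here, and a real client would complete the MCP initialize handshake before sending it.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build the JSON-RPC envelope that MCP uses for tool invocations.
function buildToolCall(id: number, toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// "functionName" is an assumed argument name, used here for illustration only.
const toolRequest = buildToolCall(1, "analyze_function", { functionName: "listAllOrders" });
console.log(JSON.stringify(toolRequest, null, 2));
```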

For Claude Desktop, infrawise stdio starts the same server over the stdio transport.

Conclusion

The scan finding is the most visible output, but the real work is the graph: AST edges from ts-morph connecting function call sites to infrastructure nodes from live AWS APIs, traversed by 24 deterministic rules. No LLM touches the analysis path.

If you're running Claude Code against a codebase with DynamoDB tables, run npm install -g infrawise, then infrawise init in your repo. The first infrawise analyze usually finds something your AI assistant would have gotten wrong.

Key Takeaways

infrawise uses ts-morph to parse TypeScript/JavaScript source into a graph of function-to-table edges, typed by operation (scan, query, get).
AWS infrastructure metadata comes from live API calls — not Terraform, not static files — so the graph reflects what actually exists.
24 rule-based analyzers traverse the graph deterministically; FullTableScanAnalyzer flags any .scan() edge to a DynamoDB table as HIGH with no threshold.
Context is exposed through an MCP server (Streamable HTTP for Claude Code, stdio for Claude Desktop) so AI tools see findings before they generate code.
The analysis path contains zero LLMs — every finding is a graph query or schema comparison.