The Agent Toolkit for AWS gives your coding agent access to the AWS MCP Server and curated skills, but without updating the rules file, your agent might answer from model training data instead of using its new tools.
Last updated: May 13, 2026
If you're like me, sometimes you get so excited to try something new that you don't read all the way through the docs before you start using it. Last week, the Agent Toolkit for AWS had just been released, I had the README open, and two minutes later I was asking my agent to design a serverless backend in my Kiro IDE.
But the agent didn't touch any of the tools I had just configured. It answered from training data and gave me a reasonable API Gateway + Lambda + DynamoDB architecture, but never reached for the MCP documentation search or the aws-core skills I had installed. I had to prompt it to use the MCP server and skills before it gave them a go.
I had skipped one important file in my rush to try out this new toolkit, and it turned out to be the file that makes the agent reach for these tools predictably.
What's in the toolkit
The Agent Toolkit for AWS was released on May 6, 2026. It works with Claude Code, Codex, Kiro, and any agent that supports MCP, and it has three layers:
- The MCP Server gives your agent access to 300+ AWS APIs through one endpoint, plus sandboxed Python execution and real-time doc search (no AWS credentials needed for searching docs).
- Skills are packaged domain expertise, including architecture decision tables, service comparison matrices, deployment workflows, and troubleshooting guides (20+ available today).
- A rules file tells the agent to use layers 1 and 2 before answering from memory.
I had layers 1 and 2 set up but skipped layer 3, so the agent had all the tools and none of the instructions to use them.
Setup takes two minutes — follow the README's quick start for your agent (Kiro, Claude Code, Codex, or other MCP-compatible agents).
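For context, here's roughly what that quick start produces on the Kiro side, in `.kiro/settings/mcp.json`. Treat this as a sketch rather than the canonical config: I'm assuming the server is launched via `uvx` the way most AWS Labs MCP servers are, and the package name below is a placeholder — copy the exact command from the README.

```json
{
  "mcpServers": {
    "aws": {
      "command": "uvx",
      "args": ["awslabs.aws-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "dev",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}
```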
The file I skipped
There's a rules file in the toolkit repo's `rules/` directory, which I missed when setting it up. The README's quick start section doesn't mention it, and I was already typing prompts by the time I could have noticed. But there's a difference between "can discover skills" and "will proactively load them before answering." The rules file bridges that gap. It's 17 lines and tells the agent to:
- Prefer the AWS MCP Server for all AWS interactions
- Before starting a task, check whether a relevant AWS skill is available
- Load the skill with `retrieve_skill` and prefer its guidance over general knowledge
- When uncertain about API parameters, permissions, or limits, verify against documentation rather than guessing
For Kiro, drop this in `.kiro/steering/aws-rules.md`. For Claude Code, it's bundled with the plugin. For other agents, put it wherever your agent reads project-level instructions.
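If you want to see the shape of it before cloning the repo, here's a condensed sketch based on the four behaviors above — paraphrased, not the verbatim 17 lines:

```markdown
# AWS rules

- Prefer the AWS MCP Server for all AWS interactions.
- Before starting any AWS task, check whether a relevant AWS skill is available.
- If one is, load it with `retrieve_skill` and prefer its guidance over
  general knowledge.
- When uncertain about API parameters, permissions, or limits, verify against
  the documentation search rather than guessing.
```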
Before I added this file, the agent had passive access to skills. After I dropped it in, skill loading became the agent's first move on any AWS question, and it started pulling architecture decision tables before writing code.
For a simple CRUD app like the one I was building, the AWS MCP skills refined the implementation rather than redirected it. The LLMs have gotten so good that their general knowledge already gets you to the right architecture much of the time (in my case, API Gateway + Lambda + DynamoDB + Cognito). Where the skills added value was in specificity and confidence. Here's what I got before and after:
| | Before (no rules file) | After (rules file added) |
|---|---|---|
| Architecture advice | Three options presented (serverless, containers, Amplify) with a general "Option 1 is the sweet spot" recommendation | One specific architecture with a decision table explaining why each component was chosen over its alternatives |
| API Gateway type | "API Gateway" (unspecified which type) | "HTTP API specifically, because REST API is overkill unless you need WAF or caching" |
| Auth approach | "Cognito or roll your own with Lambda + bcrypt + JWT" | "JWT authorizer with Cognito because HTTP API has native JWT support, no Lambda authorizer needed" |
| Function pattern | Not mentioned | "One function per route" (skill best practice) |
| Constraints flagged | None | 30s hard timeout, 10 MB payload limit, no WAF on HTTP API, silent Forbidden on JWT scope mismatch |
| Source of guidance | Agent's training data | AWS documentation + aws-serverless skill's service selection tables |
| Level of specificity | Told me what to build | Told me which variant to build and why that variant over the alternatives |
The skills would be even more useful than general LLM knowledge for more complex architectures involving things like event processing or multi-pattern designs, where the pattern selection flowcharts and service comparison tables would change the agent's choices rather than validate them.
What this taught me about coding agents and skills
Giving a coding agent access to tools is not the same as telling it when to use them. Skills are designed to be loaded on demand, which means something has to tell the agent when to demand them. Without a rule, the agent treats skills like reference books on a shelf: available if it decides to look, but not part of its default workflow. The rules file is what changes "available on request" into "check this before you start."
Guardrails
If you're wondering, "Is it safe to let an agent call AWS APIs?", that's a great question to be asking. The good news is that you can scope down what the agent is allowed to do separately from your own permissions, so even if your IAM role can create and delete resources, you can restrict the agent to read-only. Every request the agent makes through the MCP server gets logged, so you can trace which agent action caused it. And the skills have been tested as full end-to-end workflows before shipping, so when your agent follows a skill's steps, those steps are verified to produce the expected result.
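To make the scoping point concrete: you can run the MCP server under a dedicated profile whose role carries only read permissions. Here's a minimal sketch of such a policy, assuming you hand-roll it rather than attaching the AWS-managed ReadOnlyAccess policy — the action list is illustrative, so tighten it to the services you actually use:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AgentReadOnly",
      "Effect": "Allow",
      "Action": [
        "apigateway:GET",
        "lambda:Get*",
        "lambda:List*",
        "dynamodb:Describe*",
        "dynamodb:List*",
        "cloudwatch:Get*",
        "logs:Get*",
        "logs:Describe*"
      ],
      "Resource": "*"
    }
  ]
}
```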
Thanks for reading!
If you made it this far, thanks for spending the time. Let me know if you've tried the Agent Toolkit for AWS and what you think!

