TL;DR
Kiro CLI has a feature called "skills" — domain-specific expertise packages you can load on demand. I loaded the aws-cloudformation skill to validate my production Spot Fleet template. It found 4 critical issues including a CloudWatch alarm that literally never worked.
Situation
I run a development environment on AWS Spot instances — Kiro CLI, VS Code Server, persistent EBS, the works. The CloudFormation stack has been deployed since February 2026. It's been updated once, runs daily, and I considered it production-ready.
I was about to upgrade the stack and wanted to validate my base template first.
Task
Verify whether my "security-hardened" template actually follows CloudFormation best practices before using it as the foundation for the upgrade.
Action
Step 1 — Discover the skill:
search_documentation("CloudFormation deployment", topics=["agent_skills"])
This returned 5 relevant skills. I picked aws-cloudformation.
Step 2 — Load the skill:
retrieve_skill("aws-cloudformation")
What I got back wasn't generic advice. It was:
- A structured 10-step authoring checklist
- A 3-layer validation pipeline (syntax → security → pre-deploy)
- Troubleshooting SOPs for failed stacks
- Decision frameworks for template vs environment fixes
Step 3 — Load the reference SOP:
retrieve_skill("aws-cloudformation", file="references/author-cloudformation-best-practices.script.md")
This gave me the detailed checklist: resource naming, parameter design, security defaults, deletion policies, cross-stack references, conditions, outputs.
Step 4 — Apply it to my template:
I asked Kiro CLI to review my CloudFormation YAML template against the skill's checklist.
Result
The review returned 4 critical findings, 5 recommended, and 1 strict.
Here's what hit hardest:
🔴 Critical #1: Broken CloudWatch Alarm (3 months undetected)
yaml
HighCPUAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    # Other alarm properties omitted from this excerpt
    Dimensions:
      - Name: InstanceId
        Value: !Ref SpotFleetRequest  # ← This is a Fleet ID, not an Instance ID
My "cryptomining detection" alarm was watching an InstanceId dimension whose value was a Spot Fleet request ID, so it matched no instance and no metric data. It never evaluated. Never fired. I had zero protection for 3 months while thinking I was covered.
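This class of mistake can be caught mechanically. The following is a minimal sketch, not something the skill ships: it assumes the template has already been parsed into a plain dict in JSON form (where `!Ref X` appears as `{"Ref": "X"}`), and the function name `find_bad_instance_id_dimensions` is hypothetical.

```python
def find_bad_instance_id_dimensions(template: dict) -> list[str]:
    """Return logical IDs of alarms whose InstanceId dimension Refs a non-instance resource."""
    resources = template.get("Resources", {})
    bad = []
    for logical_id, resource in resources.items():
        if resource.get("Type") != "AWS::CloudWatch::Alarm":
            continue
        dimensions = resource.get("Properties", {}).get("Dimensions", [])
        for dim in dimensions:
            value = dim.get("Value")
            if dim.get("Name") != "InstanceId" or not isinstance(value, dict):
                continue
            target_type = resources.get(value.get("Ref"), {}).get("Type")
            # Flag only Refs to resources defined in this template that are not EC2 instances.
            if target_type is not None and target_type != "AWS::EC2::Instance":
                bad.append(logical_id)
    return bad


# Illustrative template mirroring the excerpt above (JSON form of the YAML).
template = {
    "Resources": {
        "SpotFleetRequest": {"Type": "AWS::EC2::SpotFleet", "Properties": {}},
        "HighCPUAlarm": {
            "Type": "AWS::CloudWatch::Alarm",
            "Properties": {
                "Dimensions": [{"Name": "InstanceId", "Value": {"Ref": "SpotFleetRequest"}}]
            },
        },
    }
}
print(find_bad_instance_id_dimensions(template))  # ['HighCPUAlarm']
```

Refs to parameters or to resources outside the template are skipped on purpose, since their type cannot be known from the template alone.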
Fix: Reference the actual instance ID (requires a Lambda or instance self-registration pattern since Spot Fleet instances are dynamic).
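One possible shape for the Lambda approach is sketched below. It is an illustration under assumptions, not the fix from the actual template: `ec2` and `cloudwatch` are expected to be boto3 clients passed in by the caller, the function names are hypothetical, and the threshold, period, and alarm naming scheme are placeholder values.

```python
def alarm_params(instance_id: str, threshold: float = 90.0) -> dict:
    """Build put_metric_alarm arguments for one instance (values are placeholders)."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 3,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def sync_fleet_alarms(ec2, cloudwatch, fleet_id: str) -> list[str]:
    """Create or update one CPU alarm per active instance in the Spot Fleet."""
    # Pagination (NextToken) is ignored here; fine for a small fleet, not for a large one.
    response = ec2.describe_spot_fleet_instances(SpotFleetRequestId=fleet_id)
    instance_ids = [item["InstanceId"] for item in response["ActiveInstances"]]
    for instance_id in instance_ids:
        cloudwatch.put_metric_alarm(**alarm_params(instance_id))
    return instance_ids
```

In a Lambda handler this would be called with `boto3.client("ec2")` and `boto3.client("cloudwatch")`. The sketch does not delete alarms for instances that Spot has reclaimed, so a real version would also need a cleanup step.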
What Are "Skills" Exactly?
They're not prompts. They're not RAG documents. They're structured domain expertise packages containing:
| Component | What it does |
|---|---|
| Workflows | Step-by-step procedures with decision points |
| Checklists | Deterministic validation (like the 10-step authoring review) |
| SOPs | Standard operating procedures for troubleshooting |
| Decision trees | "If X, do Y; if Z, do W" frameworks |
| Reference files | Architecture docs, schemas, examples |
The CloudFormation skill has SOPs for:
- Authoring with secure defaults
- 3-layer pre-deployment validation (cfn-lint → cfn-guard → change set)
- Troubleshooting failed stacks (using describe-events --filters FailedEvents=true)
- Resource property lookup
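The three validation layers can be chained so that a failure in one stops the rest. This is a rough sketch of that idea rather than the skill's own tooling: the helper names are hypothetical, the file, stack, and change set names are placeholders, and it assumes `cfn-lint`, `cfn-guard`, and the AWS CLI are installed and on the PATH.

```python
import subprocess


def validation_commands(template: str, rules: str, stack: str, change_set: str) -> list[list[str]]:
    """Return the three commands in order: syntax, policy, pre-deploy."""
    return [
        ["cfn-lint", template],
        ["cfn-guard", "validate", "--data", template, "--rules", rules],
        [
            "aws", "cloudformation", "create-change-set",
            "--stack-name", stack,
            "--change-set-name", change_set,
            "--template-body", f"file://{template}",
        ],
    ]


def run_pipeline(commands, runner=subprocess.run) -> bool:
    """Run the commands in order and stop at the first non-zero exit code."""
    for command in commands:
        if runner(command).returncode != 0:
            print("failed:", " ".join(command))
            return False
    return True
```

A call might look like `run_pipeline(validation_commands("template.yaml", "rules.guard", "dev-env", "review-1"))`. The last step only creates the change set for review and does not execute it; templates that create IAM resources would also need a `--capabilities` argument.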
Key insight: The skill told me to use describe-events (newer API with filter support) instead of describe-stack-events (legacy, no filters). I didn't know this API existed.
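For comparison, with the legacy API the filtering has to happen on the client side. The sketch below assumes events shaped like the `StackEvents` list that boto3's `describe_stack_events` returns; the helper name `failed_events` and the sample data are made up for illustration.

```python
def failed_events(stack_events: list[dict]) -> list[tuple[str, str, str]]:
    """Keep only events whose ResourceStatus ends in _FAILED."""
    return [
        (e["LogicalResourceId"], e["ResourceStatus"], e.get("ResourceStatusReason", ""))
        for e in stack_events
        if e["ResourceStatus"].endswith("_FAILED")
    ]


# Invented sample events, not output from a real stack.
sample = [
    {"LogicalResourceId": "SpotFleetRequest", "ResourceStatus": "CREATE_COMPLETE"},
    {
        "LogicalResourceId": "HighCPUAlarm",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "example reason",
    },
]
print(failed_events(sample))  # [('HighCPUAlarm', 'CREATE_FAILED', 'example reason')]
```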
How to Use Skills in Kiro CLI
bash
1. Search for relevant skills
(you can't guess skill names — they must be discovered)
search_documentation("your task description", topics=["agent_skills"])
2. Load the skill
retrieve_skill("exact-skill-name-from-search")
3. Load reference files if the skill links to them
retrieve_skill("skill-name", file="references/some-sop.md")
4. Apply — ask the AI to use the loaded expertise
"Review my template against the authoring checklist"
"Troubleshoot my failed stack using the SOP"
"Validate before I deploy"
Available skills I found:
- aws-cloudformation — authoring, validation, troubleshooting
- launching-ec2-instance-with-best-practices — secure EC2 launches
- creating-production-vpc-multi-az — VPC design
- creating-ec2-image-builder-pipeline — AMI automation
- aws-cdk — CDK patterns and deployment
Lessons Learned
"It works" ≠ "It's correct." My stack ran fine for 3 months with a broken alarm. Functional doesn't mean secure.
Validate infrastructure like you lint code. We wouldn't ship code without tests. Why do we ship CloudFormation without a structured review?
AI skills > AI chat. A general "review my template" prompt gives generic advice. A loaded skill applies a deterministic, comprehensive checklist. The difference is like asking a random person vs. asking a CloudFormation specialist with a clipboard.
The scariest bugs are silent ones. A broken alarm doesn't throw errors. It just... doesn't protect you.
The skill doesn't just find problems — it prevents them in new templates.