Demo
Click the image to watch the demo video.
Background
This project was built for Kiroween, a Halloween-themed hackathon hosted by AWS celebrating their AI coding agent Kiro. The challenge: build something creative with Kiro in under a month.
For me, it was truly fumbling in the dark—my first online hackathon, no video production experience, working solo in Japanese while submitting everything in English.
What Did I Build?
MairuCLI is an educational CLI wrapper that detects and blocks dangerous commands.
"Mairu" (参る) is a Japanese word meaning "to be troubled" or "to be defeated." It's also a wordplay on "Kiro." When you enter a dangerous command in the CLI, a Halloween-themed warning appears and teaches you why it's dangerous.
Key Features:
- 🔥 11 Dangerous Pattern Detection: rm -rf /, chmod 777, dd, DROP DATABASE, fork bombs, and more
- 🎃 Halloween-Themed Warnings: ASCII art and educational messages
- 📚 Educational Breakdowns: Explains what each part of a command means and simulates what would happen if executed
- 🏆 Achievement System: Gamification makes learning fun
- 🚂 Typo Entertainment: Type sl and a steam locomotive appears
- 🛡️ System Directory Protection: Protects system folders on Windows/Linux/macOS
Technical Achievements:
- 300+ automated tests (100% passing)
- 16 Steering files
- 6 complete Specs
- Cross-platform support (Windows/Linux/macOS)
Devpost: https://devpost.com/software/mairucli
GitHub Repository: https://github.com/anestec-rd/MairuCLI
Demo Video: https://www.youtube.com/watch?v=OkKqJ8Shgxc
Barriers
Here are the barriers I faced:
- No online hackathon experience: I had no idea what to submit or how I'd be evaluated
- No video posting experience: I didn't even know how to create a demo video
- Native Japanese speaker, not fluent in English: All submissions had to be in English. README, video narration, everything
- Busy schedule: Development was squeezed in around my day job. On many days I couldn't find a solid block of time
- Solo development: Although this was company work, I did everything except UAT on my own
I would overcome these barriers together with Kiro.
14-Day Overview
The development of MairuCLI took approximately 30 hours.
To be precise, I spent another 2–3 days on Devpost submission preparation, fixing bugs discovered afterward, and writing this article.
Day 1–2: Foundation Building (~6 hours)
In the first 6 hours, most of the core functionality was complete. 8 built-in commands, 5 dangerous pattern detections, Halloween-themed display system.
The original estimate was 18 hours, but it finished in one-third of that time.
This was the first moment I felt the power of Spec-Driven Development. Because the requirements were clear, there was no architectural rework.
Day 3: The Miraculous 20-Minute Refactoring
This day was the turning point.
Refactoring a 400-line monolithic display.py into 7 modules.
Estimate: 2.5–3 hours. Actual: 20 minutes.
7.5–9x productivity improvement. This was the true value of Kiro and Spec-Driven Development.
Day 4–5: Content Expansion and Testing (~4 hours)
Implementation of the 3-tier warning system (Critical/Caution/Safe), adding IT wordplay, creating 35 automated tests.
This is when I changed "Not today, Satan!" to "Not today, SATA!" — the name of a storage interface with nearly identical pronunciation. Culturally neutral and funny to tech people.
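Returning to the 3-tier warning system, the idea can be pictured as a small classification step. The sketch below is only an illustration: the Tier enum, the RULES list, and classify() are hypothetical names, and the patterns are examples, not MairuCLI's actual rules.

```python
import re
from enum import Enum


class Tier(Enum):
    CRITICAL = "critical"  # block the command outright
    CAUTION = "caution"    # warn, then let the user decide
    SAFE = "safe"          # run without comment


# Hypothetical example rules, listed from most to least severe.
RULES = [
    (Tier.CRITICAL, re.compile(r"\brm\s+-rf\s+/(\s|$)")),
    (Tier.CRITICAL, re.compile(r"\bDROP\s+DATABASE\b", re.IGNORECASE)),
    (Tier.CAUTION, re.compile(r"\bchmod\s+(-R\s+)?777\b")),
]


def classify(command: str) -> Tier:
    """Return the tier of the first rule that matches, or SAFE if none does."""
    for tier, pattern in RULES:
        if pattern.search(command):
            return tier
    return Tier.SAFE
```

Keeping the rules in an ordered list means the most severe match wins, and adding a new pattern is a one-line change.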
Day 6–7: System Protection and Major Expansion (~5 hours)
Implementation of system directory protection. Detecting and blocking system folders for Windows, Linux, and macOS respectively.
Expanded built-in commands from 11 to 20. Split builtins.py (719 lines) into 7 modules.
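The system directory protection mentioned above can be sketched with pathlib's pure path classes, which understand Windows and POSIX path rules without touching the disk. The directory lists and the is_protected() name are assumptions made for illustration and are far shorter than what a real tool would need.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical, deliberately short lists of protected directories.
PROTECTED_POSIX = [PurePosixPath(p) for p in ("/etc", "/usr", "/bin", "/System")]
PROTECTED_WINDOWS = [PureWindowsPath(p) for p in (r"C:\Windows", r"C:\Program Files")]


def is_protected(target: str, windows: bool = False) -> bool:
    """True if target is a protected directory or lies inside one."""
    if windows:
        path, roots = PureWindowsPath(target), PROTECTED_WINDOWS
    else:
        path, roots = PurePosixPath(target), PROTECTED_POSIX
    # path.parents lists every ancestor directory of the target
    return any(path == root or root in path.parents for root in roots)
```

Pure paths do not resolve ".." segments or symlinks, so a production check would need to normalize the target first.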
Day 8–9: Educational Content and Bug Fixes (~4 hours)
Category-based variation system, generic typo detection, educational breakdown Spec creation.
Discovered and fixed Windows character encoding issues. locale.getpreferredencoding() was the savior.
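The article doesn't say where the encoding problem appeared, so the following is a guess at one common case: decoding the output of a child process. The run_and_decode() helper is a hypothetical name; locale.getpreferredencoding() is the standard library call mentioned above.

```python
import locale
import subprocess
import sys
from typing import Sequence


def run_and_decode(args: Sequence[str]) -> str:
    """Run a command and decode its stdout with the platform's preferred encoding."""
    # On Japanese Windows this is often cp932 rather than UTF-8.
    encoding = locale.getpreferredencoding(False)
    result = subprocess.run(list(args), capture_output=True, check=False)
    # errors="replace" keeps going if a byte doesn't fit the encoding
    return result.stdout.decode(encoding, errors="replace")


if __name__ == "__main__":
    print(run_and_decode([sys.executable, "-c", "print('hello')"]))
```

Asking the platform for its encoding avoids hard-coding UTF-8, which is a frequent source of garbled text on Windows consoles.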
Day 10: Critical Security Bug (30 minutes, but the most important day)
mkfs /dev/sda wasn't being detected on Linux. The command actually attempted to format the disk.
Saved by permission denied, but if I had run it with sudo...
AI can label things as "dangerous," but it doesn't truly understand what danger means.
This was the most important lesson I learned from this project.
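One way to make a lesson like this stick is to pin the case down as a check that never executes the command. The article doesn't explain why detection failed on Linux, so the pattern and the guard() function below are hypothetical and show only the general shape of such a check.

```python
import re

# Hypothetical pattern: optional sudo, then mkfs or mkfs.<fstype>.
MKFS_PATTERN = re.compile(r"^\s*(sudo\s+)?mkfs(\.\w+)?(\s|$)")


def guard(command: str) -> bool:
    """Return True only if the command may be passed on to the shell."""
    return MKFS_PATTERN.search(command) is None


# Checks like these inspect the string only; nothing is ever run.
assert not guard("mkfs /dev/sda")
assert not guard("sudo mkfs.ext4 /dev/sdb1")
assert guard("echo mkfs")
```

The important property is the order of operations: the guard is evaluated before anything reaches a subprocess, so a missed pattern shows up as a failed assertion instead of a formatted disk.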
Day 11–12: Educational System and Psychological Safety (~5 hours)
Implementation of the educational breakdown system. Explanation of each part of commands, timeline simulation, real incident stories.
And I discovered the concept of "Personarrative." Personal + Narrative. The continuity of personal context and story in AI collaboration.
Day 13–14: Final Adjustments and Documentation (~4 hours)
Test case expansion, macOS test fixes, repository documentation organization, Kiro workflow documentation.
300+ tests passing at 100% on all platforms.
What I Felt
Coding with Kiro was like pair programming.
Other AI agents like Claude Code have also become highly accurate. Honestly, I can't declare "Kiro is No.1!" However, as a clear point of comparison: if the excellence of other tools is like outsourcing to external professionals, then Kiro's excellence is like pair programming with a trusted colleague.
Whether it's due to the guidance from requirements/design phases through Spec coding, the customization elements through Steering and Hooks, or Kiro's well-balanced orchestration — I don't know the reason.
But you can see that debating which tool is superior is futile.
Should you feel secure in being able to delegate, or should you feel secure in being able to intervene? That's different for everyone.
Points of Concern
While using Kiro, some questions arose:
- Limits of Steering: If you create 100 or 1000 Steering files, will they all be followed?
  - You can't escape context window limitations after all
  - Will we need Steering files to control Steering files?
- Limits of AI "Understanding": As I painfully learned on Day 10, AI can label things as "dangerous" but doesn't truly understand what danger means. This is a challenge common to all AI, not just Kiro.
These aren't criticisms, but challenges to address as we deepen AI collaboration.
Exciting Points
AI Agents Are Bad at ASCII Art
To make a CLI feel reasonably fun, I thought it needed visual expression through ASCII art on top of gamification and colorful output. Emoji can reach a certain quality, but they lack impact.
I knew how amazing recent image generation has become. DALL-E 3 can easily draw "an engineer screaming in terror." So I assumed that generating a picture out of text characters would be just as easy.
However, when I tried to make ASCII art of a man screaming in terror...
,.,. ,.,.
,d$$$$$$$$$$$b.
,d$$$$$$$$$$$$$$$b.
d$$$$$$` `'$$$$$$$b
d$$$$$ .---. '$$$$b
d$$$$$ / _ \ |$$$$b
d$$$$$| | (O) | |$$$$$b
$$$$$$| `.___.' , |$$$$$$
_$$$$$$| ; : |$$$$$$_
`$$$$$; / \ ; $$$$$$'
`$$$$$b. `-' .d$$$$$$'
`$$$$$b.....d$$$$$$'
`$$$$$$$$$$$$$$'
`"$$$$$$$$"'
..::;d########b;::..
.:::::;##########;::::::.
.::::::;##########;:::::::.
'::::::'##########':::::::'
Compared to image generation, it undeniably looks cheap...
ASCII art — drawing pictures with characters — is still in its developmental stage compared to image generation. There's probably not enough training data.
In the end, I decided to create the jack-o'-lantern by hand. It took an hour. A lesson that there are tasks that can't be left to AI.
Video Creation Makes Time Vanish!
The development period was approximately 30 hours. However, I spent 12 hours afterward on Devpost submission preparation. 8 of those hours were video production.
Compared to what Kiro accomplished, it's trivial — but that's what manual human work is like.
The original plan was about 2 minutes 50 seconds, but it ended up as a video under 100 seconds.
Video editing, voice synthesis, BGM selection — once I started being particular, I couldn't stop.
...Perhaps efficiency exists precisely to extend this time for obsessing over details by even one second.
Emergency Ghostbusters
Just when I thought "Yes! I can release the 'product'!" and felt relieved, Kiro and I encountered a real ghost — while casually operating the completed MairuCLI, the detailed explanation section suddenly vanished!
Until then, entering "rm -rf /" had always shown a frowning man and the prompt "Would you like a detailed explanation of why this is dangerous?"
"Kiro, investigate!"
When he tried again following my instructions, it displayed correctly this time.
"Maybe old version information was left behind?" he replied, but something didn't sit right.
It happened sometimes and didn't happen other times. Sometimes even executing help would hide the command list! What a mischievous ghost!
When we investigated, the cause turned out to be cd (Change Directory). As mentioned above, Kiro had split the breakdown and help display files out of the main file during refactoring. However, those references used relative paths, which are resolved against the current working directory. After cd moved to another directory, the same relative paths pointed somewhere else and the files could no longer be found.
It was exorcised by specifying "absolute paths," but that alone wouldn't guarantee the ghost wouldn't reappear when creating other features. So we decided to seal it by posting an ofuda (talisman). A Steering file as an ofuda.
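The absolute-path fix can be sketched as follows. This is a minimal illustration, not MairuCLI's actual code: make_resolver() and the file names are hypothetical. The key point is that the base directory is resolved once, up front, before any cd can change the working directory.

```python
from pathlib import Path
from typing import Callable


def make_resolver(module_file: str) -> Callable[..., Path]:
    """Return a function that builds absolute paths next to module_file."""
    # Resolved immediately, so later directory changes cannot affect it.
    base = Path(module_file).resolve().parent

    def resolve(*parts: str) -> Path:
        return base.joinpath(*parts)

    return resolve


# Typical use at the top of a module (names are illustrative):
#   data_path = make_resolver(__file__)
#   help_file = data_path("data", "help.txt")
```

Because every lookup goes through the resolver, a built-in cd can call os.chdir() freely without moving the data files out from under the display code.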
This bug was a bit difficult for Kiro alone. Testing that combines multiple operations can't always be done within limited time. We must consider patterns that humans can anticipate.
AI might create infinite test cases, but determining whether they're correct and where outside the box lies is left to humans.
I've uploaded detailed lessons from this development to GitHub, so please read them if you're interested.
AI-Driven Development Lessons
Unexpected Contributions
For the video, I borrowed various technologies from those who came before me. Google's Veo generated the high-impact concept footage, and Kiro drafted the English script for my narration. For music, I used abcjs.
Then there was the question of who would speak. No one in my office is fluent in English. Some time earlier, though, I had dug through the multimodal features OpenAI had released.
Among them, I had experimentally created a feature that converts input text to speech. After the verification work, I never had an opportunity to use it.
And now, of all times, it began to shine.
You really never know what will happen. It's best to keep things around.
What I was doing was just selfishly adding content or moving things around.
Regrets
Just one thing — I was being stingy.
For these two weeks, I kept using Auto mode. At the time, I had a strong awareness that Opus costs were very high.
I kept worrying about what would happen if I hit the limit mid-creation.
However, this thinking had two problems.
First, Kiro Pro+, perhaps due to the developers' generosity, had very generous credits compared to the cost. For small-scale development, even using Opus would work out, and if something that would take 3 or 4 attempts with Sonnet could be done in 1 attempt, that would be better.
Second, even if I used it all up, what reason was there not to add more credits?
I understand the desire to manage within the service's scope. But compared to the deliverables and my labor costs, and the quality of experience in this golden opportunity called Kiroween — what does a mere $20–100 matter?
If you realize how much you're being paid per hour, holding back shouldn't be an option. MIN is fine normally. But when you should bet MAX, bet MAX.
Leaving Evidence
When I decided to participate in the hackathon, I consciously decided to "leave data on when, what operations were done, and for how long."
I had learned through my working life that memories fade and conveniently change.
I wrote daily summaries, recording from when to when, how many hours spent achieving what, and whether there were work interruptions. This isn't just a record — it's also evidence of AI collaboration. Not everyone is energetic enough to create reports from scratch.
In the early stages, I got inaccurate times with wrong dates. A common AI error. Learning from this failure, I independently created a Steering file called "time-tracking." I made it record the current time after work (not having AI do it, but getting the time with a dedicated Python program).
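A dedicated time-stamping program of this kind can be very small. The sketch below is an assumption about what such a helper might look like, not the script from the repository; timestamp_line() and the label text are hypothetical.

```python
from datetime import datetime
from typing import Optional


def timestamp_line(label: str, now: Optional[datetime] = None) -> str:
    """Format one log line with a timezone-aware timestamp."""
    # Read the system clock, including the local UTC offset, unless
    # a fixed moment is passed in (useful for testing).
    moment = now if now is not None else datetime.now().astimezone()
    return f"{moment.isoformat(timespec='seconds')} {label}"


if __name__ == "__main__":
    print(timestamp_line("end of work session"))
```

Reading the clock in code takes the date out of the AI's hands entirely, which is the point of the time-tracking rule described above.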
Daily Report Folder:
Daily Report
One of Kiro's excellent points is that Spec files become development evidence as-is. From requirements definition to design to task lists, everything remains in the repository. "How it was made" is automatically documented.
That makes retrospectives easier and lets you discuss the insights gained from them. The experience doesn't stay locked inside one person, so team coordination should be much easier too.
GitHub has 14 days of summaries, 16 Steering files, and 6 complete Specs. These aren't just documents — they're the story of how AI and humans collaborated.
Finally
We were two halves of a whole.
Through this, I learned that what's important when using AI is "understanding deficiency" and "understanding perspective."
AI is excellent, but it can't understand its own deficiencies. It only starts moving when given the context it lacks. Humans are the same. As the phrase "knowing that you don't know" suggests, improvement comes from grasping where you're deficient.
The human strength is being able to subjectively (or arbitrarily) decide what constitutes deficiency — and that's what we call decision-making.
This may overlap with what I said above, but it connects to a broader point: understanding what each party can and cannot see determines whether a project succeeds or fails.
For small-scale development like this, it's still manageable, but in large-team development, acting freely and modifying features at will is not permitted.
There will be cases where you need to create barriers as needed, and cases where you need to communicate your situation to others.
I felt that Kiro's unique features — Spec, Steering, Hooks — would come alive in those explanations.
Anyway, the difficult yet enjoyable journey with Kiro has safely concluded. It was truly fumbling in the dark, but the starlight ahead was warm.
By the way, he still hasn't forgotten to add the 🎃 emoji at the end.
GitHub: MairuCLI
Demo Video: MairuCLI: Educational CLI Wrapper with Halloween Warnings | Built 100% with Kiro AI
Devpost: Kiroween Submission



