Remember punch cards? Those paper cards with holes that programmed computers in the 1960s? I brought them back as an AI-powered learning tool for kids.
This is how I built PunchCard.AI using Kiro - from requirements to deployment.
Why Punch Cards?
Programming used to be physical. You'd arrange cards in sequence, each card doing one thing. It was tactile, visual, and made you think step-by-step.
Modern programming is powerful but intimidating. Kids see walls of text and syntax errors. What if we could bring back that tactile feeling with modern AI?
That's PunchCard.AI: drag visual cards to build programs, AI compiles them to real code, and you see actual results.
What PunchCard.AI Does
Pick your learning level (elementary to university). Choose a subject you love - or create your own. The AI generates a programming challenge just for you.
Then you build your solution by dragging cards:
- "Get User Input" card
- "Sort Numbers" card
- "Display Result" card
Click run. The AI compiles your cards to Python code and executes it safely. You see real output, real errors, real learning.
The AI gives you feedback that's encouraging and educational. No harsh "wrong answer" messages - just helpful guidance.
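Under the hood, each card is really just a small piece of data. Here's a rough sketch of the shape a card might have - the type names are my assumptions, and only INPUT, PROCESS, and OUTPUT are named in the post (the other category names are guesses), based on the six categories and 1-5 complexity scale described later:

// Rough sketch of a card as data. Names are illustrative assumptions;
// only INPUT, PROCESS, and OUTPUT are confirmed categories in the post.
type CardCategory = "INPUT" | "PROCESS" | "OUTPUT" | "CONTROL" | "DATA" | "LOGIC";

interface Card {
  id: string;
  label: string;                 // e.g. "Get User Input", "Sort Numbers", "Display Result"
  category: CardCategory;        // one of the six card categories
  complexity: 1 | 2 | 3 | 4 | 5; // matches the 1-5 complexity rating
  subject?: string;              // optional subject theming, e.g. "robotics"
}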
Building with Spec-Driven Development
I didn't jump straight into coding. I used Kiro's spec-driven approach:
Planning
- Wrote 10 user stories with acceptance criteria
- Used EARS syntax for clear requirements (example after this list)
- Mapped out the architecture
- Designed all service interfaces
- Created 11 correctness properties for testing
- Broke design into 19 actionable tasks
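As a concrete (and purely hypothetical, not quoted from the actual spec) example, an EARS-style requirement might read: "When the student clicks Run, the system shall compile the selected cards to Python and display the execution output or error message."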
Implementation
- Executed tasks one by one
- Tests caught bugs immediately
- No major refactoring needed
- 19 of 19 tasks completed
The upfront planning felt slow at first. But it saved me from the usual chaos of "build first, fix later."
Steering Docs: Teaching Kiro to Be a Teacher!
The AI needed to generate educational content - challenges, feedback, card libraries. But AI can be inconsistent.
I created three steering documents:
ai-behavior.md: How to talk to students
- Age-appropriate language
- Encouraging tone (never discouraging)
- Specific, actionable feedback
- Educational explanations
card-patterns.md: How to design cards
- Six categories (INPUT, PROCESS, OUTPUT, etc.)
- Complexity ratings (1-5 scale)
- Subject-specific theming
- Consistent naming conventions
subject-generation.md: How to create diverse subjects
- No keyword overlap between subjects
- Balance across categories
- Age-appropriate examples
- Educational value validation
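To give a feel for the format, here's a hypothetical excerpt in the spirit of ai-behavior.md - the real file's wording may differ:
- Use plain language appropriate to the student's chosen level.
- Never say "wrong" or "failed"; describe what the cards did and suggest what to try next.
- Reference at least one specific card the student actually used.
- End every piece of feedback with one concrete next step.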
These docs transformed AI quality:
- Feedback consistency: 60% → 95%
- Challenge quality: 70% → 90%
- Subject diversity: 75% → 100%
- Inappropriate content: 5% → 0%
Kiro automatically includes these guidelines when generating content. The AI stays on-brand without me micromanaging every prompt.
Agent Hooks: My Silent QA Team
I set up two hooks that run automatically:
test-on-save.json: Runs all tests when I save a file
- Catches regressions instantly
- No manual test running
- Saved a lot of time!
lint-on-save.json: Fixes code style automatically
- Consistent formatting
- No style debates
These hooks caught 15+ bugs during development. Issues that would've made it to production were fixed immediately.
The hooks also validate requirements. When a test fails, it shows which acceptance criteria broke. Continuous validation throughout development.
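Kiro runs these hooks natively, so there's nothing to wire up yourself. As a rough mental model only - this is not Kiro's hook schema or implementation - test-on-save behaves like a small file watcher that re-runs the suite:

// Mental model of what test-on-save automates (not Kiro's actual hook code):
// watch the source tree and re-run the Vitest suite whenever a file changes.
import { watch } from "node:fs";
import { spawn } from "node:child_process";

watch("src", { recursive: true }, (_event, filename) => {
  if (typeof filename !== "string" || !/\.(ts|tsx)$/.test(filename)) return;
  // "vitest run" executes the suite once; failures show up immediately.
  spawn("npx", ["vitest", "run"], { stdio: "inherit" });
});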
MCP: The Secret Sauce
The core feature is real code execution. Students need to see actual Python output, not simulations.
Running arbitrary code in the browser? Dangerous. Building custom sandboxing? Complex.
Enter MCP (Model Context Protocol).
I configured the MCP code-executor server:
{
  "mcpServers": {
    "code-executor": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-code-executor"],
      "env": {
        "EXECUTION_TIMEOUT": "5000",
        "MAX_MEMORY": "128MB"
      }
    }
  }
}
- Safe sandboxed execution
- Real stdout/stderr output
- Timeout protection (5 seconds max)
- Memory limits (128MB)
- Multi-language support
Without MCP, I'd need to build custom infrastructure, worry about security, and maintain execution environments. With MCP, it's one config file.
Students see real code behavior. Real error messages. Real learning.
The AI Pipeline
Every feature uses AI differently:
Subject Generation: Gemini creates 6 diverse learning themes
- Robotics, art, science, games, data, web
- Or students create custom subjects
- Steering docs ensure diversity and appropriateness
Challenge Generation: AI creates unique programming problems
- Matched to subject and skill level
- Clear objectives and expected output
- Hints for when students get stuck
Card Library: Subject-specific programming cards
- "Move Robot Forward" for robotics
- "Play Musical Note" for music
- "Analyze Data" for data science
Code Compilation: Cards → Python code (sketch after this section)
- AI translates card sequence to executable code
- Adds comments explaining logic
- Handles edge cases gracefully
Feedback Analysis: AI reviews student solutions
- References specific cards used
- Explains what worked and what didn't
- Suggests improvements
- Always encouraging, never harsh
The entire learning experience is dynamic. No two students get the same challenges.
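To make the compilation step concrete, here's a rough sketch of how a card sequence might be turned into Python with the Gemini SDK. The compileCards helper, prompt wording, and model name are illustrative assumptions, not PunchCard.AI's actual code:

import { GoogleGenerativeAI } from "@google/generative-ai";

// Hypothetical cards -> Python step. Helper name, prompt, and model name
// are assumptions for illustration, not the project's real implementation.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? "");
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

async function compileCards(cards: { label: string }[]): Promise<string> {
  const prompt = [
    "Translate this ordered sequence of programming cards into commented Python:",
    ...cards.map((card, i) => `${i + 1}. ${card.label}`),
  ].join("\n");

  const result = await model.generateContent(prompt);
  return result.response.text(); // Python source handed to the MCP executor
}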
Property-Based Testing
I didn't just write unit tests. I wrote 11 correctness properties - rules that should always be true.
Examples:
- Challenge Uniqueness: No two challenges should be identical in a session
- Card Library Completeness: Library must contain cards that can solve the challenge
- Feedback Relevance: Feedback must reference the student's actual cards
- Subject Diversity: No keyword overlap between subjects
- Progressive Difficulty: Difficulty adapts based on performance
- Error Recovery: System maintains state without crashing
- ...and 5 more properties
I used fast-check to test each property across 100 random inputs. It found edge cases I never would've thought of.
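Here's what one of these properties can look like with fast-check and Vitest. The adjustDifficulty helper below is a deliberately tiny stand-in, not the app's real difficulty logic - the point is the shape of a property test:

import * as fc from "fast-check";
import { describe, it } from "vitest";

// Toy stand-in for the app's difficulty logic, used only to illustrate
// the "Progressive Difficulty" property; the real implementation differs.
const adjustDifficulty = (level: number, score: number): number =>
  Math.min(5, Math.max(1, score >= 0.8 ? level + 1 : score < 0.4 ? level - 1 : level));

describe("progressive difficulty (illustrative property)", () => {
  it("always returns a difficulty in the 1-5 range", () => {
    fc.assert(
      fc.property(
        fc.integer({ min: 1, max: 5 }),             // current difficulty level
        fc.double({ min: 0, max: 1, noNaN: true }), // score on the last challenge
        (level, score) => {
          const next = adjustDifficulty(level, score);
          return next >= 1 && next <= 5;
        }
      ),
      { numRuns: 100 } // 100 random inputs per property, as in the project
    );
  });
});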
Final test coverage: 91.6%
Challenges I Overcame
AI Latency: The free-tier Gemini API can be slow (10-15 seconds per generation). Solution: Added caching and clear loading states with helpful messages (see the sketch after this list).
Property Test Flakiness: AI-generated content varies slightly. Solution: Adjusted tests to allow minor variations while maintaining correctness.
State Management: Complex state across 6 services. Solution: Clear interfaces and progress tracking kept everything synchronized.
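The caching fix mentioned above is conceptually simple. A minimal sketch of the idea (assumed shape, not the app's actual code): memoize AI responses by prompt so repeated requests skip the slow API call.

// Minimal in-memory cache keyed by prompt (illustrative, not the real code).
const cache = new Map<string, { value: string; expires: number }>();

async function cachedGenerate(
  prompt: string,
  generate: (p: string) => Promise<string>,
  ttlMs = 10 * 60 * 1000 // keep responses for 10 minutes (illustrative TTL)
): Promise<string> {
  const hit = cache.get(prompt);
  if (hit && hit.expires > Date.now()) return hit.value; // instant cache hit
  const value = await generate(prompt); // the slow 10-15 second Gemini call
  cache.set(prompt, { value, expires: Date.now() + ttlMs });
  return value;
}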
The Tech Stack
- React 18 + TypeScript (strict mode)
- Vite for blazing fast builds
- Tailwind CSS for styling
- Google Gemini for AI generation
- MCP for code execution
- Vitest + fast-check for testing
- Local storage for progress tracking
Everything chosen for simplicity and speed.
What I Learned About Kiro
Spec-driven development works: way faster overall, higher quality.
Steering docs are powerful: Small guidelines create massive consistency. The AI stays on-brand without constant supervision.
Agent hooks are underrated: Automated testing and linting saved 3-5 hours per day. Bugs caught before they spread.
MCP is a game-changer: Complex features (code execution) become simple. One config file replaces weeks of custom development.
Property-based testing finds bugs: Testing universal properties catches edge cases that example-based tests miss.
The Results
PunchCard.AI is live with:
- AI-generated subjects and challenges
- Drag-and-drop card programming
- Real code execution via MCP
- Intelligent, encouraging feedback
- Progress tracking and metrics
- Four educational levels
- Enhanced Library Info panel with API usage stats and quick tips
- Helpful context about AI generation times
- Improved user guidance throughout the interface
Completely free and open source. No limits, no paywalls.
PS: Wait times or API rate limits may occur because the app uses a free AI API. :)
Try It Yourself
PunchCard.AI is live at https://punchcard-live.vercel.app/
Pick a level, choose a subject, solve a challenge. See how punch card programming feels in 2025.
Building this taught me that with the right approach (specs + steering + hooks + MCP), you can build complex educational tools without chaos.
The 1960s had punch cards. 2025 has AI-powered punch cards. The future of learning is surprisingly retro.
- Built with: Kiro IDE, React, TypeScript, Gemini AI, MCP
- Category: Resurrection (Kiroween 2025)
- Time: 16 days from idea to deployment
- Lines of Code: ~3,500 (plus 2,000 lines of tests)
- Test Coverage: 91.6%
- Bugs in Production: 0