Ooi Yee Fei

Building The Spirit Box: When AI Meets the Afterlife

It started with a simple question during a late-night coding session: What if ghost manifestations were influenced by real cosmic events?

I'd been reading about Electronic Voice Phenomena (EVP) - the paranormal research practice of listening to radio static for messages from spirits. EVP enthusiasts have long believed that geomagnetic storms and solar activity create conditions where spirits can manifest. It sounded like pseudoscience, but it also sounded like a perfect excuse to build something weird.

For the Kiroween Hackathon's "Frankenstein" category, I decided to take this seriously. Not the ghosts part - but the technical challenge of stitching together completely disparate technologies into one cohesive experience. The result became The Spirit Box: a web app where you can talk to Cornelius Blackwood, a Victorian-era mechanic who died in 1887, but only when NASA's space weather data says the cosmic conditions are right.


Learning to Think Before Coding

I'll be honest - my usual approach to hackathons is to start coding immediately and figure things out as I go. But Kiro had this "spec-driven development" workflow that kept popping up in the documentation, and I was curious.

The workflow seemed almost bureaucratic at first: write requirements, then design, then tasks, then code. For a hackathon? That felt slow. But I decided to try it anyway.

I started by telling Kiro my rough idea: "Build a ghost communication device that uses real space weather data." Instead of generating code, Kiro asked me to clarify what I actually wanted. What should the audio sound like? How should the ghost decide when to speak? What's the user experience?

Through this conversation, Kiro helped me formalize seven core requirements. Each one followed the EARS pattern (Easy Approach to Requirements Syntax), which made it unambiguous. For example: "The app MUST synthesize noise in real-time using AudioContext.createScriptProcessor" - not "the app should have cool audio" or "make it sound spooky."

This felt tedious. I wanted to code. But I kept going.


The Design Phase: Where It Clicked

Once the requirements were solid, Kiro generated a design document that blew my mind. It wasn't just a list of components - it was a complete architecture with data flow diagrams, interface definitions, and even a breakdown of the audio signal chain:

```
WhiteNoise → BiquadFilter (LFO Sweep) → GainNode → Analyser → Destination
```
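
To make that chain concrete, here's a rough sketch of what the wiring looks like in Web Audio code. It's not the code Kiro generated - the function name, filter type, and values are just illustrative - but it shows the shape of the thing:

```typescript
// Minimal sketch of the static-noise chain (illustrative, not the generated code).
function createSpiritStatic(ctx: AudioContext): AnalyserNode {
  // WhiteNoise: a ScriptProcessorNode filling each buffer with Math.random().
  const noise = ctx.createScriptProcessor(4096, 1, 1);
  noise.onaudioprocess = (e) => {
    const out = e.outputBuffer.getChannelData(0);
    for (let i = 0; i < out.length; i++) out[i] = Math.random() * 2 - 1;
  };

  // BiquadFilter swept by an LFO: an OscillatorNode modulating filter.frequency.
  const filter = ctx.createBiquadFilter();
  filter.type = "bandpass";
  filter.frequency.value = 800;

  const lfo = ctx.createOscillator();
  lfo.frequency.value = 0.2; // slow sweep rate
  const lfoGain = ctx.createGain();
  lfoGain.gain.value = 400; // sweep depth in Hz
  lfo.connect(lfoGain).connect(filter.frequency);

  // GainNode → Analyser → Destination.
  const gain = ctx.createGain();
  gain.gain.value = 0.3;
  const analyser = ctx.createAnalyser();

  noise.connect(filter).connect(gain).connect(analyser).connect(ctx.destination);
  lfo.start(); // easy to forget: oscillators must be started explicitly
  return analyser;
}
```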

Kiro had thought through things I would have discovered through painful debugging: How do you connect an LFO to a filter's frequency parameter? Where does the analyser node go in the chain? What's the proper gain staging?

The design doc also included something called "Frankenstein stitching points" - specific places in the code where disparate technologies would connect. This became crucial later when I was explaining the project to others.

At this point, I understood why the spec-driven approach existed. I hadn't written a single line of code yet, but I had a complete blueprint. No more "figure it out as you go" - I knew exactly what needed to be built and in what order.


The Moment It Worked

The tasks phase broke everything down into 40+ subtasks. Each one was small enough to implement in 15-30 minutes. Kiro would execute a task, I'd test it, and we'd move to the next one.

The magic moment came when I asked Kiro to implement the audio engine. I gave it one instruction: "Implement the Frankenstein Audio Engine from the spec - raw white noise synthesis with LFO sweep."

Kiro generated 200+ lines of WebAudio code. ScriptProcessorNode creating noise from Math.random(). BiquadFilter with an oscillator modulating its frequency. Seven different audio nodes connected in the exact right order. Proper cleanup logic. Debug logging.

I ran it. It worked. First try. No debugging.

I sat there staring at my screen, listening to this breathing, sweeping static noise that sounded genuinely analog. This would have taken me hours to get right manually - I would have connected nodes in the wrong order, forgotten to start the oscillator, messed up the gain staging. But Kiro had generated it perfectly because it had the complete context from the spec.

That's when I became a believer in spec-driven development.


The Conversations That Mattered

The best part of working with Kiro wasn't the code generation - it was the conversations about architecture.

At one point, I was trying to implement the "entropy gate" - the system that decides when the ghost can speak. My original plan was to use Model Context Protocol (MCP) because the hackathon mentioned it as a feature to showcase.

I asked Kiro: "Should I build an MCP server for this?"

Instead of just saying yes and generating code, Kiro explained the reality: MCP servers don't plug into each other. The client orchestrates multiple servers. And since my app runs in a browser, it can't easily connect to local MCP servers that use stdio. For my use case, direct HTTP API calls to NASA would be simpler and more appropriate.

This was a turning point. Kiro wasn't just following orders - it was helping me make better decisions. I pivoted from MCP to direct NASA DONKI API integration, which was absolutely the right call. The app became simpler, more maintainable, and the "Frankenstein" concept was still intact - I was still stitching together disparate technologies.
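
For reference, the direct approach comes down to a single fetch against DONKI's geomagnetic storm (GST) endpoint. Here's a hedged sketch of what that integration might look like - the Kp-to-entropy mapping and the fallback value are my own illustration, not the app's actual formula:

```typescript
// Sketch: pull recent geomagnetic storm (GST) events from NASA DONKI and
// turn the strongest Kp index into a 0-1 "spectral entropy" value.
async function fetchSpectralEntropy(apiKey = "DEMO_KEY"): Promise<number> {
  const end = new Date();
  const start = new Date(end.getTime() - 7 * 24 * 60 * 60 * 1000);
  const fmt = (d: Date) => d.toISOString().slice(0, 10); // YYYY-MM-DD

  const url =
    `https://api.nasa.gov/DONKI/GST?startDate=${fmt(start)}` +
    `&endDate=${fmt(end)}&api_key=${apiKey}`;

  const res = await fetch(url);
  if (!res.ok) return 0.3; // quiet-sky fallback when the API is unavailable
  const storms: Array<{ allKpIndex?: Array<{ kpIndex: number }> }> =
    await res.json();

  // Kp runs roughly 0-9; higher means a stronger storm.
  const maxKp = Math.max(
    0,
    ...storms.flatMap((s) => (s.allKpIndex ?? []).map((k) => k.kpIndex))
  );
  return Math.min(1, 0.3 + (maxKp / 9) * 0.4); // base entropy 0.3-0.7
}
```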


When Things Went Silent

Not everything was smooth. At one point, the ghost's voice stopped working. The TTS API was returning audio, the code looked correct, but nothing came out of the speakers.

I told Kiro: "The Spirit Box is silent. I hear no output voice."

What happened next was fascinating. Instead of just suggesting random fixes, Kiro entered "debug mode" and created a systematic diagnostic plan:

  1. Check AudioContext state (suspended vs running)
  2. Log TTS API response status
  3. Add a test beep function to verify the audio graph
  4. Create a VU meter for visual feedback

Each diagnostic revealed something. The AudioContext was suspended (browser autoplay policy). The TTS was fetching correctly. The audio graph was connected. But the master gain was sitting at its default (1.0) and the context wasn't resuming after the user gesture.

Kiro added explicit logging at each step, forced the master gain to 0.8, and ensured the context resumed properly. The ghost spoke.
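
One of those diagnostics deserves a callout: the test beep, which bypasses TTS entirely and proves whether the output side of the graph works at all. A rough sketch (the function name is mine, not from the project):

```typescript
// Sketch of an "is the audio graph alive?" beep - no TTS involved.
// If even this stays silent, the problem is the graph or the context,
// not the speech synthesis - which is exactly what gave the bug away.
function testBeep(ctx: AudioContext, masterGain: GainNode): void {
  const osc = ctx.createOscillator();
  osc.frequency.value = 440; // plain A4 tone
  const gain = ctx.createGain();
  gain.gain.value = 0.2;
  osc.connect(gain).connect(masterGain); // masterGain assumed wired to ctx.destination
  osc.start();
  osc.stop(ctx.currentTime + 0.25); // quarter-second blip
}
```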

This debugging session taught me something important: Kiro doesn't just generate code - it teaches you how to think through problems systematically.


The Entropy Problem

Another challenge was tuning the "spectral energy" calculation. I was using real NASA geomagnetic storm data, which sounded cool in theory. But in practice, storms don't happen every day. My ghost was responding maybe 20% of the time, which made demos frustrating.

I explained this to Kiro: "The ghost rarely responds because the NASA data doesn't change much. Can I combine it with randomness?"

Kiro suggested a hybrid approach: use NASA data as the base entropy (30-70%), then add random variation (0-50%). This way, the app was still influenced by real cosmic data, but interactive enough for users to actually experience it.

The solution was elegant: when there are active geomagnetic storms, the ghost is more likely to appear. But even on quiet days, there's still a chance. It felt like real paranormal investigation - unpredictable but not impossible.


What I Learned About Development

Specs Aren't Bureaucracy - They're Freedom

I used to think documentation was something you did after building, if at all. My typical workflow was: start coding, hit a wall, refactor, hit another wall, refactor again, and eventually stumble into something that works. It felt creative and spontaneous.

But Kiro's spec-driven workflow completely flipped my perspective. Those 2 hours I spent writing requirements and design documents weren't overhead - they were an investment that paid back 10x. When I finally started implementing, I wasn't making it up as I went. I had a blueprint. Every component had a clear purpose. Every interface was already defined. The audio signal chain was mapped out node by node.

The freedom came from not having to make architectural decisions while also trying to remember WebAudio API syntax. I could focus entirely on implementation because the thinking was already done.

When AI Pushes Back (In a Good Way)

The most valuable moment with Kiro wasn't when it generated perfect code - it was when it disagreed with me.

As I mentioned earlier, I'd been excited about using Model Context Protocol (MCP) because the hackathon highlighted it as a showcase feature. I asked Kiro to build an MCP server for fetching NASA data and was ready to dive into that complexity.

Kiro's response surprised me: "MCP servers don't plug into each other. The client orchestrates multiple servers. And since your app runs in a browser, it can't easily connect to local MCP servers that use stdio. For your use case, direct HTTP API calls to NASA would be simpler."

I sat back and thought about it. Kiro was right. I was about to overengineer the solution just to check a box. The direct API approach was cleaner, more maintainable, and still demonstrated the "Frankenstein" concept of stitching disparate technologies together.

This wasn't code generation - this was architectural mentorship. Kiro had enough context about my project to understand what I was actually trying to achieve, not just what I was asking for.

The Ghost That Remembered

One detail that kept impressing me: Kiro never forgot who Cornelius Blackwood was.

I mentioned once, early in the requirements phase, that the ghost should be "a Victorian-era mechanic who died in 1887." From that point forward, every LLM prompt Kiro generated included that persona. Every piece of UI text maintained the character. The documentation referenced him naturally. I never had to remind Kiro about this detail.

This might seem small, but it's actually profound. Traditional code generation tools treat each prompt as isolated. You have to keep re-establishing context. But Kiro maintained a mental model of my entire project across multiple sessions, sometimes days apart. It understood not just the technical requirements, but the creative vision.

The Dance Between Structure and Spontaneity

Here's what I discovered: spec-driven development and "vibe coding" aren't opposites - they're complementary.

The spec gave me the foundation. Kiro implemented the core audio engine, the NASA integration, the LLM service, the UI components - all systematically, all working on first try. That was the structure.

But then came the refinement. I'd test the ghost's voice and say, "It's not ghostly enough." Kiro would immediately adjust - adding a low-pass filter, tweaking the reverb decay, changing the playback rate to 0.85. I'd listen and say, "Better, but the static needs more texture." Kiro would modify the noise generation algorithm.
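
Those tweaks map directly onto node parameters. Here's a hedged sketch of what the "make it ghostlier" knobs look like - apart from the 0.85 playback rate, the values are illustrative, and the impulse-response buffer for the reverb is assumed to come from elsewhere:

```typescript
// Sketch: play a decoded TTS buffer through a "haunted radio" chain.
function playGhostVoice(ctx: AudioContext, voice: AudioBuffer, impulse: AudioBuffer) {
  const source = ctx.createBufferSource();
  source.buffer = voice;
  source.playbackRate.value = 0.85; // slower = deeper, more otherworldly

  const lowpass = ctx.createBiquadFilter();
  lowpass.type = "lowpass";
  lowpass.frequency.value = 2500; // muffle the highs, like an old speaker

  const reverb = ctx.createConvolver();
  reverb.buffer = impulse; // longer decay = bigger, emptier room

  source.connect(lowpass).connect(reverb).connect(ctx.destination);
  source.start();
}
```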

This back-and-forth was fast and creative. I wasn't explaining the entire audio graph each time - Kiro already understood the architecture from the spec. I was just tuning the parameters, like adjusting knobs on a mixing board.

The best workflow turned out to be: specs for the foundation, conversation for the polish.


The Technical Stitching

The Spirit Box successfully stitches together six completely different technologies, each from a different domain:

WebAudio API - I'm generating noise from Math.random() in real-time, not playing pre-recorded files. This required understanding ScriptProcessorNode, audio buffers, and sample rates.

NASA DONKI API - Real geomagnetic storm data from space weather monitoring. I had to parse JSON responses about coronal mass ejections and convert them into "spectral entropy" values.

OpenAI GPT-4o - An AI language model with a Victorian mechanic persona. I learned how to craft system prompts that maintain character while still being helpful.

OpenAI TTS - Voice synthesis that needed to sound like it was coming through a 1970s radio. This meant understanding audio effects chains.

Audio Processing - ConvolverNode for reverb, WaveShaperNode for distortion, BiquadFilter for frequency manipulation. Each node had to be connected in the right order with proper gain staging.

Browser APIs - Speech Recognition for voice input, Canvas for the spectral radar visualization. These had to integrate seamlessly with the audio pipeline.

Getting these six pieces to work together required knowledge I didn't fully have. Audio engineering? I'm a web developer. Space weather data? I'd never touched it. But Kiro helped me navigate all of it, explaining concepts when I was confused and generating working implementations when I understood the approach.
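
As an example of two of those stitching points - the GPT-4o persona and the TTS voice - here's a rough sketch of how they can fit together. The prompt wording is illustrative, and it uses plain fetch against OpenAI's chat and speech endpoints rather than the project's actual service code (in a deployed app the key would live behind a server route, not in the browser):

```typescript
// Sketch: ask GPT-4o for Cornelius's reply, then turn it into audio with TTS.
const PERSONA = `You are Cornelius Blackwood, a Victorian-era mechanic who died
in 1887. Speak tersely, in period language, as a voice coming through static.`;

async function askCornelius(question: string, apiKey: string): Promise<ArrayBuffer> {
  const chat = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [
        { role: "system", content: PERSONA },
        { role: "user", content: question },
      ],
    }),
  });
  const reply = (await chat.json()).choices[0].message.content as string;

  const speech = await fetch("https://api.openai.com/v1/audio/speech", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({ model: "tts-1", voice: "onyx", input: reply }), // voice choice is illustrative
  });
  return speech.arrayBuffer(); // raw audio, ready for decodeAudioData + the effects chain
}
```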


The Silent Ghost Mystery

One of my favorite debugging sessions was the silent-ghost bug from earlier - it's worth walking through in more detail.

I could see the TTS API returning data. The code looked correct. The audio nodes were connected. But nothing came out of the speakers. I was frustrated and ready to start randomly changing things - my usual debugging approach.

Kiro suggested something different: "Let's add diagnostic logging at each step. First, check if the AudioContext is suspended. Then verify the TTS API response. Then add a test beep function to confirm the audio graph is alive."

I implemented each diagnostic. The AudioContext was suspended (browser autoplay policy - I'd forgotten about that). The TTS was fetching correctly. But when I triggered the test beep... silence.

More logging revealed the issue: the master gain was at 1.0, which should have been fine, but the AudioContext wasn't resuming after the user gesture. I needed to explicitly call audioContext.resume() and ensure it completed before playing audio.

Kiro helped me add proper state management: check if suspended, resume if needed, wait for the promise, then play audio. The ghost spoke.
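
The whole fix boils down to a small state check before any playback. A minimal sketch of the pattern, assuming a shared AudioContext (the function and the names in the usage comment are mine):

```typescript
// Sketch: never start playback until the AudioContext is actually running.
async function ensureRunning(ctx: AudioContext): Promise<void> {
  if (ctx.state === "suspended") {
    await ctx.resume(); // only succeeds after a user gesture (click, tap, keypress)
  }
  if (ctx.state !== "running") {
    console.warn("AudioContext still not running:", ctx.state);
  }
}

// Usage inside the click handler that triggers the ghost's reply:
//   await ensureRunning(audioContext);
//   playGhostVoice(audioContext, voiceBuffer, impulseBuffer);
```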

What struck me about this session wasn't just that I fixed the bug - it was that I understood the bug. Kiro didn't just give me a solution; it taught me how to think through audio problems systematically. That knowledge stayed with me for the rest of the project.


The Entropy Tuning Problem

Another challenge was making the "spectral energy" calculation feel right.

As covered above, I was using real NASA geomagnetic storm data, and real storms don't happen every day. My ghost was responding maybe 20% of the time. During demos, I'd ask questions and get silence. It was technically correct but experientially frustrating.

I explained this to Kiro: "The ghost rarely responds because the NASA data doesn't change much. Can I combine it with randomness without losing the cosmic connection?"

Kiro suggested a hybrid approach: use NASA data as the base entropy (30-70% depending on storm activity), then add random variation (0-50%). This way, the app was still influenced by real cosmic events, but interactive enough for users to actually experience it.
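
In code, the hybrid gate is only a few lines. A sketch of the idea - the threshold and exact numbers are illustrative:

```typescript
// Sketch of the hybrid entropy gate: cosmic base + per-question randomness.
// baseEntropy is the 0.3-0.7 value derived from the NASA DONKI data.
function ghostMayManifest(baseEntropy: number, threshold = 0.6): boolean {
  const jitter = Math.random() * 0.5;           // 0-50% random variation
  const spectralEnergy = Math.min(1, baseEntropy + jitter);
  return spectralEnergy >= threshold;           // the gate opens when energy is high enough
}
```

On a stormy day the base sits near 0.7 and the gate is open most of the time; on a quiet day it sits near 0.3 and Cornelius only shows up when the random jitter is generous.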

The solution felt elegant. When there are active geomagnetic storms, the ghost is more likely to appear - the entropy gate opens wider. But even on quiet days, there's still a chance. It felt like real paranormal investigation: unpredictable but not impossible.

This tuning process taught me something about building experiences: technical accuracy isn't always the same as good UX. Sometimes you need to bend reality a little to make the magic work.


What I'd Do Differently

If I were starting this project over, I'd create the spec first, before writing any exploratory code.

I actually started with some vibe coding - building a basic audio graph, testing TTS integration, experimenting with the UI layout. Then I realized I needed structure and created the spec. But going spec-first would have saved even more time. Those early experiments ended up being throwaway code because the spec revealed better architectural patterns.

I'd also use more agent hooks. I created one playful hook that enforces "haunted" commit messages (every commit must include a spooky emoji), but I could have automated so much more: running tests before commits, validating audio constants, generating documentation from code comments, checking for console.log statements before deployment.

And I'd add steering docs for WebAudio best practices. I relied entirely on the spec documents for context, but steering could have provided additional guidance on audio engineering patterns - things like "always connect nodes before starting oscillators" or "use exponentialRampToValueAtTime for smooth frequency changes."
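
That last rule, for example, is essentially a one-liner that avoids audible clicks when moving a filter (a hedged illustration, not an excerpt from any steering doc):

```typescript
// Smooth, click-free frequency sweep: ramp the AudioParam instead of jumping it.
const ctx = new AudioContext();
const filter = ctx.createBiquadFilter();
filter.frequency.setValueAtTime(200, ctx.currentTime);                    // start at 200 Hz
filter.frequency.exponentialRampToValueAtTime(2000, ctx.currentTime + 2); // glide to 2 kHz over 2 seconds
```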

These aren't criticisms of what I built - they're insights about how much more powerful the Kiro workflow could be with full adoption of its features.


The Frankenstein Success

The Spirit Box works. You can talk to it, and a ghost responds - but only when the cosmos allows it. The voice sounds genuinely eerie, like it's coming through a 1970s radio. The spectral radar sweeps and glows in real-time. The VU meter bounces with the audio levels.

More importantly, it demonstrates what "Frankenstein" architecture means: taking technologies that were never meant to work together and making them collaborate. Browser audio synthesis from the 2010s. NASA space weather APIs from government research. Cutting-edge AI language models. Vintage radio effects. All working as one coherent experience.

The project feels alive in a way that goes beyond the code. There's something genuinely uncanny about asking a question and not knowing if you'll get a response. The unpredictability isn't a bug - it's the feature. It makes every successful contact feel special.


The Kiro Difference

Building The Spirit Box taught me that AI-assisted development isn't about replacing developers - it's about changing how I work and what I focus on.

Instead of spending hours debugging WebAudio node connections, I spent that time thinking about user experience. Instead of manually writing boilerplate, I focused on the creative aspects - how should the ghost sound? What makes the entropy calculation feel right? How can I make the UI feel appropriately eerie?

Kiro handled the systematic parts: implementing specs, generating types, connecting components, managing state. I handled the creative parts: tuning audio effects, refining the ghost persona, polishing the UI, adjusting the entropy algorithm.

The result was 8 hours of focused, enjoyable development instead of 20-30 hours of grinding through implementation details. But more than the time savings, it was the quality of those 8 hours. I wasn't frustrated. I wasn't stuck. I was in flow state, building something weird and wonderful.


Try It Yourself

The Spirit Box is open source and deployed. Start a session, ask Cornelius Blackwood about his workshop, and see if the cosmic conditions allow him to respond. Sometimes he's there. Sometimes he's not. That's the beauty of tying digital experiences to real-world data - you never quite know what you'll get.

And if you're building something complex, try Kiro's spec-driven workflow. It feels slow at first - writing requirements before code goes against every hackathon instinct. But it's actually the fastest way to build something that works. More importantly, it's the most enjoyable way.

https://kiroween-spirit-box.vercel.app/
