This is a submission for the Built with Google Gemini: Writing Challenge
Every great project starts with a spark, but the best developers know that the learning doesn't end when the deadline hits. My recent journey as a builder has been defined by two distinct projects that pushed my boundaries: a solo deep-dive into AI security, and a collaborative team build focused on developer productivity.
Here is a look back at what I built, the roadblocks encountered, and where the code is taking me next.
What I Built with Google Gemini
Project 1: Hiding in Plain Sight (Multimodal Steganography)
My first project was a Python-based multimodal steganography application. Standard steganography conceals data in the least significant bits of an image's pixels, but if an attacker knows the algorithm, the secret is compromised. I wanted to build a system where the AI itself acts as the cryptographic key.
By integrating Gemini’s multimodal capabilities, the app requires the user to pass the "cover image" to the model. Gemini analyzes the visual context—identifying objects, mood, and specific details—to generate a dynamic, context-aware key. To retrieve the hidden message, the system requires not just the altered image, but Gemini's exact interpretation of it.
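To make the idea concrete, here is a minimal sketch of that pipeline. The helper names are hypothetical and the real app's key derivation may differ; the point is that the secret is XOR-encrypted with a key hashed from Gemini's caption of the cover image, so extraction only works if you can reproduce the model's exact interpretation.

```python
import hashlib

def derive_key(caption: str) -> bytes:
    # Hash Gemini's description of the cover image into a repeatable key.
    return hashlib.sha256(caption.encode("utf-8")).digest()

def xor_encrypt(message: bytes, key: bytes) -> bytes:
    # Repeating-key XOR: only the exact same caption recovers the message.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(message))

def embed_lsb(pixels: list[int], payload: bytes) -> list[int]:
    # Write each payload bit into the least significant bit of a pixel byte.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    out = pixels.copy()
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out

def extract_lsb(pixels: list[int], n_bytes: int) -> bytes:
    # Read the LSBs back and reassemble the payload bytes.
    return bytes(
        sum((pixels[i * 8 + j] & 1) << (7 - j) for j in range(8))
        for i in range(n_bytes)
    )
```

In the real flow the caption comes from a Gemini vision call rather than a string literal, which is exactly what makes the model's interpretation part of the key material.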
Project 2: Copilot CoLab (VS Code Extension)
While AI is incredible for security, it is equally powerful for workflow orchestration. Most recently, I teamed up with Nabil and Bhumi to build Copilot CoLab, a real-time team collaboration extension for VS Code. Developers lose countless hours context-switching between their IDE, Slack, and Jira. We brought tasks, chat, and presence directly into the editor.
As the frontend lead (while also contributing to the backend), I built the interface that ties these features together. We integrated Gemini to act as an embedded project manager. By pinging @gemini in the team chat, the model can automatically generate a full Work Breakdown Structure (WBS) for a new feature or perform AI-powered bulk task assignments to team members based on the repository's context.
Demo
You can check out the source code for both projects here:
Multimodal Steganography: https://github.com/Aman0choudhary/Project-1
Copilot CoLab: https://github.com/n4bi10p/copilot-colab
What I Learned
Between juggling my responsibilities as a college Cloud Lead and pushing through late-night study sessions for OS and PPS exams, these projects forced a massive evolution in how I write software.
Technical Breadth: The steganography app required a deep dive into Python's byte-level file manipulation. Copilot CoLab was a completely different beast: it required mastering the VS Code Webview API, bridging frontend states with extension host commands, and keeping everything synced in real-time using Supabase.
The Shift from Solo to Lead: Leading the frontend for a team meant I couldn't just build in a silo. I had to clearly communicate UI constraints to the backend, document my logic, and iterate based on Nabil and Bhumi's feedback. It taught me that code readability and clear communication are just as important as the logic itself.
The Macro vs. Micro Perspective: Building the steganography app required thinking small—literally down to the least significant bit of a single pixel. Building Copilot CoLab required thinking big—about human behavior and how teams actually communicate. Great architecture requires respecting both ends of that spectrum.
Google Gemini Feedback
The Good:
The Google AI Studio interface is phenomenal for rapid prototyping. Being able to drag and drop images and tweak my prompts for the steganography app before writing a single line of Python saved me hours of API debugging. For Copilot CoLab, the speed of the gemini-1.5-flash model was a massive win; it parsed project contexts and assigned tasks incredibly fast, making the @gemini chat feel like a truly real-time teammate.
The Friction (The Bad and the Ugly):
The biggest hurdle was forcing a generative model to act deterministically. Getting Gemini to output the exact same key format every single time for the security app—or perfectly formatted JSON for Copilot CoLab's bulk task assignment—required heavy prompt engineering. In the early stages, the model would sometimes over-explain (e.g., adding conversational fluff or wrapping outputs in markdown blocks), which completely broke our parsers. We had to learn how to aggressively constrain the prompts and implement strict JSON parsing on our end to filter out the noise.
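Our defensive parser ended up looking roughly like the sketch below (shown in Python for brevity; the function name and exact regexes are illustrative, not the production code). It strips markdown fences and conversational preamble before handing the text to a strict JSON parser, which then fails loudly on anything malformed.

```python
import json
import re

def parse_model_json(raw: str) -> dict:
    """Strip conversational fluff and markdown fences, then parse strictly."""
    text = raw.strip()
    # Remove a ```json ... ``` wrapper if the model added one.
    fence = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    # Fall back to the first {...} span in case of leading chatter.
    if not text.startswith("{"):
        brace = re.search(r"\{.*\}", text, re.DOTALL)
        if not brace:
            raise ValueError("No JSON object found in model output")
        text = brace.group(0)
    return json.loads(text)  # raises on anything still malformed
```

Pairing a guard like this with tightly constrained prompts is what finally made the bulk task assignment reliable.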
Looking Forward
Working on these tools showed me how powerful AI can be when applied to real-time human connection and secure verification. Currently, I'm conceptualizing a hyperlocal social discovery mobile app for students and professionals in Pune, focusing on matching people based on shared interests. I am already brainstorming how to implement Gemini into the backend of this new app—perhaps using multimodal logic to verify student IDs or dynamically match users based on their portfolios.
The hackathons might be over, but the builder's momentum is just getting started.