This is a submission for the Built with Google Gemini: Writing Challenge
From Manual Chaos to Workflow Engineering: Building a Local-First AI Automation Pipeline (and Rethinking Cloud LLMs Like Gemini)
What I Built with Google Gemini
Over the past few months, I've maintained a daily LeetCode practice routine. While solving problems was straightforward, maintaining documentation wasn't. Every solution required:
- Creating a new file
- Updating the README (sorted by difficulty)
- Writing structured explanations
- Formatting Markdown properly
- Pushing everything to GitHub
It was repetitive, manual, and error-prone.
To solve this, I built LeetCode AutoSync, a CLI automation tool that streamlines the entire workflow.
GitHub Repo:
https://github.com/micheal000010000-hub/LEETCODE-AUTOSYNC
My GitHub Profile:
https://github.com/micheal000010000-hub
Blog Reflection:
https://dev.to/micheal_angelo_41cea4e81a/from-manual-chaos-to-workflow-engineering-automating-leetcode-with-ai-14n7
What the Tool Does
- Adds new solutions locally with automatic file structuring
- Updates and sorts README sections by difficulty
- Generates structured solution write-ups using an LLM
- Logs token usage and performance metrics to Excel
- Manages background inference using a producer-consumer queue
- Gracefully shuts down the model to free system resources
Where Gemini Fits In
While building this, I explored cloud-based LLM solutions like Google Gemini. I was particularly interested in:
- Structured content generation
- Reliable latency
- Deployment potential via Cloud Run
- API-based inference scalability
However, due to cost considerations and experimentation goals, I opted to implement a local-first architecture using Ollama and Mistral.
That decision became part of the learning journey in its own right, forcing me to weigh the tradeoffs between:
- Local inference vs cloud APIs
- Cost vs scalability
- Latency vs resource consumption
- Infrastructure ownership vs convenience
Although this version of the project uses a locally hosted model, the architectural design was influenced by how cloud LLM APIs (like Gemini) structure prompts and manage inference workflows.
Demo
Here's a short walkthrough of the CLI automation tool in action:
https://www.loom.com/share/589028a173444af191f4788ff7f25a42
This project is a local-first CLI automation system designed to streamline my LeetCode workflow.
The demo shows:
- Adding a new solution interactively through the CLI
- Queue-based background LLM generation using a producer-consumer model
- Automatic README updates sorted by difficulty
- Structured Markdown solution generation
- Token usage and performance logging to Excel
- Thread-safe state management
- Graceful model shutdown after queue completion
The architecture includes:
- A background worker thread handling inference
- A synchronized task queue
- Local LLM inference via Ollama
- Structured file system updates
- Git integration for repository management
- Telemetry logging for token and performance metrics
Source code and implementation details are available here:
https://github.com/micheal000010000-hub/LEETCODE-AUTOSYNC
What I Learned
This project taught me far more than just automation.
1. Concurrency in Python
I implemented a producer-consumer queue pattern:
- Main thread enqueues generation tasks
- Background worker processes LLM calls
- Thread-safe tracking using locks
- Graceful shutdown logic to avoid race conditions
This helped me deeply understand synchronization and state management.
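A minimal sketch of that pattern, using only the standard library. Names like `generate_writeup` are illustrative stand-ins, not the project's actual identifiers:

```python
import queue
import threading

task_queue = queue.Queue()
results = []
results_lock = threading.Lock()
SENTINEL = None  # enqueued last, tells the worker to exit


def generate_writeup(problem: str) -> str:
    # Placeholder for the local LLM call (e.g. via Ollama).
    return f"## Solution notes for {problem}"


def worker():
    while True:
        task = task_queue.get()
        if task is SENTINEL:
            task_queue.task_done()
            break
        writeup = generate_writeup(task)
        with results_lock:          # thread-safe state tracking
            results.append(writeup)
        task_queue.task_done()


t = threading.Thread(target=worker, daemon=True)
t.start()

# Main thread is the producer: enqueue generation tasks.
for problem in ["two-sum", "valid-parentheses"]:
    task_queue.put(problem)

task_queue.put(SENTINEL)  # graceful shutdown signal
task_queue.join()         # block until every task is marked done
t.join()
```

The sentinel-plus-`join()` combination is what avoids the race condition: the worker only exits after draining the queue, and the main thread only proceeds once every task has been acknowledged.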
2. Resource Lifecycle Management
Running local LLMs is expensive in terms of RAM.
I implemented:
- Queue monitoring
- Automatic shutdown when queue empties
- Explicit model stop calls to free memory
This shifted my mindset toward infrastructure responsibility.
3. Telemetry & Observability
I extended the system to log:
- Prompt tokens
- Response tokens
- Total tokens
- Load duration
- Generation duration
- Tokens per second
This gave me insight into:
- Cold vs warm model loads
- Throughput efficiency
- Cost proxies for API alternatives
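Ollama's generate response reports fields such as `prompt_eval_count`, `eval_count`, `load_duration`, and `eval_duration` (durations in nanoseconds), which map directly onto the metrics above. The sketch below derives one telemetry row from those values; it appends to CSV to stay stdlib-only, whereas the actual tool writes to Excel:

```python
import csv
import pathlib

FIELDS = ["prompt_tokens", "response_tokens", "total_tokens",
          "load_duration_s", "generation_duration_s", "tokens_per_second"]


def telemetry_row(prompt_tokens: int, response_tokens: int,
                  load_ns: int, gen_ns: int) -> dict:
    """Build one telemetry record from Ollama-style counters."""
    gen_s = gen_ns / 1e9
    return {
        "prompt_tokens": prompt_tokens,
        "response_tokens": response_tokens,
        "total_tokens": prompt_tokens + response_tokens,
        "load_duration_s": load_ns / 1e9,       # cold vs warm shows up here
        "generation_duration_s": gen_s,
        "tokens_per_second": response_tokens / gen_s if gen_s else 0.0,
    }


def append_row(path: str, row: dict):
    # Write the header only on first use, then append rows.
    is_new = not pathlib.Path(path).exists()
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A long `load_duration_s` on the first request followed by near-zero values afterwards is exactly the cold-vs-warm load distinction mentioned above, and tokens-per-second works as a rough cost proxy when comparing against per-token API pricing.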
4. Tradeoffs: Local vs Cloud
Using a local model made me appreciate what cloud services like Gemini abstract away:
- No need to manage memory
- No manual lifecycle control
- Potentially better consistency
- Scalable deployment options
At the same time, local inference gave me:
- Cost control
- Full ownership
- Deep visibility into model behavior
That tradeoff analysis was one of the most valuable parts of this project.
Google Gemini Feedback
Although I did not integrate Gemini directly into this tool, I explored its ecosystem and documentation while evaluating architectural options.
What stands out:
- Clean API design
- Strong structured output capability
- Integration potential with Cloud Run
- Reduced infrastructure overhead
Where I would want more:
- Transparent token usage insights at a granular level
- Clear cost comparison tooling
- More documentation around performance benchmarking
If I extend this project in the future, I would experiment with:
- Swapping local inference with Gemini API
- Measuring performance differences
- Deploying the tool as a cloud-based service
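One way to keep that swap cheap is a thin interface between the pipeline and the model. The sketch below is hypothetical (class and function names are mine, not code from the repo), with the real calls stubbed out:

```python
from typing import Protocol


class LLMBackend(Protocol):
    def generate(self, prompt: str) -> str: ...


class OllamaBackend:
    """Local backend; the real call would POST to Ollama's HTTP API
    (/api/generate on localhost:11434)."""
    def generate(self, prompt: str) -> str:
        raise NotImplementedError("requires a running Ollama server")


class GeminiBackend:
    """Cloud backend; with the google-generativeai SDK this would be
    roughly genai.GenerativeModel(...).generate_content(prompt).text."""
    def generate(self, prompt: str) -> str:
        raise NotImplementedError("requires a Gemini API key")


def write_solution_doc(backend: LLMBackend, problem: str) -> str:
    # The pipeline depends only on this interface, so swapping local
    # inference for Gemini becomes a one-line change at the call site.
    return backend.generate(f"Explain the solution to {problem}")
```

The same seam also makes the performance comparison straightforward: run the identical prompt set through both backends and diff the telemetry rows.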
What's Next
This project started as workflow automation, but it evolved into something deeper:
- Understanding LLM systems design
- Learning concurrency patterns
- Measuring performance metrics
- Thinking like an infrastructure engineer
My next goal is to:
- Add multi-model comparison support
- Explore deployment options
- Integrate structured logging
- Experiment with cloud-hosted inference services
This project transformed my LeetCode practice from manual chaos into an engineered workflow, and, more importantly, it transformed how I think about AI systems.
Thanks for reading!