Quick note: This article was first published on my Substack publication -> VSL
I had three browser tabs open: AI Studio, ChatGPT, and my code. Copy, paste, refresh, repeat. I was coding by committee and learning nothing.
Every time I needed help debugging, I'd switch tabs, explain the problem, get code back, paste it into VS Code, test it, find new bugs, then go back to the chat. I spent more time switching contexts than actually building.
Right now, vibe coders are stuck between two bad options:
- pick an expensive platform like Lovable that charges $25-$200/month and forces you to use its code editor;
- juggle separate AI interfaces that don't integrate with where you actually work.
Today I’ll show you a third, better way.
How I Started Coding Features In Yahini
I'd write a prompt in ChatGPT and Gemini, get code, copy it, paste into VS Code, test, find bugs, go back to the chat, explain the error, get new code, repeat. I'd spend 15 minutes just switching between tabs.
I even tried self-hosting an AI wrapper like Chatbot UI, one of those open-source chat interfaces you connect your API keys to so you can use multiple models in the same browser tab.
Spent a Saturday setting up Supabase, configuring environment variables, deploying to Cloudflare.
Got it working, had my own private ChatGPT/Sonnet/Gemini/Grok basically. But it was still a separate tab. Still copy-pasting code between the chat and VS Code. And now I had to maintain it. Database migrations, updates, auth issues. Gave up after two weeks because it didn't actually solve the problem.
These approaches keep AI separate from your code. With platforms like Lovable or Bolt, you don't even get to choose your AI model. They pick for you, mark it up, and you pay whatever they decide.
BYOK Changes Everything
Kilo Code is an open-source VS Code extension with specialized modes:
- Architect (plan before you code),
- Code (generate features),
- Ask (understand existing code),
- Debug (fix issues).
Instead of asking ChatGPT in another tab, Kilo Code sits next to your code like a pair programming partner. It can automatically reference files, use MCP servers and compress the context when it becomes too large.
But nothing beats the bring-your-own-keys (BYOK) functionality.
I use OpenRouter with Google Vertex credits. I have one API key in Kilo Code that points to OpenRouter. Then inside OpenRouter, I add all my provider keys. Anthropic, OpenAI, Google, whatever.
I create different model configs in OpenRouter and switch between them based on task complexity.
Simple task? Cheaper model. Complex refactor? Claude Sonnet. All through one key.
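To make that concrete, here's a rough sketch of what a single OpenRouter-backed call looks like under the hood. The endpoint is OpenRouter's OpenAI-compatible chat completions API; the model slugs and the "complex vs. simple" switch are just my illustration, not Kilo Code's internal routing.

```javascript
// A minimal sketch of the BYOK idea: one OpenRouter key, different model per task.
// Model slugs and the complexity switch are illustrative assumptions,
// not Kilo Code's internal logic.
const OPENROUTER_KEY = process.env.OPENROUTER_API_KEY

async function ask(prompt, { complex = false } = {}) {
  // Cheaper model for simple tasks, stronger model for complex refactors
  const model = complex
    ? 'anthropic/claude-sonnet-4.5'   // assumed slug, check openrouter.ai/models
    : 'google/gemini-2.5-flash'       // assumed slug, check openrouter.ai/models

  const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${OPENROUTER_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }] })
  })

  const data = await res.json()
  return data.choices[0].message.content
}

// Usage: await ask('Refactor this function to use async/await', { complex: true })
```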
I also added a couple hundred dollars in OpenRouter credits as a backup in case I run out of credits on other platforms. These credits also let me test new models like Grok Code Fast 1 without depositing funds with individual providers.
Now think about this: Cursor and Lovable charge anywhere from $20 to thousands of dollars per month.
With Kilo Code, I pay $0 for the tool (it's open source). Then I only pay for whichever AI model I use at provider rates.
Last month: $119 in API calls for 117 million tokens.
Disclaimer: This includes some of Yahini’s usage besides coding. If I had to approximate, I’d say I spent ~$50 on coding, which includes a complete migration of the Yahini.io website from Remix to Astro.
Building an API Endpoint in 2 Hours
I needed an API endpoint for Yahini's lead magnet downloads. Captures email, queues it for processing, triggers email sequence. Built with Hono and Cloudflare Workers.
I opened Architect mode and said: "I need an API endpoint that handles lead magnet downloads with email capture and queue processing."
Architect mode (powered by Sonnet 4.5) created a to-do list:
- Set up POST endpoint with Hono routing;
- Add anti-spam and rate limiting middleware;
- Add request validation (email, resource ID);
- Queue message for email worker;
- Batch processing with delays;
- Error handling and retries.
Then I switched to Code mode. It followed the to-do list Architect created to complete each step. The Cloudflare documentation MCP was also a huge time saver. I used it to give Kilo Code access to all the Workers docs, queue setup, and batch processing patterns, making sure the AI wouldn’t hallucinate too much.
Here's the simplified version of the worker:
```javascript
import { Hono } from 'hono'
import { antiSpam } from './middleware/antiSpam'
import { rateLimiter } from './middleware/rateLimiter'

const app = new Hono()
const api = new Hono()

// Apply anti-spam and rate limiting middleware
api.use('/download-resource', antiSpam)
api.use('/download-resource', rateLimiter)

api.post('/download-resource', async (c) => {
  const { email, resourceId } = await c.req.json()

  // Validation
  if (!email || !resourceId) {
    return c.json({ error: 'Missing required fields' }, 400)
  }

  // Queue email for processing
  await c.env.EMAIL_QUEUE.send({
    to: email,
    subject: 'Your Resource is Ready',
    templateData: { resourceId },
    requestId: crypto.randomUUID()
  })

  return c.json({ success: true })
})

app.route('/api', api)

export default {
  fetch: app.fetch,

  // Queue consumer: processes each batched message with a small delay between sends
  async queue(batch, env) {
    console.log(`Received batch of ${batch.messages.length} messages`)
    const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms))

    for (const message of batch.messages) {
      await delay(50)
      try {
        await handleSendEmail(message.body, env)
        message.ack()
      } catch (error) {
        console.error('Failed. Retrying:', error.message)
        message.retry()
        throw error
      }
    }
  }
}
```
Validation, anti-spam, rate limiting, and queue processing. Nothing fancy, but production-ready.
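The antiSpam and rateLimiter middleware aren't shown above. If you're wondering what one could look like, here's a minimal sketch of a rate limiter written as Hono middleware on top of a KV namespace. The RATE_LIMIT_KV binding, the limits, and the key scheme are assumptions for illustration, not the actual Yahini code.

```javascript
// Minimal sketch of a rate limiter as Hono middleware.
// The RATE_LIMIT_KV binding, window, and limit are placeholders for illustration.
export const rateLimiter = async (c, next) => {
  const ip = c.req.header('cf-connecting-ip') || 'unknown'
  const key = `rl:${ip}`
  const limit = 10          // max requests per window
  const windowSeconds = 60  // window length

  // KV keeps a simple counter per IP that expires with the window
  const count = parseInt((await c.env.RATE_LIMIT_KV.get(key)) || '0', 10)
  if (count >= limit) {
    return c.json({ error: 'Too many requests' }, 429)
  }

  await c.env.RATE_LIMIT_KV.put(key, String(count + 1), { expirationTtl: windowSeconds })
  await next()
}
```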
Took 2 hours total to have a fully functional worker. I ended up integrating it into my dedicated Brevo Worker to keep things organized. And Kilo Code helped with that too.
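Same goes for handleSendEmail, which the queue consumer calls but I didn't paste. A rough sketch of that helper, assuming Brevo's v3 transactional email API, could look like this (template ID, env var names, and payload shape are placeholders, not my actual Brevo Worker):

```javascript
// Rough sketch of the queue consumer's email helper, assuming Brevo's
// v3 transactional email API. Template ID, env var names, and payload
// shape are placeholders, not the actual Yahini/Brevo Worker code.
async function handleSendEmail(message, env) {
  const res = await fetch('https://api.brevo.com/v3/smtp/email', {
    method: 'POST',
    headers: {
      'api-key': env.BREVO_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      to: [{ email: message.to }],
      templateId: 42,               // placeholder Brevo template
      params: message.templateData  // e.g. { resourceId }
    })
  })

  if (!res.ok) {
    // Let the queue consumer catch this and retry the message
    throw new Error(`Brevo responded with ${res.status}`)
  }
}
```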
Why Kilo Code Beats Platform Lock-In
The platform trap is real. Lovable charges $25/month minimum for 100 credits, but here's the catch: those credits burn fast. A simple "Add authentication" prompt costs 1.2 credits. Building a landing page? 2 credits. You're looking at 50-80 prompts per month before you need to upgrade.
Bolt isn't better. Their Pro plan starts at $20/month for 10 million tokens, but developers report burning through 7-12 million tokens just fixing simple errors. One user lost 1.3 million tokens in a single day. And those tokens don't roll over.
But the real problem isn't the cost. It's that you don't own anything.
With Lovable or Bolt, you're building inside their editor. Sure, Lovable added GitHub sync (which is huge), but you're still paying a monthly subscription just to access the platform. When credits run out mid-build, your work stops until you top up. When a new model drops that's perfect for your use case, you can't switch to it unless they support it.
And here's the part nobody talks about: Lovable Cloud. That's their backend service, powered by Supabase. It has completely separate billing. You get $25 free per month, but once your app needs real data storage, authentication, or file uploads, that meter starts running. People build prototypes thinking they're spending $25/month, then their app gets traction and suddenly they're getting surprise bills for backend usage they didn't budget for.
Kilo Code separates the tool from the cost.
The extension is free. Open source. You install it in VS Code (also free) or even in Cursor if you're already paying for that. Then you bring your own API keys.
Here's what that means in practice:
- Want to use Sonnet 4.5 for complex refactors? Pay Anthropic's API rates directly.
- Found a free model that's great for simple tasks? Use that.
- Got Google Vertex credits? Code for free until they run out.
- New model drops? Switch to it immediately.
No subscription. No credits that expire. No separate backend billing. Just pay for the AI you actually use, at the rates the providers actually charge.
Even if you're already paying for Cursor ($20/month), adding Kilo Code means you control which models you use and when. Cursor's credits burn fast when you're in Max Mode. With Kilo Code, you decide if a task needs an expensive model or if a cheap one will do. You're not locked into their usage tiers.
Quick Note on Credits
I'm running Kilo Code with free Google Vertex credits routed through OpenRouter, so I'm basically coding for free until those run out. I'll write a full post on how to set that up.
For now, just know BYOK means you can use whatever credits or pricing you want. Free model for learning? Go for it. Paid premium model for production? Your choice. You aren't locked into anyone's credit system.
This Week's Discovery
The Cloudflare Documentation MCP made building that endpoint smooth. Instead of opening docs in another tab, Kilo Code had all the queue syntax, Worker config, and routing info right there. Saved me 30 minutes of tab-switching. If you're building on Cloudflare, this MCP is essential.
https://github.com/cloudflare/mcp-server-cloudflare/tree/main/apps/docs-vectorize
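If you want to wire it up, Kilo Code lets you add MCP servers through its MCP settings (a JSON file). Something along these lines should work for the hosted docs server, but treat the URL and config keys as assumptions and double-check the repo's README:

```json
{
  "mcpServers": {
    "cloudflare-docs": {
      "url": "https://docs.mcp.cloudflare.com/sse"
    }
  }
}
```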
Let's Connect
What's your setup? Juggling AI chat tabs? Paying for Cursor or stuck in Lovable? Reply and tell me.
Want to try Kilo Code with BYOK? GitHub repo: https://github.com/Kilo-Org/kilocode. If you want the full OpenRouter + free credits walkthrough, let me know. I'll write that up in an upcoming edition.
Own your tools. Control your costs. That's how you stop paying rent on your own work.
Subscribe to my Substack publication VSL and follow my journey as I show how I build Yahini and what tools I use!

Top comments (2)
Really insightful breakdown — especially the part about “coding by committee” and context switching. The BYOK + VS Code integration angle is a game-changer, because it shifts AI from being a separate assistant to an actual pair-programming partner inside your workflow. Your cost transparency and explanation of platform lock-in was refreshing too. Loved the Architect → Code workflow example — it makes the benefits very tangible. 🙌
Thanks a lot! Glad you appreciate the post. Do give Kilo Code a try; you'll be really surprised at its capabilities.
Pro tip: it works even better with MCP servers like Context7 and EXA.