Rizwanul Islam

Posted on • Originally published at portfolio-rizwanul.vercel.app

How I Built a Production AI Chatbot (That Actually Handles Complexity)

The 3 AM Problem
It was 3 AM when I received the notification: another customer had abandoned their booking on Gaari.

The pattern was becoming frustratingly familiar. Users would browse our car rental platform, find a vehicle they liked, and then... stop. Our analytics showed they were getting stuck at the same points—questions about insurance, pickup locations, driver requirements.

The solution seemed obvious: a simple chatbot. But most "tutorials" build toy bots. I needed something that could handle:

- **Context:** remembering previous messages.
- **Real data:** querying live car inventory.
- **Scale:** handling thousands of users without bankrupting me.
That's when I decided to build Gaariwala—an AI chatbot that now handles over 80% of our customer queries without human intervention.

The Architecture
Part 1: The Foundation (Next.js & Vercel AI SDK)
I used the Vercel AI SDK because handling streaming responses manually is a nightmare.

```typescript
// app/api/chat/route.ts
import { openai } from '@ai-sdk/openai'
import { streamText } from 'ai'

export const runtime = 'edge'

export async function POST(request: Request) {
  const { messages } = await request.json()

  const result = await streamText({
    model: openai('gpt-4-turbo'),
    system: 'You are Gaariwala, a helpful assistant for Gaari...',
    messages,
  })

  return result.toDataStreamResponse()
}
```
Part 2: Making It Smart (RAG)
The critical piece isn't the AI—it's the data. I built a RAG (Retrieval Augmented Generation) pipeline that checks the user's query against our Supabase database before answering.

If a user asks "Do you have SUVs?", the system:

1. Detects the intent (`vehicle_search`).
2. Queries the `vehicles` table in Supabase.
3. Injects the available car list into the system prompt.
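The three steps above can be sketched in TypeScript. This is a minimal illustration, not Gaari's actual code: the function names (`detectIntent`, `buildSystemPrompt`), the keyword-based intent check, and the in-memory inventory standing in for the Supabase `vehicles` query are all my assumptions.

```typescript
type Vehicle = { name: string; type: string; pricePerDay: number }

// Step 1: detect intent with a simple keyword check. A real system
// might use a classifier or a cheap LLM call instead.
function detectIntent(query: string): 'vehicle_search' | 'general' {
  const keywords = ['suv', 'sedan', 'hatchback', 'car', 'vehicle']
  const q = query.toLowerCase()
  return keywords.some((k) => q.includes(k)) ? 'vehicle_search' : 'general'
}

// Step 2: hypothetical inventory. In production this would be a
// Supabase query such as:
//   supabase.from('vehicles').select('*').eq('available', true)
const inventory: Vehicle[] = [
  { name: 'Toyota RAV4', type: 'SUV', pricePerDay: 5500 },
  { name: 'Honda City', type: 'Sedan', pricePerDay: 3500 },
]

// Step 3: inject the retrieved rows into the system prompt so the
// model answers from live data instead of guessing.
function buildSystemPrompt(query: string): string {
  const base = 'You are Gaariwala, a helpful assistant for Gaari.'
  if (detectIntent(query) !== 'vehicle_search') return base
  const context = inventory
    .map((v) => `${v.name} (${v.type}), ${v.pricePerDay}/day`)
    .join('\n')
  return `${base}\nAvailable vehicles:\n${context}`
}
```

The key design point is that retrieval happens before the LLM call: the model only ever sees inventory that actually exists, which keeps it from hallucinating cars we don't have.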
Part 3: Production Optimization
We quickly realized GPT-4 is expensive. I implemented a router:

- Simple queries ("What are your hours?") → GPT-3.5 Turbo
- Complex queries ("Compare the sedan vs SUV for a trip to Sylhet") → GPT-4 Turbo
This reduced our AI costs by 70%.
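A router like this can be as simple as a heuristic in front of the model call. Here is a sketch under my own assumptions; the signal words and length threshold are illustrative, not the classifier the article's system actually uses.

```typescript
// Hypothetical cost router: send cheap, factual questions to
// GPT-3.5 and reasoning-heavy ones to GPT-4.
function pickModel(query: string): 'gpt-3.5-turbo' | 'gpt-4-turbo' {
  const complexSignals = ['compare', 'vs', 'recommend', 'plan', 'best']
  const q = query.toLowerCase()
  const isComplex =
    q.split(/\s+/).length > 12 || // long, multi-part questions
    complexSignals.some((s) => q.includes(s)) // comparison/advice words
  return isComplex ? 'gpt-4-turbo' : 'gpt-3.5-turbo'
}
```

The chosen model name can then be passed straight into the `streamText` call shown earlier. Even a crude router pays off because the bulk of support traffic is simple FAQ-style queries.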

The Results
- **80% reduction** in support tickets
- **3x faster** response time (instant vs 10+ minutes)
- **24/7 availability** without additional staff
I build systems that scale.

If you enjoyed this architectural deep dive, check out my "Digital HQ" where I showcase real-world production systems like Gaari and The Trail.

🌐 See the Architecture: https://portfolio-rizwanul.vercel.app/
