Daniel Nwaneri

I Built an AI Chatbot Wrong (And What I Learned About Cloudflare's AI Search)

Last week, I spent over two hours helping a client build an AI-powered chatbot for their wellness e-commerce site. I set up a sophisticated RAG system using Cloudflare Vectorize, wrote custom vectorization scripts, and carefully configured Workers AI bindings.

The client was pleased with the work. Everything functioned perfectly. Then, fifteen minutes after our session ended, he messaged me: "I just built the same thing using AI Search in 15 minutes."

I had over-engineered a solution that could have been 10x simpler. Here's what happened, what I learned, and how to choose the right approach for your project.

The Client's Problem

My client runs an e-commerce site selling wellness teas and supplements. He was spending 2-3 hours daily answering the same questions:

  • "What are the ingredients in LuluTox Detox Tea?"
  • "Will TeaBurn help with weight loss?"
  • "Are there scientific studies supporting these claims?"

He needed an AI chatbot that could:

  1. Answer questions 24/7 automatically
  2. Reference specific product information accurately
  3. Cite ingredients and scientific studies
  4. Handle hundreds of queries without his involvement

The goal: Reduce customer support time by 70% while maintaining answer quality.

My Approach: Manual RAG with Vectorize

I immediately thought: "This is a perfect RAG use case." I designed a system using:

  • Cloudflare Vectorize for vector storage
  • Workers AI for embedding generation
  • OpenAI GPT-3.5 for response generation
  • Custom Worker to orchestrate everything

The Implementation

Step 1: Create the Vectorize Index

npx wrangler vectorize create wellness-products \
  --dimensions=768 \
  --metric=cosine

Step 2: Build Data Loading Script

export default {
  async fetch(request, env) {
    const products = [
      {
        id: "lulutox-detox-tea",
        text: "Product description with ingredients..."
      }
      // ... more products
    ];

    for (const product of products) {
      // Generate embedding
      const embedding = await env.AI.run(
        '@cf/baai/bge-base-en-v1.5',
        { text: [product.text] }
      );

      // Insert into Vectorize
      await env.VECTORIZE_INDEX.insert([{
        id: product.id,
        values: embedding.data[0],
        metadata: { text: product.text }
      }]);
    }

    return new Response("Data loaded");
  }
};

Step 3: Configure Bindings

# wrangler.toml
[[vectorize]]
binding = "VECTORIZE_INDEX"
index_name = "wellness-products"

[ai]
binding = "AI"

Step 4: Build Query Logic

// Assumes `userQuery` holds the user's question and `openai` is an
// initialized OpenAI client instance

// Generate query embedding
const queryEmbedding = await env.AI.run(
  '@cf/baai/bge-base-en-v1.5',
  { text: [userQuery] }
);

// Search Vectorize for the closest matches
const matches = await env.VECTORIZE_INDEX.query(
  queryEmbedding.data[0],
  { topK: 3, returnMetadata: true }
);

// Build context from the retrieved product descriptions
const context = matches.matches
  .map(m => m.metadata.text)
  .join('\n\n');

// Send to OpenAI with context
const response = await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  messages: [{
    role: "user",
    content: `Context: ${context}\n\nQuestion: ${userQuery}`
  }]
});
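Putting the four steps together, the whole manual-RAG flow fits in one Worker fetch handler. This is a sketch rather than the exact code I shipped: the helper names (`buildContext`, `buildPrompt`) and the response shape are illustrative, and it assumes `openai` is an initialized OpenAI client.

```javascript
// Sketch of the full manual-RAG request flow: embed the query,
// retrieve from Vectorize, then answer with the retrieved context.
// `openai` is assumed to be an initialized OpenAI client instance.

// Join the retrieved product descriptions into one grounding context.
function buildContext(matches) {
  return matches.map((m) => m.metadata.text).join('\n\n');
}

// Combine retrieved context and the user's question into one prompt.
function buildPrompt(context, userQuery) {
  return `Context: ${context}\n\nQuestion: ${userQuery}`;
}

const worker = {
  async fetch(request, env) {
    const { query } = await request.json();

    // Step 4a: embed the user's question with Workers AI.
    const embedding = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
      text: [query],
    });

    // Step 4b: find the closest product vectors.
    const result = await env.VECTORIZE_INDEX.query(embedding.data[0], {
      topK: 3,
      returnMetadata: true,
    });

    // Step 4c: generate a grounded answer.
    const completion = await openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [
        { role: 'user', content: buildPrompt(buildContext(result.matches), query) },
      ],
    });

    return Response.json({ answer: completion.choices[0].message.content });
  },
};
```

Every request pays for two model calls (embedding plus completion) and a vector search, which is part of why this path costs more to run and maintain.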

Time Investment

  • Research and planning: 1 hour
  • Implementation: 2 hours
  • Debugging and testing: 1 hour
  • Total: 4 hours of development time

The Result

✅ Worked perfectly

✅ Full control over retrieval logic

✅ Could use any LLM (OpenAI, Claude, etc.)

✅ Highly customizable

❌ Complex for a simple use case

❌ More code to maintain

❌ Higher development cost

What the Client Found: AI Search

While I was writing documentation, my client was researching. He discovered Cloudflare's AI Search feature and rebuilt the entire system himself in 15 minutes.

How AI Search Works

AI Search is Cloudflare's auto-RAG solution. Instead of manually orchestrating embeddings, vector search, and LLM calls, it handles everything in a single API call.

The Complete Implementation:

export default {
  async fetch(request, env) {
    const { query } = await request.json();

    const response = await env.AI.run(
      '@cf/meta/llama-3.1-8b-instruct',
      {
        messages: [
          { role: "user", content: query }
        ],
        search: {
          index_name: "wellness-products"
        }
      }
    );

    return Response.json(response);
  }
};

That's it. Around 30 lines of code total.

How to Set Up AI Search

1. Upload documents to Vectorize:

# Prepare your data as JSON
{
  "documents": [
    {
      "id": "product-1",
      "text": "Your product description..."
    }
  ]
}

# Upload via API or dashboard
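One concrete path for the upload step is wrangler's bulk insert from an NDJSON file. A caveat worth hedging: `vectorize insert` takes pre-computed vectors, not raw text, so the `values` below are placeholders standing in for real 768-dimension embeddings (e.g. generated with the loading script from the manual approach).

```shell
# Sketch: bulk upload via wrangler. Embedding values are placeholders;
# real ones come from an embedding model such as bge-base-en-v1.5.
cat > vectors.ndjson <<'EOF'
{"id": "lulutox-detox-tea", "values": [0.1, 0.2, 0.3], "metadata": {"text": "Product description with ingredients..."}}
EOF

# Upload the file into the index (requires an authenticated wrangler):
# npx wrangler vectorize insert wellness-products --file=vectors.ndjson
```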

2. Create Worker with AI Search:
The code above is the complete implementation.

3. Deploy:

npx wrangler deploy

Time Investment

  • Setup: 10 minutes
  • Testing: 5 minutes
  • Total: 15 minutes

The Result

✅ Worked perfectly

✅ Minimal code (30 lines)

✅ Built-in optimization

✅ Low maintenance

⚠️ Less control over retrieval

⚠️ Locked to Workers AI models

The Comparison

| Feature | Manual RAG | AI Search |
| --- | --- | --- |
| Development time | 4+ hours | 15 minutes |
| Code complexity | High | Low |
| LLM choice | Any (OpenAI, Claude, etc.) | Workers AI only |
| Context control | Full control | Automatic |
| Maintenance | Manual updates needed | Handled by Cloudflare |
| Best for | Complex use cases | Simple Q&A |

When to Use Manual RAG (Vectorize)

After this experience, I've identified when manual RAG is the right choice:

1. You Need a Specific LLM

If you must use GPT-4, Claude, or a specialized model, manual RAG is your only option.

Example: Legal tech requiring Claude's longer context window.

2. Complex Retrieval Logic

When you need custom scoring, multi-stage retrieval, or metadata filtering beyond basic search.

Example: Multi-tenant SaaS where each user sees only their data, requiring complex filtering.
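To make the multi-tenant case concrete, a per-tenant query can lean on Vectorize's metadata filtering. The `tenant_id` field name and the helper below are my own invention, sketched under the assumption that each vector was inserted with that metadata field.

```javascript
// Hypothetical helper: restrict a Vectorize query to one tenant's vectors
// using a metadata filter. Assumes vectors were inserted with a
// `tenant_id` metadata field.
function tenantQueryOptions(tenantId, topK = 3) {
  return {
    topK,
    returnMetadata: true,
    // Only vectors whose metadata.tenant_id equals tenantId are searched.
    filter: { tenant_id: { $eq: tenantId } },
  };
}

// Usage inside a Worker (sketch):
// const matches = await env.VECTORIZE_INDEX.query(vec, tenantQueryOptions('acme'));
```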

3. Advanced Use Cases

  • Real-time learning systems that update frequently
  • Hybrid search combining vector and keyword search
  • Custom embedding models for specialized domains
  • Performance optimization requirements
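As a sketch of the hybrid-search idea from the list above: re-rank vector matches with a crude keyword-overlap score. The 70/30 weighting and the scoring function are arbitrary illustrations, not a production ranker.

```javascript
// Toy hybrid ranking: blend Vectorize's similarity score with a simple
// keyword-overlap score. Weights and scoring are illustrative only.
function keywordScore(query, text) {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const hay = text.toLowerCase();
  return words.filter((w) => hay.includes(w)).length / words.length;
}

function hybridRank(query, matches, vectorWeight = 0.7) {
  return matches
    .map((m) => ({
      ...m,
      hybridScore:
        vectorWeight * m.score +
        (1 - vectorWeight) * keywordScore(query, m.metadata.text),
    }))
    .sort((a, b) => b.hybridScore - a.hybridScore);
}
```

The payoff is that an exact product-name hit can outrank a semantically close but wrong product — exactly the kind of control AI Search does not expose.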

4. Compliance Requirements

When you need full control over data handling, storage, and processing for regulatory compliance.

Example: Healthcare applications with HIPAA requirements.

When to Use AI Search

AI Search is ideal for most common chatbot scenarios:

1. Simple Q&A Systems

  • Product support
  • Documentation search
  • FAQ automation
  • Customer service bots

My client's use case fit perfectly here.

2. Fast Development Needs

  • MVP/prototype
  • Tight deadlines
  • Limited resources
  • Proof of concept

3. Workers AI is Sufficient

When Llama 3.1 or other Workers AI models meet your quality requirements.

4. Small Teams

When you want to focus on business logic instead of infrastructure maintenance.

What I Should Have Done

Looking back, here's my mistake: I never asked the right questions.

The Questions I Should Have Asked:

  1. "Do you need to use a specific LLM, or is any capable model fine?"

    • His answer: "Any model that works"
    • This alone should have pointed me to AI Search
  2. "How complex are your retrieval needs?"

    • His answer: "Just find relevant product info"
    • Simple retrieval = AI Search
  3. "Speed to market or maximum flexibility?"

    • His answer: "I need this working ASAP"
    • Speed = AI Search
  4. "What's your technical team size?"

    • His answer: "Just me"
    • Small team = AI Search

Better Discovery Process

1. Understand the business problem
2. Assess technical constraints
3. Present multiple solutions with trade-offs
4. Let client choose based on their priorities
5. Implement the simplest solution that works

The Cost Analysis

Development Cost

  • Manual RAG: 4 hours × $50/hr = $200
  • AI Search: 15 min × $50/hr = $12.50

Client Paid Me

  • Actual: $45 for manual implementation

What He Should Have Paid

  • If I'd recommended AI Search: $20-30 for guidance

Lessons Learned

1. Start Simple

Use the simplest solution that solves the problem. You can always add complexity later if needed.

Before: "This needs RAG, so I'll build custom everything"

After: "Does AI Search solve this? If yes, use it. If not, then custom."

2. Stay Current with Platform Features

Cloudflare ships new features constantly. I knew about Vectorize but hadn't kept up with AI Search.

Action: Set up alerts for Cloudflare changelog updates.

3. Ask Discovery Questions First

Understand requirements and constraints before proposing solutions.

Framework:

  • What's the actual business problem?
  • What are your constraints (time, budget, team)?
  • What's your risk tolerance?
  • Do you need specific technologies?

4. Present Options

Clients appreciate understanding trade-offs. Present 2-3 solutions with pros/cons.

Example:
"Here are three approaches:

  1. AI Search: Fast, simple, less flexible ($50, 1 day)
  2. Manual RAG: Full control, any LLM ($200, 3 days)
  3. Hybrid: Start simple, migrate if needed ($75, 1.5 days)"

5. Don't Over-Engineer

My ego wanted to build something impressive. The client needed something working.

Remember: Clients pay for outcomes, not impressive code.

Real-World Recommendation

Based on my experience, here's my decision framework:

Start with AI Search if:
- Simple Q&A use case ✅
- Workers AI models are good enough ✅
- Speed matters ✅
- Small team ✅

Upgrade to Manual RAG if:
- AI Search doesn't meet quality needs
- Need specific external LLM
- Require complex retrieval logic
- Have specialized requirements
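The checklist above condenses into a toy helper. Purely illustrative — the flag names are mine, and real projects deserve the full discovery conversation, not a one-liner.

```javascript
// Toy encoding of the decision framework above. Flag names are invented
// for illustration; the logic mirrors the checklist, nothing more.
function chooseApproach({
  needsSpecificLLM = false,        // must use GPT-4, Claude, etc.
  complexRetrieval = false,        // custom scoring, multi-stage, filtering
  specializedRequirements = false, // compliance, custom embeddings, etc.
} = {}) {
  return needsSpecificLLM || complexRetrieval || specializedRequirements
    ? 'Manual RAG'
    : 'AI Search';
}
```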

For 80% of chatbot projects, AI Search is the right choice.

Conclusion

I spent 4 hours building a sophisticated RAG system when a 15-minute AI Search implementation would have worked perfectly. The client got what he needed, but I could have saved both of us time and money.

The lesson isn't that manual RAG is wrong—it's that understanding requirements and choosing the right tool for the job is more valuable than building impressive systems.

Next time a client needs an AI chatbot, I'll start by asking: "Can AI Search solve this?" Only if the answer is "no" will I reach for the custom RAG implementation.

Sometimes the best code is the code you don't have to write.


Have you over-engineered a solution? Share your story in the comments!

Daniel Nwaneri is a full-stack developer specializing in Cloudflare Workers and AI integration. Connect with him on Upwork or GitHub.
