<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: M M Islam Chisty</title>
    <description>The latest articles on DEV Community by M M Islam Chisty (@mchisty).</description>
    <link>https://dev.to/mchisty</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3876546%2F82d0a642-509d-446e-a0e1-f1339343d00c.jpg</url>
      <title>DEV Community: M M Islam Chisty</title>
      <link>https://dev.to/mchisty</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mchisty"/>
    <language>en</language>
    <item>
      <title>I built a custom AI assistant platform with Laravel + multi-provider LLMs + pgvector - here's what I learned</title>
      <dc:creator>M M Islam Chisty</dc:creator>
      <pubDate>Mon, 13 Apr 2026 11:54:54 +0000</pubDate>
      <link>https://dev.to/mchisty/i-built-a-custom-ai-assistant-platform-with-laravel-multi-provider-llms-pgvector-heres-what-5h53</link>
      <guid>https://dev.to/mchisty/i-built-a-custom-ai-assistant-platform-with-laravel-multi-provider-llms-pgvector-heres-what-5h53</guid>
      <description>&lt;p&gt;I've been a software engineer for a while. A few months ago I decided to build a SaaS product solo. This is a technical write-up of what I shipped, the architectural decisions I made, and the honest lessons from building it.&lt;/p&gt;

&lt;p&gt;The product is &lt;strong&gt;&lt;a href="https://chatnexus.cloud" rel="noopener noreferrer"&gt;ChatNexus&lt;/a&gt;&lt;/strong&gt; - a platform that lets any business deploy a custom AI assistant trained on their own data. Upload documents, paste text, crawl a URL, or sync a Google Doc - your assistant learns from your content and can answer questions, handle bookings, and run 24/7 on your website.&lt;/p&gt;

&lt;p&gt;Here's the full stack and why I chose each piece.&lt;/p&gt;




&lt;h3&gt;The Stack&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Laravel 12 + Livewire 4 (PHP 8.4)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I'm a Java engineer by day. PHP wasn't the obvious choice, but Laravel's ecosystem - queues, broadcasting, Cashier, Sail - meant I could ship subscription billing, real-time chat, background jobs, and a full admin panel without stitching together five separate services. Livewire 4 kept me in a single language while still getting reactive UI. It saved weeks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-provider AI via &lt;code&gt;LlmProviderInterface&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I didn't want to be locked into one AI provider. The system is built around a &lt;code&gt;LlmProviderInterface&lt;/code&gt; and a &lt;code&gt;LlmProviderRegistry&lt;/code&gt; - chat requests are routed to whichever provider is configured for the agent's assigned model. Currently supported: &lt;strong&gt;Groq&lt;/strong&gt;, &lt;strong&gt;OpenAI&lt;/strong&gt;, &lt;strong&gt;Google Gemini&lt;/strong&gt;, and &lt;strong&gt;xAI Grok&lt;/strong&gt; - all using a shared OpenAI-compatible streaming implementation. Swapping providers is a config change, not a code change.&lt;/p&gt;
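
&lt;p&gt;The production code is PHP, but the routing idea can be sketched in a few lines of Python. Class names mirror the interfaces above; the wiring and provider callables are purely illustrative:&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Illustrative stand-in for LlmProviderInterface: anything that
# takes (model, messages) and returns a reply.
Provider = Callable[[str, List[dict]], str]

@dataclass
class LlmProviderRegistry:
    """Routes a chat request to whichever provider serves the agent's model."""
    providers: Dict[str, Provider]   # provider name -> implementation
    model_map: Dict[str, str]        # model name -> provider name (config)

    def chat(self, model: str, messages: List[dict]) -> str:
        provider = self.providers[self.model_map[model]]
        return provider(model, messages)

# Hypothetical wiring: moving a model to another provider is a
# one-line edit in model_map, not a code change.
registry = LlmProviderRegistry(
    providers={
        "groq": lambda model, msgs: f"[groq:{model}] ...",
        "openai": lambda model, msgs: f"[openai:{model}] ...",
    },
    model_map={
        "llama-3.1-8b-instant": "groq",
        "gpt-4o-mini": "openai",
    },
)
```

&lt;p&gt;Since all four providers speak an OpenAI-compatible streaming dialect, the concrete implementations behind the interface can share almost everything.&lt;/p&gt;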

&lt;p&gt;The default model is &lt;code&gt;llama-3.1-8b-instant&lt;/code&gt; via Groq. Groq's LPU delivers tokens fast enough that streaming actually feels instant - which matters a lot for user experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OpenAI for embeddings, any provider for chat&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is a deliberate split. Embeddings are generated using OpenAI's &lt;code&gt;text-embedding-3-small&lt;/code&gt; at 768 dimensions and cached for 24 hours. Chat completions go through whichever provider is assigned to the agent. Keeping embeddings on a consistent model means the vector space stays stable - if you swap chat providers, your existing knowledge base still works.&lt;/p&gt;
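
&lt;p&gt;The 24-hour cache keys on the content itself, so re-ingesting unchanged text never re-bills. A minimal sketch of that idea, with a hypothetical &lt;code&gt;embed_fn&lt;/code&gt; standing in for the provider call:&lt;/p&gt;

```python
import hashlib
import time

EMBED_TTL_SECONDS = 24 * 60 * 60   # embeddings cached for 24 hours
_cache = {}                        # sha256(text) -> (timestamp, vector)

def embed(text, embed_fn):
    """Return a cached vector for recently embedded text; otherwise
    call the (hypothetical) provider function and cache the result."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    hit = _cache.get(key)
    if hit is not None and hit[0] + EMBED_TTL_SECONDS > time.time():
        return hit[1]
    vector = embed_fn(text)
    _cache[key] = (time.time(), vector)
    return vector
```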

&lt;p&gt;&lt;strong&gt;PostgreSQL + pgvector (Neon) for RAG&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I didn't want to pay for a separate vector database. pgvector runs directly in Postgres - same connection, same ORM, no extra infra. Neon is a serverless Postgres provider with pgvector support built in; it works well on a free tier for early-stage SaaS.&lt;/p&gt;

&lt;p&gt;Documents get chunked with overlap (sentence-boundary aware - the chunker tries to break at &lt;code&gt;.&lt;/code&gt; or &lt;code&gt;\n&lt;/code&gt; rather than mid-word), batch-embedded in a single API call, and stored as vectors. At query time, cosine similarity retrieves the most relevant chunks and injects them into the system prompt. Results are deterministic and fast.&lt;/p&gt;
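
&lt;p&gt;A rough Python sketch of that pipeline: overlap-aware chunking that prefers sentence boundaries, plus the cosine-similarity ranking that pgvector performs in SQL. Chunk sizes and function names are made up for illustration:&lt;/p&gt;

```python
import math

def chunk(text, size=500, overlap=50):
    """Split text into overlapping chunks, preferring to break at '.' or '\n'."""
    chunks, start = [], 0
    while len(text) > start:
        end = min(start + size, len(text))
        if len(text) > end:  # not the last chunk: look back for a boundary
            cut = max(text.rfind(".", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = max(end - overlap, start + 1)  # overlap, but always advance
    return chunks

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, stored, k=3):
    """stored: list of (chunk_text, vector) pairs. In production this is a
    pgvector ORDER BY; here it's the same math in Python."""
    ranked = sorted(stored, key=lambda cv: cosine(query_vec, cv[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

&lt;p&gt;The top-ranked chunks are what gets injected into the system prompt before the chat completion call.&lt;/p&gt;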

&lt;p&gt;Knowledge base sources supported: plain text, PDF, URL (with a built-in web scraper), and &lt;strong&gt;Google Docs / Google Sheets&lt;/strong&gt; via a service account integration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Booking system built in&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each agent can have availability schedules and handle appointment bookings directly through the chat - no third-party integration required. The availability engine checks slots, prevents double-booking, and sends confirmation/cancellation/reschedule emails to both the customer and the agent owner. Default timezone is Australia/Sydney but configurable per agent.&lt;/p&gt;
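
&lt;p&gt;The core of the availability engine is an interval-overlap check. A miniature sketch (slot length, names, and the shape of &lt;code&gt;booked&lt;/code&gt; are illustrative, not the real schema):&lt;/p&gt;

```python
from datetime import datetime, timedelta

def free_slots(day_start, day_end, slot_minutes, booked):
    """Enumerate candidate slots and drop any that overlap an existing
    booking - the double-booking guard, in miniature.
    booked: list of (start, end) datetime pairs already taken."""
    slots, cursor = [], day_start
    step = timedelta(minutes=slot_minutes)
    while day_end >= cursor + step:
        slot = (cursor, cursor + step)
        # Two intervals overlap exactly when each starts before the other ends.
        clash = any(slot[1] > b_start and b_end > slot[0]
                    for b_start, b_end in booked)
        if not clash:
            slots.append(slot)
        cursor = cursor + step
    return slots
```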

&lt;p&gt;&lt;strong&gt;Laravel Reverb for real-time streaming&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Token streaming over WebSockets so responses appear word-by-word rather than all at once. Reverb is Laravel's first-party WebSocket server - self-hosted, no third-party dependency, no per-message billing.&lt;/p&gt;
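
&lt;p&gt;The flow is simple: forward each provider delta to the channel as it arrives, then persist the full reply at the end. A tiny sketch, where &lt;code&gt;broadcast&lt;/code&gt; stands in for a Reverb channel publish (the name is illustrative):&lt;/p&gt;

```python
def stream_tokens(deltas, broadcast):
    """Relay streamed deltas to the websocket channel one by one,
    and return the assembled reply for saving to the database."""
    full = []
    for delta in deltas:
        full.append(delta)
        broadcast({"type": "token", "text": delta})
    broadcast({"type": "done"})
    return "".join(full)
```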

&lt;p&gt;&lt;strong&gt;Stripe via Laravel Cashier&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Subscription billing in under a day. Cashier handles webhooks, plan changes, trials, and invoices. I didn't write a single line of raw Stripe API code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployed on Render.com&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Web service + background worker + Redis. Docker-based. Auto-deploys on push to &lt;code&gt;master&lt;/code&gt;.&lt;/p&gt;




&lt;h3&gt;What I Actually Learned&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;1. Split your AI responsibilities by model stability.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using OpenAI for embeddings and allowing any provider for chat was the right call. Embeddings define the shape of your vector space - if you change the model, all existing vectors become incompatible and you need to re-embed your entire knowledge base. Keeping that layer fixed while letting the chat layer be flexible saved a painful migration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Queue workers on managed platforms need a process loop, not &lt;code&gt;exec&lt;/code&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;php artisan queue:work --max-time=3600&lt;/code&gt; exits after one hour by design (memory management). On Render, that triggers a "crashed instance" alert every hour - even though the service auto-recovers immediately. The fix is wrapping the command in &lt;code&gt;while true; do ... done&lt;/code&gt; so the container process stays alive and the worker cycles internally. It's a one-line change that eliminates the recurring false alarms.&lt;/p&gt;
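
&lt;p&gt;Concretely, the worker's start command goes from the bare invocation to a loop like this (the surrounding Dockerfile and Render service config are omitted):&lt;/p&gt;

```shell
# Bare command: exits after an hour, which Render reports as a crash.
#   php artisan queue:work --max-time=3600

# Wrapped: the container process never exits; the worker restarts
# inside the loop each hour and keeps its memory footprint fresh.
while true; do
  php artisan queue:work --max-time=3600
done
```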

&lt;p&gt;&lt;strong&gt;3. Livewire 4 and UTF-8 BOM don't mix.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If a Blade file has a BOM byte at the start, Livewire's regex that attaches &lt;code&gt;wire:id&lt;/code&gt; to the root element silently misplaces it onto a child element. Every &lt;code&gt;wire:click&lt;/code&gt; and &lt;code&gt;wire:model&lt;/code&gt; handler stops working - no JS error, no visible symptom, just dead interactivity. Checking for BOM bytes is now the first thing I do when Livewire pages render but don't respond.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. "AI Agent" is the wrong positioning for a product that answers questions.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The term carries a market expectation: autonomous multi-step reasoning, tool orchestration, goal completion. If your product is primarily answering questions from a knowledge base, calling it an "agent" creates a mismatch - users feel underwhelmed. "AI Assistants" is more honest, and leading with what the assistant &lt;em&gt;does&lt;/em&gt; (answers questions, books appointments, runs 24/7 on your data) converts better than a buzzword.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Free tools are a legitimate acquisition channel.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The product ships with three publicly accessible tools - an AI Playground, a FAQ Generator, and a System Prompt Generator - that require no login. These are standalone, genuinely useful, and link back to the main product. For a solo bootstrapped SaaS with no marketing budget, giving something useful away is more effective than any ad.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Batch your embedding calls.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first version embedded each knowledge base chunk in a separate API call. For a document with 50 chunks, that's 50 round trips. Switching to batch embedding - sending all chunks in a single request - made knowledge base ingestion roughly 10× faster for large documents. The OpenAI embeddings API supports batching natively.&lt;/p&gt;
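
&lt;p&gt;The before/after is easiest to see side by side. In this sketch &lt;code&gt;api_call&lt;/code&gt; is a stand-in for the embeddings request, which accepts a list of inputs:&lt;/p&gt;

```python
def embed_per_chunk(chunks, api_call):
    """First version: one round trip per chunk."""
    return [api_call([c])[0] for c in chunks]

def embed_batched(chunks, api_call, batch_size=100):
    """Batched version: N chunks cost ceil(N / batch_size) round trips
    instead of N, since the endpoint embeds a whole list per request."""
    vectors = []
    for i in range(0, len(chunks), batch_size):
        vectors.extend(api_call(chunks[i:i + batch_size]))
    return vectors
```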




&lt;h3&gt;Where It Is Now&lt;/h3&gt;

&lt;p&gt;ChatNexus is in early access. It's live at &lt;strong&gt;&lt;a href="https://chatnexus.cloud" rel="noopener noreferrer"&gt;chatnexus.cloud&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I'm looking for beta users willing to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set up an agent on their real website&lt;/li&gt;
&lt;li&gt;Break things and tell me what's missing&lt;/li&gt;
&lt;li&gt;Give honest feedback on what's confusing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Beta access:&lt;/strong&gt; 60-day free Starter plan (3 agents, 1,000 messages/month), no credit card. Drop a comment or reach out and I'll send the promo code.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built solo. Feedback - good or bad - is genuinely appreciated.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>saas</category>
      <category>startup</category>
    </item>
  </channel>
</rss>
