Every developer building with LLMs eventually faces the same problem —
“How do I manage chat memory, vector search, and real-time conversation without reinventing the backend every time?”
That’s exactly what inspired Vezlo Phase 2 — introducing the Vezlo AI Assistant Server, an open-source, production-ready AI backend built for developers who want to move fast without breaking things.
🧩 What Is the Vezlo AI Assistant Server?
The Vezlo AI Assistant Server is a full-stack backend framework that helps you build, run, and scale your own AI assistants — powered by Node.js, TypeScript, and Supabase.
It gives you a plug-and-play server to handle:
- 💬 Real-time chat with WebSockets (Socket.io)
- 🔍 Semantic search using pgvector and Supabase
- 🧠 Persistent conversation memory
- ⚙️ REST APIs + WebSocket endpoints
- 🐳 Docker support for production
- ☁️ One-click deploy to Vercel
In short, it's your AI backend in a box: open source, self-hostable, and free of vendor lock-in.
⚙️ Why Developers Love It
If you’ve ever tried building an AI assistant manually, you know the pain:
- Setting up embeddings + vector search
- Writing chat memory logic
- Managing OpenAI calls efficiently
- Handling API rate limits
- Deploying scalable infra
Vezlo solves all of this out of the box. You just install, configure your keys, and get a working assistant server running locally or on Vercel.
```bash
npx vezlo-setup
```
That’s literally it.
🧱 Key Features (Built for Developers)
| Feature | Description |
| --- | --- |
| 🧠 AI Brain | Connect any LLM (OpenAI, Anthropic, Gemini, etc.) |
| 🔍 Vector Search | Supabase + pgvector integration for semantic search |
| 💬 Real-Time Chat | WebSocket-powered assistant conversations |
| ⚙️ REST + WebSocket APIs | Easy to integrate into any frontend (React, Next.js, Vue) |
| 🗄️ Conversation Memory | Stores user sessions and chat history automatically |
| ☁️ One-Click Deploy | Deploy on Vercel or use Docker in production |
| 🔐 TypeScript Support | Fully typed for a predictable, safe developer experience |
🧰 Tech Stack Under the Hood
The Vezlo AI Assistant Server is built with modern developer tooling:
- Node.js + TypeScript — core backend
- Supabase + pgvector — vector embeddings & semantic search
- Socket.io — real-time message streaming
- OpenAI SDK — default LLM integration
- Vercel + Docker — deployment and scaling
This setup means you can start small (local) and scale seamlessly to production.
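To make the Supabase + pgvector piece concrete, here is a minimal sketch of the usual pattern: embed the query with the OpenAI SDK, then call a Postgres function through Supabase's RPC interface. The `match_documents` function and the environment variable names are assumptions borrowed from Supabase's standard pgvector example, not necessarily what Vezlo uses internally.

```typescript
import OpenAI from "openai";
import { createClient } from "@supabase/supabase-js";

// Assumed env var names; Vezlo's actual config keys may differ.
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

async function semanticSearch(query: string) {
  // 1. Turn the query into an embedding vector
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });

  // 2. Ask Postgres (pgvector) for the nearest stored chunks.
  //    `match_documents` is the conventional Supabase example RPC,
  //    used here as a stand-in for whatever Vezlo defines internally.
  const { data: matches, error } = await supabase.rpc("match_documents", {
    query_embedding: data[0].embedding,
    match_count: 5,
  });
  if (error) throw error;
  return matches;
}
```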
🧠 How It Works (Simplified Flow)
1. User sends a message to the `/chat` endpoint
2. Vezlo Server retrieves past memory + relevant context from Supabase
3. The server makes the LLM API call (OpenAI by default)
4. The response is generated and streamed back via WebSocket
5. The conversation & embeddings are stored automatically
You get persistence, memory, and semantic understanding — all without writing custom logic.
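From the client side, the streaming leg of that flow looks roughly like the snippet below. This post doesn't document Vezlo's actual event names, so `chat:message` and `chat:token` are hypothetical placeholders; the Socket.io client API itself is standard.

```typescript
import { io } from "socket.io-client";

// Connect to a locally running assistant server (default port assumed).
const socket = io("http://localhost:3000");

socket.on("connect", () => {
  // Hypothetical event name; check Vezlo's docs for the real one.
  socket.emit("chat:message", {
    conversationId: "demo-1",
    content: "What can you do?",
  });
});

// Stream tokens as the LLM generates them (event name assumed).
socket.on("chat:token", (token: string) => process.stdout.write(token));
```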
🧪 Example Setup
Here’s how to spin up your own AI backend in minutes:
```bash
# 1. Install globally
npm install -g @vezlo/assistant-server

# 2. Initialize project
npx vezlo-setup

# 3. Add your OpenAI + Supabase keys

# 4. Start server
npm run dev
```
You’re ready to chat with your assistant instantly.
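If you'd rather smoke-test over plain HTTP first, a request to the `/chat` endpoint mentioned above might look like this. The request body shape (`message`, `conversationId`) is an assumption for illustration; check the docs for the actual schema.

```typescript
// Minimal REST test against a locally running server.
// Body shape is assumed, not taken from Vezlo's documentation.
const res = await fetch("http://localhost:3000/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    conversationId: "demo-1",
    message: "Hello, assistant!",
  }),
});

console.log(await res.json());
```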
🌐 Deploy in One Click
Vezlo supports instant deployment to Vercel.
Or, for production-grade control, run it in Docker with:
```bash
docker compose up
```
That’s all it takes to host your own open source AI backend — no proprietary APIs, no vendor lock-in.
💡 Why We Built This
Phase 1 of Vezlo helped developers turn source code into knowledge bases.
But developers wanted more — not just documentation or embeddings, but a full backend that could chat, search, and remember.
So we built Phase 2: AI Assistant Server, giving devs the power to:
- 🤝 Integrate AI assistants directly into SaaS apps
- 🧑‍💻 Build custom copilots for engineering teams
- ⚙️ Prototype AI workflows faster than ever
⚡ What’s Next
This is just the start. Upcoming updates will include:
- 🔗 Vector adapters for Pinecone, Qdrant, and Chroma
- 🧠 Built-in agent workflows
- 💬 UI SDK for chat + embeddings dashboard
- ☁️ Self-hosting setup on Render, Fly.io, and AWS
💬 Try It Today
Get started in minutes:
🔗 GitHub: https://github.com/vezlo/assistant-server
📦 npm: `npm install -g @vezlo/assistant-server`
🌐 Docs: Vezlo Documentation
Your AI backend shouldn’t take weeks to build.
With Vezlo, it takes one command.