Most RAG deployments cost $300+/month because the vector database runs whether you're querying or not. RAGStack-Lambda scales to zero: roughly $7-10/month for 1,000 documents.
The trick is S3 Vectors + Lambda + Bedrock. You trade sub-50ms latency for query times in the hundreds of milliseconds. For chat interfaces and document Q&A, that's a fine trade.
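A minimal sketch of that architecture: a Lambda handler that queries an S3 Vectors index on demand, so compute is billed only per invocation. The bucket and index names are hypothetical, the caller is assumed to supply a precomputed embedding, and the `s3vectors` boto3 client parameters shown should be checked against the current AWS API reference.

```python
import json

# Hypothetical names -- substitute your own deployment's values.
VECTOR_BUCKET = "ragstack-vectors"
VECTOR_INDEX = "documents"

def build_query(embedding, top_k=5):
    """Build the kwargs for an S3 Vectors similarity query."""
    return {
        "vectorBucketName": VECTOR_BUCKET,
        "indexName": VECTOR_INDEX,
        "queryVector": {"float32": embedding},
        "topK": top_k,
        "returnMetadata": True,
    }

def handler(event, context):
    """Lambda entry point: search S3 Vectors with the supplied embedding.

    Billing accrues only while this function runs -- there is no
    always-on database process to pay for between requests.
    """
    import boto3  # imported here so the helper above stays dependency-free
    client = boto3.client("s3vectors")
    embedding = event["embedding"]  # assumes the caller embedded the query
    response = client.query_vectors(**build_query(embedding))
    return {"statusCode": 200, "body": json.dumps(response["vectors"])}
```

The handler holds no state between invocations; everything durable lives in the vector bucket, which is what makes scale-to-zero possible.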
## Beyond Text Search
Amazon Nova embeddings put text, images, and video frames in the same vector space. Upload a photo, search with natural language, get semantically relevant results.
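A sketch of building a multimodal embedding request for Bedrock's `invoke_model`. The model ID and request field names here are assumptions for illustration (the field shapes follow the Titan multimodal convention; Nova's schema may differ), so verify them against the Bedrock model catalog for your region.

```python
import base64
import json

# Assumed model ID -- check the Bedrock console for the exact identifier.
MODEL_ID = "amazon.nova-multimodal-embeddings-v1:0"

def embed_request(text=None, image_path=None):
    """Build an invoke_model request body for text and/or image input.

    Because text and images land in the same vector space, one index
    can answer a text query with image results and vice versa.
    """
    body = {}
    if text is not None:
        body["inputText"] = text  # field name is an assumption
    if image_path is not None:
        with open(image_path, "rb") as f:
            body["inputImage"] = base64.b64encode(f.read()).decode("utf-8")
    return {
        "modelId": MODEL_ID,
        "contentType": "application/json",
        "body": json.dumps(body),
    }
```

Pass the returned dict as `bedrock_runtime.invoke_model(**embed_request(...))`; the response carries the embedding vector to write into the index.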
For video: frames get visual embeddings and audio gets transcribed into 30-second chunks with speaker identification. Every chunk carries timestamp metadata. Query by what's said or what's shown — citations link directly to that segment.
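The 30-second audio chunking described above can be sketched as a pure function: group transcript segments into fixed windows, keeping speaker labels and the window's start/end times so a citation can deep-link to the exact segment. The segment dict shape is hypothetical; transcription services differ.

```python
def chunk_transcript(segments, window=30.0):
    """Group transcript segments into fixed time windows (default 30 s).

    `segments` is a list of dicts like
    {"start": 3.2, "end": 5.9, "speaker": "spk_0", "text": "..."}
    (hypothetical shape). Each chunk keeps its own start/end timestamps.
    """
    chunks = []
    current = None
    for seg in segments:
        bucket = int(seg["start"] // window)  # which 30 s window this starts in
        if current is None or current["bucket"] != bucket:
            current = {"bucket": bucket,
                       "start": bucket * window,
                       "end": (bucket + 1) * window,
                       "lines": []}
            chunks.append(current)
        current["lines"].append(f'{seg["speaker"]}: {seg["text"]}')
    return [
        {"start": c["start"], "end": c["end"], "text": " ".join(c["lines"])}
        for c in chunks
    ]
```

Each chunk's text (with speaker prefixes) is what gets embedded; its `start`/`end` ride along as metadata for the citation link.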
## Smarter Retrieval
RAGStack doesn't just embed your content. It analyzes it.
Metadata extraction examines each document and pulls structured fields automatically — topic, document type, date range, whatever's relevant.
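One way this kind of extraction is typically wired up: prompt a model for JSON and defensively validate the reply. The prompt text and field set here are hypothetical, not RAGStack's actual implementation.

```python
import json

# Hypothetical field set -- in practice the relevant fields vary per corpus.
EXTRACTION_PROMPT = (
    "Extract metadata from the document below as JSON with keys "
    '"topic", "doc_type", and "date_range". Respond with JSON only.\n\n'
    "{document}"
)

def parse_metadata(model_output, required=("topic", "doc_type", "date_range")):
    """Validate the model's JSON reply; return None if it is unusable.

    Guarding the parse matters: models occasionally wrap JSON in prose,
    and a bad reply should skip metadata rather than fail ingestion.
    """
    try:
        data = json.loads(model_output)
    except json.JSONDecodeError:
        return None
    if not all(k in data for k in required):
        return None
    return {k: data[k] for k in required}
```

The extracted fields are stored as vector metadata, which is what the filter generation step below builds on.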
Filter generation samples your knowledge base and creates few-shot examples based on what it finds. No manual curation.
Multi-slice queries run parallel retrievals using those generated filters. Instead of one broad search, you get multiple targeted queries returning more relevant results.
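The multi-slice pattern can be sketched in a few lines: fan out one retrieval per generated filter, then merge by document ID, keeping the best score. The `retrieve` callable and hit shape are stand-ins for the underlying vector search.

```python
from concurrent.futures import ThreadPoolExecutor

def multi_slice_query(retrieve, query, filters, top_k=5):
    """Run one retrieval per filter in parallel, then merge the slices.

    `retrieve(query, filter_, top_k)` is a stand-in for the underlying
    vector search; it returns a list of
    {"id": ..., "score": ..., "text": ...} hits (hypothetical shape).
    """
    with ThreadPoolExecutor(max_workers=len(filters)) as pool:
        slices = list(pool.map(lambda f: retrieve(query, f, top_k), filters))
    # Deduplicate across slices, keeping the best score per document.
    best = {}
    for hits in slices:
        for hit in hits:
            if hit["id"] not in best or hit["score"] > best[hit["id"]]["score"]:
                best[hit["id"]] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)[:top_k]
```

Because each slice is narrowed by a filter, the merged top-k tends to be more relevant than one broad unfiltered search of the same size.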
## The Stack
- One-click AWS Marketplace deployment
- Framework-agnostic web component (one script tag)
- MCP server for Claude Desktop, Cursor, VS Code
- Everything runs in your account — no external control plane
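For the MCP integration, clients like Claude Desktop use the standard `mcpServers` config format. The server command, package name, and environment variable below are assumptions for illustration; check the RAGStack docs for the actual values.

```json
{
  "mcpServers": {
    "ragstack": {
      "command": "npx",
      "args": ["-y", "ragstack-mcp"],
      "env": { "RAGSTACK_API_URL": "https://your-deployment.example.com" }
    }
  }
}
```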
Demo login: guest@hatstack.fun / Guest@123