This is a submission for the Agentic Postgres Challenge with Tiger Data
What I Built
I recently had a chat with a friend who works at an e-commerce company. He was complaining that his company had spent a fortune on an AI customer service application, only to see user complaints increase.
I asked him what was going on, and he showed me a real conversation:
User: "I want to return this laptop. Also, I wanted to ask, how does the warranty work if I exchange it for a new one?"
AI Customer Service: "Okay, I can process the return for you. Please provide your order number."
User: "??? What about my warranty question?"
AI Customer Service: "Sorry, how can I help you?"
This is the core problem with router-based conversational AI, which routes each message to an agent that owns a specific topic or task.
Routers inherently fail at natural, free-form conversation -- not because of implementation details you can fix, but because of fundamental architectural constraints: a router must commit to a single intent per message, so a message carrying two intents (a return and a warranty question) gets truncated to one.
In this post, we'll build a Conversational AI on Parlant to handle free-form dialogues.
Understanding Parlant's Approach
Parlant is an AI alignment engine designed specifically for natural, customer-facing conversations.
To handle free-form dialogue, Parlant introduces a Dynamic Guideline Matching mechanism: on every conversation turn, the system evaluates all guidelines and loads only the relevant ones into context.
# Parlant approach - no routing, just conditional guidelines
await agent.create_guideline(
    condition="User asks for a refund and order amount > $500",
    action="First use the order status tool, confirm refund eligibility, then provide the detailed refund process",
    tools=[check_order_status, calculate_refund_amount],
)

# ... more guideline definitions
Notice what's different here: there's no routing decision.
Parlant's guideline matcher evaluates all guidelines and determines which ones are contextually relevant. This is where Parlant's architecture does most of its work.
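Conceptually, each turn looks something like the sketch below. Here, match_guidelines and generate_reply are hypothetical stand-ins for what Parlant's engine does internally, not its public API:

async def handle_turn(agent, conversation, user_message):
    # Every registered guideline is evaluated against the current context...
    matched = await match_guidelines(
        guidelines=agent.guidelines,
        history=conversation.history,
        message=user_message,
    )
    # ...and only the contextually relevant ones (plus their tools)
    # are loaded into the LLM's context for this turn
    return await generate_reply(user_message, active_guidelines=matched)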
Timescale Postgres - Production-Grade Vector Database
When we delve deeper into Parlant's Dynamic Guideline Matching Mechanism, we uncover a core technical challenge:
How do we efficiently retrieve the 3-5 most relevant guidelines from potentially hundreds or even thousands of available entries, all while maintaining millisecond-level latency?
This is precisely the scenario where a Vector Database comes into play.
The Guideline Matching process in Parlant is essentially a Semantic Retrieval problem. The system converts the condition field of every guideline into a Vector Embedding. When a user input arrives, the process unfolds as follows (sketched in code after the list):
- Step 1: Encode the user query and the conversation history into a Query Vector.
- Step 2: Execute the Similarity Search on the Vector Database.
- Step 3: Return the Top-K (e.g., 3-5) most relevant guidelines.
- Step 4: Inject these retrieved guidelines into the LLM's context window.
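Here's a minimal sketch of that four-step loop. The embed() function, the guidelines table, and its columns are illustrative stand-ins for Parlant's internals (which I haven't reproduced here); the similarity search itself is plain pgvector SQL over an open psycopg2 connection:

def retrieve_relevant_guidelines(conn, embed, history: str, user_query: str, top_k: int = 5):
    # Step 1: encode the user query plus conversation history into a query vector
    query_vector = embed(f"{history}\n{user_query}")
    with conn.cursor() as cur:
        # Steps 2-3: ANN similarity search, keeping only the Top-K guidelines
        cur.execute("""
            SELECT condition, action
            FROM guidelines
            ORDER BY embedding <=> %s::vector
            LIMIT %s
        """, (str(query_vector), top_k))
        guidelines = cur.fetchall()
    # Step 4: format the retrieved guidelines for injection into the LLM's context window
    return "\n".join(f"When {cond}: {action}" for cond, action in guidelines)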
This requires the Vector Database to possess three core capabilities:
- High-performance Approximate Nearest Neighbor (ANN) search.
- Flexible metadata filtering.
- Real-time vector updates.
Timescale Postgres provides production-grade support across all three of these dimensions.
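As a rough sketch of how the three capabilities surface in practice (assuming an open psycopg2 connection conn, precomputed query_vector and new_vector embeddings, and the illustrative guidelines table from above, here with an extra category column):

with conn.cursor() as cur:
    # 1. ANN search: a vector index accelerates the <=> cosine-distance operator
    # 2. Metadata filtering: ordinary WHERE clauses compose with vector search
    cur.execute("""
        SELECT condition, action
        FROM guidelines
        WHERE category = %s                 -- metadata filter
        ORDER BY embedding <=> %s::vector   -- ANN similarity search
        LIMIT 5
    """, ("refunds", str(query_vector)))
    # 3. Real-time updates: a new guideline is searchable as soon as the
    #    INSERT commits, with no separate re-indexing or batch rebuild step
    cur.execute("""
        INSERT INTO guidelines (condition, action, category, embedding)
        VALUES (%s, %s, %s, %s::vector)
    """, ("User asks about warranty", "Explain the warranty policy", "warranty", str(new_vector)))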
Demo
This demo showcases the core capability: intelligent Guidelines (or Directives) let the LLM answer general questions directly, while specialized or complex questions automatically trigger a Knowledge Base Retrieval that produces a structured, traceable answer.
If you'd like to run the demo locally, clone the repo and follow the instructions in README.md to set up your development environment.
How I Used Agentic Postgres
To start, create a Tiger Cloud account by following the provided instructions. For more information, refer to this guide.
Database setup
Then follow the instructions to create a new database service. For more details on how to create a new Tiger Cloud database, refer to this guide.
Let's create a table named documents to store all documents with both text content and vector embeddings, and the necessary indexes for fast searching across different methods.
-- Enable extensions
CREATE EXTENSION IF NOT EXISTS vector;
-- Create a table for our documents
CREATE TABLE documents (
id TEXT PRIMARY KEY,
title TEXT,
content TEXT,
embedding VECTOR(1024) -- jina-embeddings-v3 dimensions
);
-- Create indexes for performance
CREATE INDEX ON documents USING GIN (to_tsvector('english', content));
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops);
You can execute the table creation SQL directly in the Tiger Cloud console.
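With the table in place, ingestion is a single INSERT per document. Here's a minimal sketch, assuming a psycopg2 connection and the same Jina client wrapper used by the search tool below (its embed() signature comes from this demo's repo, not an official SDK):

def load_documents(conn, jina, docs):
    """Embed each document with Jina and store text and vector together."""
    embeddings = jina.embed([d["content"] for d in docs], task="retrieval.passage")
    with conn.cursor() as cur:
        for doc, emb in zip(docs, embeddings):
            cur.execute("""
                INSERT INTO documents (id, title, content, embedding)
                VALUES (%s, %s, %s, %s::vector)
                ON CONFLICT (id) DO UPDATE
                SET title = EXCLUDED.title,
                    content = EXCLUDED.content,
                    embedding = EXCLUDED.embedding
            """, (doc["id"], doc["title"], doc["content"], str(emb)))
    conn.commit()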
Implementing Hybrid Search with Jina and pgvector
In this section, we will build a hybrid search engine leveraging Jina and pgvector on Tiger Cloud. Before that, let's build an overall understanding of the architecture and concepts required.
The architecture comprises five steps:
- Documents: The process starts with a collection of documents.
- Embedding generation:
  - Semantic search using Jina: documents are embedded with Jina to create dense vectors that capture the semantic meaning of the text.
  - Keyword search using PostgreSQL: documents are indexed with PostgreSQL's full-text search, a sparse representation focused on the text's specific keywords.
- Storage: Both representations are stored in Timescale's PostgreSQL.
- Retrieval and reranking: The system retrieves candidate results from the Timescale database in response to a query, then reranks them with Jina to prioritize the most relevant documents.
- Results generation: The reranked results are compiled into a final ranked list of the most relevant documents for the query.
Here's the implementation in Python:
from typing import Any

from env import SEMANTIC_SCORE_THRESHOLD


class MultiSearchTool:
    def __init__(self, connection, jina_client):
        self.conn = connection
        self.jina = jina_client

    def extract_search(self, query: str, limit: int = 5):
        """Exact keyword matching for specific terms, names, and dates"""
        with self.conn.cursor() as cur:
            # Search for the exact phrase anywhere in the content
            cur.execute("""
                SELECT id, title, content, 1.0 AS score, 'exact' AS search_type
                FROM documents
                WHERE content LIKE %s
                ORDER BY char_length(content)  -- Prefer shorter, more focused articles
                LIMIT %s
            """, (f"%{query}%", limit))
            return cur.fetchall()

    def fulltext_search(self, query: str, limit: int = 5):
        """PostgreSQL full-text search with ranking"""
        with self.conn.cursor() as cur:
            cur.execute("""
                SELECT id, title, content,
                       ts_rank_cd(to_tsvector('english', content), query) AS score,
                       'fulltext' AS search_type
                FROM documents,
                     plainto_tsquery('english', %s) AS query
                WHERE to_tsvector('english', content) @@ query
                ORDER BY score DESC
                LIMIT %s
            """, (query, limit))
            return cur.fetchall()

    def semantic_search(self, query: str, limit: int = 5):
        """Vector-based semantic search using Jina embeddings"""
        try:
            # Generate the query embedding
            query_embeddings = self.jina.embed([query], task="retrieval.query")
            with self.conn.cursor() as cur:
                cur.execute("""
                    SELECT * FROM (
                        SELECT id, title, content,
                               1 - (embedding <=> %s::vector) AS score,
                               'semantic' AS search_type
                        FROM documents
                    ) a
                    WHERE score >= %s
                    ORDER BY score DESC
                    LIMIT %s
                """, (
                    str(query_embeddings[0]),  # serialize the embedding for the ::vector cast
                    SEMANTIC_SCORE_THRESHOLD,
                    limit,
                ))
                return cur.fetchall()
        except Exception as e:
            print(f"Semantic search failed: {e}")
            return []

    def combine_and_deduplicate(self, *result_sets):
        """Combine results from multiple search methods, removing duplicates"""
        seen_ids = set()
        combined = []
        # Process results in order of priority
        for results in result_sets:
            for result in results:
                doc_id = result[0]
                if doc_id not in seen_ids:
                    seen_ids.add(doc_id)
                    combined.append({
                        "id": doc_id,
                        "title": result[1],
                        "content": result[2],
                        "score": result[3],
                        "search_type": result[4],
                    })
        return combined

    def rerank_results(self, query: str, results: list[Any], top_k: int = 5):
        """Use Jina's rerank API for final relevance scoring"""
        if not results:
            return []
        # Prepare documents for reranking: truncate long articles to the
        # first 2000 chars to stay within rerank input limits
        documents = [result["content"][:2000] for result in results]
        try:
            rerank_response = self.jina.rerank(query, documents=documents, top_k=top_k)
            # Map reranked results back to the original data
            reranked = []
            for rerank_result in rerank_response:
                original_idx = rerank_result["index"]
                result = results[original_idx].copy()
                result["rerank_score"] = rerank_result["relevance_score"]
                reranked.append(result)
            return reranked
        except Exception as e:
            print(f"Reranking error: {e}")
            return results[:top_k]

    def hybrid_search(self, query: str, limit: int = 5):
        """Main hybrid search function combining all methods"""
        # Cast a wide net with all search methods
        extract_results = self.extract_search(query, limit=limit)
        fulltext_results = self.fulltext_search(query, limit=limit)
        semantic_results = self.semantic_search(query, limit=limit)
        # Combine and deduplicate (exact matches prioritized first)
        combined = self.combine_and_deduplicate(
            extract_results,
            fulltext_results,
            semantic_results,
        )
        # Rerank for final relevance
        final_results = self.rerank_results(query, combined, limit)
        return final_results
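And a quick usage sketch tying it all together; the connection string and the JinaClient wrapper are placeholders for the ones configured in the demo repo:

import psycopg2
from jina_client import JinaClient  # hypothetical wrapper module from the demo repo

conn = psycopg2.connect("postgresql://user:password@host:5432/tsdb")  # your Tiger Cloud service URL
tool = MultiSearchTool(conn, JinaClient())
for doc in tool.hybrid_search("laptop warranty after an exchange", limit=5):
    print(f'{doc.get("rerank_score", doc["score"]):.3f}  {doc["title"]}  ({doc["search_type"]})')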
Overall Experience
Tiger Cloud is a cloud PostgreSQL platform tailored for AI applications. This article demonstrated how to leverage its built-in vector search capability to build a conversational AI agent.
When deploying to production, however, performance becomes a critical factor. pgvectorscale, another extension available on Tiger Cloud, can significantly improve query performance at scale. Due to constraints of time and scope, a follow-up post will cover production-level usage of pgvectorscale and other performance optimization techniques.