As AI agents become increasingly sophisticated, the way applications interact with databases is fundamentally changing. Gone are the days of simple CRUD operations and static queries. Modern AI-powered applications require a bidirectional data flow where:
- Agents feed from the database - Using semantic search and retrieval-augmented generation (RAG) to access relevant data
- Agents feed back to the database - Storing conversation context, user interactions, and learned preferences
- Agents transform the UI - Dynamically updating search filters, results, and interface elements based on natural language understanding
In this article, I'll walk you through a production-ready rental property search application that demonstrates how MongoDB Atlas's Document Model and Vector Search capabilities make this bidirectional agent-database architecture not just possible, but elegant and performant.
Looking for realistic sample data? All of the screenshots and demos below use the 6k-listing Airbnb dataset that MongoDB published on Hugging Face: https://huggingface.co/datasets/MongoDB/airbnb_embeddings. The repo ships with
seed-hf-airbnb-data.js, which downloads that dataset, loads it into Atlas (including the vector field), and makes the entire experience turnkey.
Why MongoDB Atlas is Perfect for AI Agent Applications
Before diving into the code, let's understand why MongoDB Atlas stands out for agent-based architectures:
1. Flexible Document Model
AI agents work with diverse, semi-structured data - user conversations, property details, embeddings, and metadata. MongoDB's document model handles this naturally without rigid schemas:
{
"_id": ObjectId("..."),
"sessionId": "user-session-123",
"userId": ObjectId("..."),
"messages": [
{
"role": "user",
"content": "Find me a 2BR in Manhattan under $200",
"timestamp": ISODate("2024-01-15T10:30:00Z"),
"metadata": {
"context": { "filters": { "bedrooms": 2, "location": "New York" } }
}
},
{
"role": "assistant",
"content": "I found 15 properties matching your criteria...",
"timestamp": ISODate("2024-01-15T10:30:05Z"),
"metadata": {
"tool_calls_made": 1,
"search_performed": true,
"rental_ids": [123, 456, 789]
}
}
],
"metadata": {
"totalMessages": 2,
"lastActivity": ISODate("2024-01-15T10:30:05Z")
}
}
2. Native Vector Search
Atlas Vector Search enables semantic understanding at the database layer. No need for external vector databases or complex integrations:
{
$vectorSearch: {
index: "rental_vector_search",
path: "text_embeddings",
queryVector: [0.1234, -0.5678, ...], // 1536-dimensional embedding
numCandidates: 100,
limit: 10,
filter: {
"address.market": { $eq: "New York" },
"price": { $lte: 200 },
"bedrooms": { $gte: 2 }
}
}
}
3. Rich Querying and Aggregations
MongoDB's aggregation pipeline lets you combine vector search with traditional filters, scoring, and transformations in a single operation.
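As a quick illustration, a single pipeline can run a vector search and then summarize the hits by market without leaving the database - a minimal sketch, using the index and field names that appear throughout this article:
// One round trip: semantic retrieval followed by an aggregation over the hits
const pipeline = [
  {
    $vectorSearch: {
      index: "rental_vector_search",
      path: "text_embeddings",
      queryVector: queryEmbedding, // pre-computed 1536-dimensional embedding
      numCandidates: 100,
      limit: 20
    }
  },
  {
    $group: {
      _id: "$address.market",      // summarize the semantic hits by city
      count: { $sum: 1 },
      avgPrice: { $avg: "$price" }
    }
  },
  { $sort: { count: -1 } }
];
const summary = await db.collection('rentals').aggregate(pipeline).toArray();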
4. Unified Platform
Store embeddings, conversation history, user profiles, and application data in one database. No data synchronization headaches.
Architecture Overview: The Bidirectional Data Flow
Our rental search application demonstrates three key data flows:
┌─────────────────────────────────────────────────────────────┐
│ User Interface │
│ (Natural Language + Dynamic Filters) │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ OpenAI Agents SDK │
│ (GPT-5-mini with Custom Tools) │
└───────┬───────────────────────────┬─────────────────────────┘
│ │
│ ① Agents Feed FROM DB │ ② Agents Feed TO DB
▼ ▼
┌─────────────────────┐ ┌──────────────────────────────┐
│ Vector Search │ │ Conversation Storage │
│ • Embeddings │ │ • Chat History │
│ • Semantic Query │ │ • User Context │
│ • Filters │ │ • Search Metadata │
└─────────────────────┘ └──────────────────────────────┘
│ │
└───────────┬───────────────┘
▼
③ Agents Transform UI
┌──────────────────────────┐
│ • Update Search Filters │
│ • Display Results │
│ • Modify Interface │
└──────────────────────────┘
Let's explore each flow in detail.
Flow 1: Agents Feed FROM the Database (RAG Pattern)
The first and most critical flow is how agents access relevant data to answer user queries. This is the classic Retrieval-Augmented Generation (RAG) pattern.
Step 1: Vector Embeddings as Data Foundation
Every rental property in our database includes a 1536-dimensional embedding generated from its description, amenities, and location:
{
"_id": 12345,
"name": "Luxury Manhattan Loft",
"description": "Stunning 2-bedroom loft in heart of SoHo...",
"property_type": "Loft",
"price": 175,
"bedrooms": 2,
"address": {
"market": "New York",
"neighbourhood": "SoHo",
"country": "United States"
},
"amenities": ["WiFi", "Kitchen", "Elevator", "Gym"],
"text_embeddings": [0.023, -0.145, 0.891, ...], // ← Generated from OpenAI
"review_scores": {
"review_scores_rating": 95
}
}
Key Insight: Embeddings are stored alongside the data they represent, eliminating the need for separate vector stores and JOIN operations.
Step 2: Agent Tool Definition
Using the OpenAI Agents SDK, we define a searchRentals tool that the agent can invoke:
import { Agent, tool } from '@openai/agents';
import { z } from 'zod';
this.searchRentalsTool = tool({
name: 'searchRentals',
description: "'Search for rental properties using semantic search based on user preferences.',"
parameters: z.object({
query: z.string().describe('Natural language search query'),
filters: z.object({
min_price: z.number().nullable().optional(),
max_price: z.number().nullable().optional(),
min_bedrooms: z.number().nullable().optional(),
location: z.string().nullable().optional(),
superhost_only: z.boolean().nullable().optional()
}).nullable().optional(),
limit: z.number().default(5)
}),
execute: this.handleSearchRentals.bind(this)
});
What makes this powerful: The agent understands the tool's capabilities through the description and parameter schema, deciding when and how to invoke it based on user intent.
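For context, here is a hedged sketch of how such a tool plugs into an agent with the OpenAI Agents SDK; the agent name and instructions string are illustrative, not the repo's exact values:
import { Agent, run } from '@openai/agents';

// Hook the tool defined above into an agent (the repo wires this up inside a
// service class, so searchRentalsTool corresponds to this.searchRentalsTool there)
const rentalAgent = new Agent({
  name: 'Rental Search Assistant',
  instructions: 'Help users find rental properties. Call searchRentals when they describe what they want.',
  tools: [searchRentalsTool]
});

// One turn: the SDK decides from the description and schema whether to invoke the tool
const result = await run(rentalAgent, 'Find me a 2BR in Manhattan under $200');
console.log(result.finalOutput);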
Step 3: Hybrid Search Implementation
When the agent invokes the tool, we perform a hybrid search combining vector similarity with traditional filters:
async hybridSearch(queryText, filters = {}, limit = 10) {
// Generate query embedding
const queryEmbedding = await this.generateEmbedding(queryText);
// Build vector search pipeline
const pipeline = [
{
$vectorSearch: {
index: "rental_vector_search",
path: "text_embeddings",
queryVector: queryEmbedding,
numCandidates: 100,
limit: limit,
filter: {
// Combine semantic + structured filters; spread so unset filters are omitted entirely
...(filters.location && { "address.market": { $eq: filters.location } }),
"price": {
$gte: filters.min_price || 0,
$lte: filters.max_price || 999999
},
"bedrooms": { $gte: filters.min_bedrooms || 0 },
...(filters.superhost_only && { "host.host_is_superhost": { $eq: true } })
}
}
},
{
$project: {
name: 1,
description: "1,"
property_type: 1,
price: 1,
bedrooms: 1,
"address.market": 1,
"address.country": 1,
score: { $meta: "vectorSearchScore" } // ← Similarity score
}
}
];
return await collection.aggregate(pipeline).toArray();
}
MongoDB's Superpower Here:
- Vector search and traditional filters execute in a single database query
- No post-processing, no multiple round-trips
- Results are sorted by semantic relevance (cosine similarity)
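In practice, the agent's extracted intent reduces to a call like this (the values are illustrative):
// "Find me a 2BR in Manhattan under $200" becomes:
const results = await vectorSearchService.hybridSearch(
  'two bedroom apartment in Manhattan',
  { location: 'New York', max_price: 200, min_bedrooms: 2 },
  5
);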
Step 4: Agent Processes and Responds
The agent receives structured results and generates a natural language response:
async handleSearchRentals({ query, filters, limit }) {
const results = await vectorSearchService.hybridSearch(query, filters, limit);
// Format for agent consumption
const formattedResults = results.map((rental, index) => ({
rank: index + 1,
id: rental._id,
name: rental.name,
price: rental.price,
bedrooms: rental.bedrooms,
location: `${rental.address.neighbourhood}, ${rental.address.country}`,
rating: (rental.review_scores.review_scores_rating / 20).toFixed(1),
similarity_score: rental.score.toFixed(3)
}));
return JSON.stringify({
total_found: results.length,
query_used: query,
results: formattedResults
});
}
The Result: User asks "Find me a cozy apartment in Barcelona for under €150" → Agent extracts intent → Searches MongoDB with semantic understanding → Returns relevant properties.
Flow 2: Agents Feed TO the Database (Context Persistence)
What makes AI agents truly intelligent is memory. Every interaction teaches the system about user preferences and context. MongoDB's document model makes this persistence natural.
Conversation Storage Pattern
export class ConversationModel {
static async addMessage(sessionId, role, content, metadata = {}, userId = null) {
const message = {
id: new ObjectId().toString(),
role, // 'user' or 'assistant'
content,
timestamp: new Date(),
metadata: {
...metadata,
userId: userId || null,
isAuthenticated: userId !== null
}
};
// Upsert pattern: Create conversation if not exists
await collection.updateOne(
{ sessionId },
{
$push: { messages: message },
$inc: { 'metadata.totalMessages': 1 },
$set: {
updatedAt: new Date(),
'metadata.lastActivity': new Date()
},
$setOnInsert: {
userId: userId,
createdAt: new Date()
}
},
{ upsert: true }
);
}
}
Why This Matters:
- Upsert Pattern: Create conversation on first message, append to existing ones
- Nested Documents: Messages are embedded in conversation, no JOINs needed
- Atomic Updates: $push, $inc, and $set operations are atomic and efficient
- Rich Metadata: Store context about tool calls, search results, user state
Storing Agent Metadata
After the agent responds, we capture what it did:
// Store assistant response with rich metadata
await ConversationModel.addMessage(sessionId, 'assistant', response.message, {
tool_calls_made: response.toolCalls?.length || 0,
has_rental_results: response.metadata?.search_performed || false,
search_metadata: {
query: response.metadata?.search_query,
filters_applied: response.metadata?.search_filters,
rental_ids: response.metadata?.rental_ids // ← IDs of returned properties
},
timestamp: new Date().toISOString()
}, userId);
The Power: Later queries can reference previous searches, compare properties, or recall user preferences - all because we stored structured metadata alongside conversational content.
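As a hedged illustration (this tool is hypothetical, not part of the repo), an agent tool could read that metadata back to answer follow-ups like "show me the cheapest one from my last search":
// Hypothetical follow-up tool: recall the most recent search this session performed
const getLastSearchTool = tool({
  name: 'getLastSearch',
  description: "Recall the filters and rental IDs from the user's most recent search.",
  parameters: z.object({}),
  execute: async () => {
    const convo = await db.collection('conversations').findOne(
      { sessionId },                                 // sessionId captured from the request scope
      { projection: { messages: { $slice: -10 } } }  // only scan recent messages
    );
    const lastSearch = (convo?.messages || [])
      .reverse()
      .find(m => m.role === 'assistant' && m.metadata?.search_metadata);
    return JSON.stringify(lastSearch?.metadata.search_metadata || {});
  }
});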
User Activity Tracking
MongoDB's flexible schema lets us track diverse user actions:
await UserModel.updateActivity(userId, {
$push: {
activity_log: {
action: 'search_performed',
timestamp: new Date(),
details: {
query: userMessage,
results_count: results.length,
filters_used: filters
}
}
},
$inc: { 'stats.total_searches': 1 },
$set: { 'stats.last_search_date': new Date() }
});
Real-World Use Case: Build personalized recommendations, identify power users, analyze search patterns - all from this rich behavioral data.
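For example, a short aggregation over activity_log surfaces a user's most common queries - a sketch, assuming the field names above:
// Most frequent search queries for one user, straight from the activity log
const topQueries = await db.collection('users').aggregate([
  { $match: { _id: userId } },
  { $unwind: '$activity_log' },
  { $match: { 'activity_log.action': 'search_performed' } },
  {
    $group: {
      _id: '$activity_log.details.query',
      times_searched: { $sum: 1 },
      last_used: { $max: '$activity_log.timestamp' }
    }
  },
  { $sort: { times_searched: -1 } },
  { $limit: 5 }
]).toArray();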
Flow 3: Agents Transform the UI (Dynamic Interface Updates)
The most magical aspect of agent-database integration is when the agent's understanding directly manipulates the user interface.
The Metadata Bridge
When an agent performs a search, it returns not just conversational text, but structured metadata:
{
success: true,
message: "I found 15 great properties in Barcelona under €150...",
metadata: {
search_performed: true,
search_query: "cozy apartment in Barcelona under €150",
search_filters: {
location: "Barcelona",
max_price: 150,
property_type: "Apartment"
},
rental_ids: [12345, 12346, 12347, ...]
}
}
Frontend Integration
The UI watches for this metadata and reacts:
async function sendMessage() {
const response = await fetch('/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
message: userInput,
context: {
current_search: searchBar.value,
filters: getCurrentFilters()
}
})
});
const data = await response.json();
// Display conversational response
displayMessage(data.message);
// Check if agent performed a search
if (data.metadata.search_performed) {
// ① Update UI filters based on agent's understanding
updateFiltersUI(data.metadata.search_filters);
// ② Fetch and display the rental results
const rentals = await fetchRentalsByIds(data.metadata.rental_ids);
displayRentals(rentals);
// ③ Update URL and browser history
updateURLParams(data.metadata.search_filters);
}
}
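The helper functions referenced above are thin; here is a sketch of two of them (the element IDs and the /rentals endpoint are assumptions about this particular frontend):
function updateFiltersUI(filters) {
  // Reflect the agent's extracted filters in the existing form controls
  if (filters.location) document.getElementById('location-filter').value = filters.location;
  if (filters.max_price) document.getElementById('max-price-filter').value = filters.max_price;
  if (filters.min_bedrooms) document.getElementById('bedrooms-filter').value = filters.min_bedrooms;
}

async function fetchRentalsByIds(ids) {
  // Assumed endpoint returning full documents for the IDs the agent surfaced
  const res = await fetch(`/rentals?ids=${ids.join(',')}`);
  return res.json();
}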
User Experience:
User: "Show me 2 bedroom apartments in Manhattan under $200"
↓
Agent: [Understands intent, extracts filters, searches MongoDB]
↓
UI: ✨ Location dropdown changes to "New York"
✨ Bedrooms filter updates to "2+"
✨ Price slider moves to "$0-$200"
✨ Results grid displays matching properties
✨ Chat shows: "I found 15 properties matching your criteria..."
Bidirectional Filter Sync
The genius is that filters work both ways:
- Manual Filter → Agent Context: User adjusts UI filters → Passed to agent in next message
- Agent Understanding → UI Filters: Agent extracts intent from natural language → Updates UI filters
// Sending filter context to agent
const chatPayload = {
message: userInput,
context: {
filters: {
location: locationDropdown.value,
min_price: priceSlider.min,
max_price: priceSlider.max,
bedrooms: bedroomFilter.value
}
}
};
// Agent enhances message with current filter state
if (context.filters && Object.keys(context.filters).length > 0) {
enhancedMessage += ` Current filters: ${formatFilters(context.filters)}`;
}
Why This Works: MongoDB stores both the agent's understanding (in conversation metadata) and the current UI state (in user preferences), creating a single source of truth.
Advanced Patterns: Going Beyond Basic RAG
Pattern 1: Saved Rentals with Agent Integration
Users can save favorite properties, and the agent accesses this data:
this.getSavedRentalsTool = tool({
name: 'getSavedRentals',
description: 'Get the user\'s saved rental properties for comparison and recommendations.',
parameters: z.object({
includeDetails: z.boolean().default(false)
}),
execute: async ({ includeDetails }) => {
const savedRentals = await UserModel.getSavedRentals(userId);
if (includeDetails) {
// Fetch full property data using rental IDs
const detailedRentals = await Promise.all(
savedRentals.map(saved => RentalModel.findById(saved.rental_id))
);
return JSON.stringify(detailedRentals);
}
return JSON.stringify(savedRentals);
}
});
User Experience:
User: "Compare my saved properties in terms of price and location"
↓
Agent: [Calls getSavedRentals with includeDetails=true]
↓
MongoDB: Returns full property documents
↓
Agent: "Here's a comparison of your 3 saved properties:
1. Manhattan Loft ($175/night) - SoHo, great for nightlife
2. Barcelona Apartment (€120/night) - Gothic Quarter, historic charm
3. Sydney Studio ($140/night) - Bondi, beach vibes
The Barcelona option offers the best value, while Manhattan is ideal
if you prioritize being in the center of the action."
Pattern 2: Context-Aware Property Details
When a user views a property, that context is passed to the agent:
const chatPayload = {
message: "Tell me about the neighborhood",
context: {
current_property: {
id: 12345,
name: "Luxury Manhattan Loft",
location: { market: "New York", neighbourhood: "SoHo" },
features: { bedrooms: 2, price: 175 }
}
}
};
The agent receives this context and provides targeted advice:
if (context.current_property) {
const property = context.current_property;
enhancedMessage += ` User is currently viewing: "${property.name}" in ${property.location.neighbourhood}`;
}
Agent Response: "SoHo is one of Manhattan's most vibrant neighborhoods, known for its cast-iron architecture, upscale boutiques, and art galleries. You'll be walking distance from great restaurants and nightlife. At $175/night for a 2-bedroom, this is competitive for the area."
Pattern 3: Hybrid Search with Scoring
Combine vector similarity with business logic:
const pipeline = [
{
$vectorSearch: {
index: "rental_vector_search",
path: "text_embeddings",
queryVector: queryEmbedding,
numCandidates: 100,
limit: 50 // Get more candidates for scoring
}
},
{
$addFields: {
vector_score: { $meta: "vectorSearchScore" },
rating_score: {
$divide: ["$review_scores.review_scores_rating", 100]
},
superhost_bonus: {
$cond: ["$host.host_is_superhost", 0.1, 0]
}
}
},
{
$addFields: {
final_score: {
$add: [
{ $multiply: ["$vector_score", 0.6] }, // 60% semantic relevance
{ $multiply: ["$rating_score", 0.3] }, // 30% ratings
{ $multiply: ["$superhost_bonus", 0.1] } // 10% superhost boost
]
}
}
},
{
$sort: { final_score: -1 }
},
{
$limit: 10
}
];
Result: Properties ranked by a combination of semantic relevance, user ratings, and business rules - all computed in MongoDB.
MongoDB Atlas Setup for Production
1. Vector Search Index Configuration
{
"fields": [
{
"numDimensions": 1536,
"path": "text_embeddings",
"similarity": "cosine",
"type": "vector"
},
{
"path": "property_type",
"type": "filter"
},
{
"path": "address.market",
"type": "filter"
},
{
"path": "price",
"type": "filter"
},
{
"path": "bedrooms",
"type": "filter"
},
{
"path": "host.host_is_superhost",
"type": "filter"
}
]
}
Key Points:
- A vector field for semantic search
- filter fields for structured filtering
- Cosine similarity for 1536-dimensional OpenAI embeddings
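You can create this index from the Atlas UI, or programmatically with the Node.js driver's createSearchIndex helper (available in recent driver versions); the definition object mirrors the JSON above:
await db.collection('rentals').createSearchIndex({
  name: 'rental_vector_search',
  type: 'vectorSearch',
  definition: {
    fields: [
      { type: 'vector', path: 'text_embeddings', numDimensions: 1536, similarity: 'cosine' },
      { type: 'filter', path: 'property_type' },
      { type: 'filter', path: 'address.market' },
      { type: 'filter', path: 'price' },
      { type: 'filter', path: 'bedrooms' },
      { type: 'filter', path: 'host.host_is_superhost' }
    ]
  }
});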
2. Supporting Indexes
// Conversation history lookup
db.conversations.createIndex({ "sessionId": 1 });
db.conversations.createIndex({ "userId": 1, "metadata.lastActivity": -1 });
// User activity queries
db.users.createIndex({ "username": 1 }, { unique: true });
db.users.createIndex({ "saved_rentals.rental_id": 1 });
// Rental property queries
db.rentals.createIndex({ "address.market": 1, "price": 1 });
db.rentals.createIndex({ "bedrooms": 1, "accommodates": 1 });
3. Aggregation Pipeline Optimization
Use $project early to reduce data transfer:
{
$vectorSearch: { /* ... */ }
},
{
$project: {
name: 1,
price: 1,
bedrooms: 1,
"address.market": 1,
score: { $meta: "vectorSearchScore" }
// Only fetch what you need
}
}
Performance Considerations
Embedding Generation Strategy
// Cache embeddings at data ingestion
async function seedRental(rental) {
const embeddingText = `${rental.name}. ${rental.description}.
Located in ${rental.address.market}, ${rental.address.country}.
${rental.property_type} with ${rental.bedrooms} bedrooms.
Amenities: ${rental.amenities.join(', ')}.`;
rental.text_embeddings = await generateEmbedding(embeddingText);
await db.rentals.insertOne(rental);
}
Never generate embeddings at query time - pre-compute and store them.
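For reference, generateEmbedding (used in the snippets above) can be a thin wrapper around the OpenAI embeddings endpoint. A minimal sketch; the model name is an assumption and should match whatever produced the stored 1536-dimensional vectors:
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function generateEmbedding(text) {
  // text-embedding-3-small produces 1536-dimensional vectors by default
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });
  return response.data[0].embedding;
}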
Conversation History Management
// Limit conversation history to last 20 messages
const conversation = await collection.findOne(
{ sessionId },
{
projection: {
messages: { $slice: -20 }, // Only get last 20
metadata: 1
}
}
);
Why: Sending entire conversation history to LLMs is expensive. Recent context is usually sufficient.
Connection Pooling
const client = new MongoClient(uri, {
maxPoolSize: 50,
minPoolSize: 10,
maxIdleTimeMS: 30000
});
Production Tip: Pool size should match expected concurrent users/requests.
Security Best Practices
1. User-Scoped Data Access
// NEVER trust client-provided userId
const userId = await verifyJWT(authToken);
// All queries scoped to authenticated user
const savedRentals = await db.users.findOne(
{ _id: new ObjectId(userId) },
{ projection: { saved_rentals: 1 } }
);
2. Input Sanitization
// Validate and sanitize before DB operations
const safeFilters = {
min_price: Math.max(0, parseInt(filters.min_price) || 0),
max_price: Math.min(10000, parseInt(filters.max_price) || 10000),
location: sanitizeString(filters.location)
};
3. Rate Limiting
// Track API usage per user
await db.users.updateOne(
{ _id: userId },
{
$inc: { 'rate_limits.api_calls_today': 1 },
$set: { 'rate_limits.last_call': new Date() }
}
);
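The counter only protects you if it is checked before doing any work. A hedged sketch of the enforcement side; the daily budget and reset logic are assumptions:
const user = await db.users.findOne(
  { _id: userId },
  { projection: { rate_limits: 1 } }
);

// Reset the count if the last call was on a previous day
const today = new Date().toDateString();
const lastCallDay = user?.rate_limits?.last_call?.toDateString();
const callsToday = lastCallDay === today ? user.rate_limits.api_calls_today : 0;

if (callsToday >= 500) { // assumed daily budget
  throw new Error('Daily API limit reached. Try again tomorrow.');
}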
Real-World Results: What We Achieved
Performance Metrics
- Average search latency: 150-300ms (embedding generation + vector search + formatting)
- Vector search alone: 50-80ms for 5,000+ properties
- Conversation storage: <10ms per message (upsert with indexing)
- Concurrent users: Tested up to 100 simultaneous chat sessions
User Experience Wins
- Natural language accuracy: 90%+ intent extraction on first try
- Filter synchronization: Seamless bidirectional updates
- Context retention: Agent remembers previous searches and user preferences
- Multi-turn conversations: Supports complex, multi-step property searches
Developer Experience
- Single database: No data synchronization between vector DB and app DB
- Unified query language: MongoDB aggregation for everything
- Flexible schema: Add new metadata fields without migrations
- Rich ecosystem: Works with Mongoose, native driver, Prisma, etc.
Lessons Learned & Best Practices
1. Design Your Document Schema for Agent Access
// ❌ Bad: Deeply nested, agent can't navigate
{
"data": {
"property_info": {
"details": {
"location": { ... }
}
}
}
}
// ✅ Good: Flat, predictable structure
{
"name": "...",
"address": { "market": "...", "country": "..." },
"price": 150,
"bedrooms": 2
}
2. Include Both Structured and Unstructured Data
{
"name": "Cozy Manhattan Loft",
"description": "Full natural language description...", // ← For embeddings
"property_type": "Loft", // ← For filtering
"bedrooms": 2, // ← For filtering
"amenities": ["WiFi", "Kitchen"], // ← For filtering
"text_embeddings": [...] // ← For vector search
}
3. Store Agent Metadata Richly
// Don't just store the conversation
{
"role": "assistant",
"content": "I found 5 properties..."
}
// Store what the agent DID
{
"role": "assistant",
"content": "I found 5 properties...",
"metadata": {
"tool_calls": ["searchRentals"],
"filters_applied": { "location": "New York", "max_price": 200 },
"rental_ids": [123, 456],
"user_satisfied": true // Track based on follow-up
}
}
4. Optimize for Agent Token Limits
// Return concise summaries to the agent
const formattedResults = results.map(r => ({
id: r._id,
name: r.name,
price: r.price,
location: `${r.address.market}, ${r.address.country}`,
bedrooms: r.bedrooms
// Skip description, images, etc. - retrieve on-demand
}));
5. Enable Agent Self-Discovery
// Provide tools for agents to explore data
this.exploreDataTool = tool({
name: 'exploreAvailableMarkets',
description: 'Get list of available cities/markets in the database',
parameters: z.object({}),
execute: async () => {
const markets = await db.rentals.distinct('address.market');
return JSON.stringify(markets);
}
});
The Future: What's Next for Agent-Database Integration
1. Agent-Driven Schema Evolution
Imagine agents that suggest new fields based on user queries:
Agent: "I notice users frequently ask about 'pet-friendly' properties,
but this field doesn't exist. Should I add it to the schema?"
2. Semantic Caching
MongoDB could cache embedding+filter combinations:
{
"query_hash": "sha256(...)",
"embedding": [...],
"filters": { "location": "New York" },
"cached_results": [...],
"valid_until": ISODate("2024-01-15T12:00:00Z")
}
3. Multi-Agent Coordination
Different specialized agents sharing the same MongoDB instance:
- Search Agent: Finds properties
- Booking Agent: Handles reservations
- Recommendation Agent: Suggests based on history
- All coordinating through shared conversation and user state
4. Continuous Learning from Feedback
// User indicates result quality
{
"search_query": "cozy apartment in Barcelona",
"results_shown": [123, 456, 789],
"user_clicked": 456, // Implicit feedback
"user_saved": [456], // Strong signal
"user_booked": 456 // Conversion
}
Use this data to fine-tune embeddings or ranking algorithms.
Conclusion: MongoDB Atlas as the Foundation for Intelligent Applications
Building AI agents that truly understand and serve users requires more than just a language model. You need a database that:
✅ Stores semantic understanding (vectors) alongside structured data (filters)
✅ Handles dynamic, evolving schemas (conversations, metadata, user context)
✅ Enables bidirectional data flow (agents read, write, and transform)
✅ Performs at scale (millisecond searches across thousands of documents)
✅ Provides a unified platform (no juggling multiple databases)
MongoDB Atlas delivers all of this with its Document Model and Vector Search capabilities. As we've seen in this rental search application:
- Agents feed FROM the database using semantic vector search combined with traditional filters
- Agents feed TO the database by storing rich conversation context and metadata
- Agents transform the UI through structured metadata that synchronizes with interface elements
This bidirectional architecture represents the future of AI-powered applications. And MongoDB Atlas makes it not just possible, but elegant, performant, and production-ready.
Try It Yourself
The complete code for this project is available on GitHub: mongodb-openai-agentic-rentals
Quick Start:
git clone https://github.com/mongodb-developer/mongodb-openai-agentic-rentals.git
cd mongodb-openai-agentic-rentals
bun install
# Configure .env with your MongoDB Atlas URI and OpenAI API key
node seed-hf-airbnb-data.js
bun start
# Visit http://localhost:5000/index.html
What to explore:
- Try natural language queries: "Find me a beachfront property in Sydney"
- Watch the UI filters update automatically
- Check the MongoDB conversation collection to see stored context
- Examine the aggregation pipelines in src/services/vector-search.service.js
- Extend the agent with new tools in src/agents/rental-rag-agent.js
Additional Resources
- MongoDB Atlas Vector Search Documentation
- OpenAI Agents SDK
- MongoDB Aggregation Pipeline
- Document Model Design Best Practices
About the Author: Pavel Duchovny is a Developer Advocate at MongoDB, passionate about helping developers build intelligent, scalable applications. Connect on Twitter or LinkedIn.
Have questions or feedback? Open an issue on the GitHub repo or reach out to the MongoDB Developer Community.
