🧠 Part 3: Implementing Vector Search with Pinecone
In this part, we’ll integrate Pinecone, a vector database that enables semantic search. This allows the chatbot to understand user queries and fetch relevant order information—even if the phrasing varies.
✅ What We'll Cover
- Introduction to vector databases and embeddings
- Setting up Pinecone
- Creating embeddings from order data
- Storing and retrieving vectors from Pinecone
🧠 1. Why Vector Search?
Traditional keyword search has limits: it only matches the literal words a user types. With vector search, we embed data (like order summaries) into high-dimensional vectors using an embedding model. The user's query is embedded the same way, and the vectors are compared to find semantically similar matches, even when the phrasing differs.
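To make the idea concrete, here's a minimal sketch that embeds two differently worded questions and compares them with cosine similarity. It uses the same `@langchain/openai` package we rely on below; the `cosineSimilarity` helper and the sample phrases are just for illustration (Pinecone will do this comparison for us at scale):

```js
// cosineDemo.js (illustration only): two phrasings of the same question
// should produce vectors pointing in nearly the same direction.
const { OpenAIEmbeddings } = require('@langchain/openai');

// Cosine similarity: dot(a, b) / (|a| * |b|); closer to 1 = more similar.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

async function demo() {
  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY,
  });
  // embedDocuments embeds several strings in one API call.
  const [a, b] = await embeddings.embedDocuments([
    'Where is my package?',
    'Has my order shipped yet?',
  ]);
  // High score despite the two questions sharing no keywords.
  console.log('similarity:', cosineSimilarity(a, b));
}

demo().catch(console.error);
```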
🔧 2. Set Up Pinecone
Install the Pinecone client (already installed in Part 1, but run it again just in case):

```bash
npm install @pinecone-database/pinecone
```
Update your `.env` if you haven't already:

```
PINECONE_API_KEY=your_pinecone_api_key
```

(Older versions of the Pinecone SDK also required a `PINECONE_ENVIRONMENT` value; the current client needs only the API key.)
Initialize Pinecone in your config:
```js
// backend/langchain/config.js
const { Pinecone } = require('@pinecone-database/pinecone');

// Recent versions of the Pinecone SDK replaced the deprecated
// PineconeClient/init() pattern with a plain constructor.
const initPinecone = async () => {
  return new Pinecone({
    apiKey: process.env.PINECONE_API_KEY,
  });
};

module.exports = { initPinecone };
```
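The scripts below assume an index named `ecommerce-orders` already exists. You can create it in the Pinecone console, or programmatically. Here's a one-time setup sketch, assuming a recent serverless version of the Pinecone SDK; the `cloud` and `region` values are examples, so adjust them to your project. The dimension must be 1536 to match OpenAI's default embedding model:

```js
// backend/langchain/createIndex.js: one-time setup, skip if the index exists.
const { initPinecone } = require('./config');

async function createIndex() {
  const pinecone = await initPinecone();
  await pinecone.createIndex({
    name: 'ecommerce-orders',
    dimension: 1536, // output size of OpenAI's text-embedding-ada-002
    metric: 'cosine',
    // Example serverless spec; adjust cloud/region for your Pinecone project.
    spec: { serverless: { cloud: 'aws', region: 'us-east-1' } },
  });
  console.log('Index created.');
}

createIndex().catch(console.error);
```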
🧬 3. Generate Embeddings
We’ll create vector representations of order data using LangChain’s OpenAI embeddings.
```js
// backend/langchain/embedOrders.js
const { OpenAIEmbeddings } = require('@langchain/openai');
const { initPinecone } = require('./config');
const { connectToDatabase } = require('../database/connection');

async function embedOrders() {
  const db = await connectToDatabase();
  const orders = await db.collection('orders').find().toArray();

  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const pinecone = await initPinecone();
  const index = pinecone.index('ecommerce-orders');

  for (const order of orders) {
    // Plain-text summary of the order; this is the text that gets embedded.
    const orderSummary = `
      Order ID: ${order.orderId}
      Customer: ${order.customerName}
      Items: ${order.items.map(i => i.productName).join(', ')}
      Status: ${order.status}
    `;

    // embedQuery returns a single vector (number[]), so don't destructure it.
    const embedding = await embeddings.embedQuery(orderSummary);

    await index.upsert([
      {
        id: String(order.orderId), // Pinecone ids must be strings
        values: embedding,
        metadata: {
          orderId: order.orderId,
          customerName: order.customerName,
          status: order.status,
        },
      },
    ]);
  }

  console.log('All orders embedded and stored in Pinecone.');
}

embedOrders().catch(console.error);
```
Run the script:

```bash
node backend/langchain/embedOrders.js
```
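Embedding one order per API call is fine for a demo, but slow for large collections. LangChain's `embedDocuments` can embed a whole batch of strings in a single request. Here's a sketch of the loop in `embedOrders.js` rewritten that way; it assumes the `orders`, `embeddings`, and `index` variables from the script above:

```js
// Batch variant: embed every order summary in one call, then upsert together.
const summaries = orders.map(order => `
  Order ID: ${order.orderId}
  Customer: ${order.customerName}
  Items: ${order.items.map(i => i.productName).join(', ')}
  Status: ${order.status}
`);

// embedDocuments returns one vector per input string, in the same order.
const vectors = await embeddings.embedDocuments(summaries);

await index.upsert(
  orders.map((order, i) => ({
    id: String(order.orderId), // Pinecone ids must be strings
    values: vectors[i],
    metadata: {
      orderId: order.orderId,
      customerName: order.customerName,
      status: order.status,
    },
  }))
);
```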
🔍 4. Perform a Semantic Search
We’ll later use this to match user queries to order information.
```js
// backend/langchain/searchOrder.js
const { OpenAIEmbeddings } = require('@langchain/openai');
const { initPinecone } = require('./config');

async function searchOrders(userQuery) {
  const embeddings = new OpenAIEmbeddings({
    openAIApiKey: process.env.OPENAI_API_KEY,
  });

  const pinecone = await initPinecone();
  const index = pinecone.index('ecommerce-orders');

  // Embed the user's question into the same vector space as the orders.
  const queryVector = await embeddings.embedQuery(userQuery);

  // Fetch the 3 nearest order vectors along with their metadata.
  const results = await index.query({
    vector: queryVector,
    topK: 3,
    includeMetadata: true,
  });

  return results.matches;
}

module.exports = { searchOrders };
```
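To sanity-check the search end to end, you can call it from a throwaway script (the query string here is just an example):

```js
// backend/langchain/testSearch.js: quick manual check.
const { searchOrders } = require('./searchOrder');

searchOrders('Has my coffee maker shipped yet?')
  .then(matches => {
    for (const match of matches) {
      // Each match carries a similarity score plus the metadata we stored.
      console.log(match.score, match.metadata);
    }
  })
  .catch(console.error);
```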
✅ Next Steps (Part 4)
In the next part, we will:
- Connect LangChain components
- Load data from MongoDB
- Split and embed text dynamically
- Retrieve and generate responses
🚀 Stay tuned for Part 4!