This is Part 5 of a 6-part series. Part 4 covers the Slack integration.
Vector Search, Feedback & Self-Improvement
This is the part that separates a demo from a production system. Most AI assistants are stateless; they forget everything between conversations. Harper Eye gets smarter every time your team uses it. Good answers get cached and returned instantly. Bad answers get flagged, degraded, and eventually purged.
Here's what the knowledge loop looks like after a few weeks of real usage:
29 knowledge base articles, all created automatically when engineers clicked "Helpful." 24 negative feedback items teaching the system what to avoid. Zero degraded entries, meaning the self-healing pipeline is working. That's institutional knowledge built by your team without anyone having to write a wiki page.
How the Knowledge Loop Works
The beauty of this system: no one has to manually curate the knowledge base. Engineers just click thumbs up or thumbs down. The embeddings and similarity scores handle the rest.
Step 1: The Knowledge Base — Storage & Search
This is the core of the knowledge loop. Create lib/knowledge-base.js:
import { tables } from 'harperdb';
import crypto from 'crypto';
import { generateEmbedding } from './embeddings.js';
const SIMILARITY_HIGH = 0.85; // Exact match — return cached
const SIMILARITY_LOW = 0.70; // Partial match — inject as context
const NEGATIVE_FEEDBACK_THRESHOLD = 0.80;
const NEGATIVE_RATIO_DOWNGRADE = 0.3; // 30%+ negative → downgrade
const NEGATIVE_RATIO_DELETE = 0.5; // 50%+ negative → delete
const MIN_NEGATIVES_TO_DELETE = 2; // Need at least 2 negatives
/**
* Search the knowledge base for verified answers similar to the query.
* Returns: { match: 'exact' | 'partial' | 'none', entry?, score?, degraded? }
*/
export async function searchKnowledgeBase(queryText, precomputedEmbedding = null) {
let embedding;
try {
embedding = precomputedEmbedding ?? await generateEmbedding(queryText);
} catch (err) {
console.error('Embedding failed, skipping KB search:', err.message);
return { match: 'none' };
}
try {
// Vector similarity search using Harper's built-in HNSW index
const results = tables.KnowledgeEntry.search({
sort: { attribute: 'queryEmbedding', target: embedding },
limit: 1,
});
const entries = [];
for await (const entry of results) {
entries.push(entry);
}
if (entries.length === 0) return { match: 'none' };
const topEntry = entries[0];
const storedEmbedding = topEntry.queryEmbedding;
if (!storedEmbedding?.length) return { match: 'none' };
// Compute cosine similarity
const score = cosineSimilarity(embedding, storedEmbedding);
if (score >= SIMILARITY_HIGH) {
// Check if the entry has been degraded by negative feedback
const negativeCount = topEntry.negativeCount ?? 0;
const total = (topEntry.useCount ?? 0) + negativeCount;
if (negativeCount > 0 && total > 0) {
const ratio = negativeCount / total;
if (ratio >= NEGATIVE_RATIO_DOWNGRADE) {
// Downgrade: don't return as exact, let Claude regenerate
return {
match: 'partial',
entry: serializeEntry(topEntry),
score,
degraded: true,
};
}
}
// Genuine exact match — increment use count and return
await updateUseCount(topEntry);
return { match: 'exact', entry: serializeEntry(topEntry), score };
}
if (score >= SIMILARITY_LOW) {
return { match: 'partial', entry: serializeEntry(topEntry), score };
}
return { match: 'none' };
} catch (err) {
console.error('KB search failed:', err.message);
return { match: 'none' };
}
}
Let me highlight what's happening with Harper's vector search:
tables.KnowledgeEntry.search({
sort: { attribute: 'queryEmbedding', target: embedding },
limit: 1,
});
That's it. One line. Harper's built-in HNSW index handles approximate nearest-neighbor search across all stored embeddings. No Pinecone client, no pgvector extension, no separate vector database. You defined @indexed(type: "HNSW", distance: "cosine") in your schema and Harper handles the rest.
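For reference, here's a sketch of what that schema declaration might look like. Field names match the code in this post, but treat this as illustrative — adapt it to the schema file you defined earlier in the series, and note that the embedding dimension depends on your embedding model:

```graphql
# Sketch — field names mirror the code above; adjust to your actual schema file.
type KnowledgeEntry @table {
  id: ID @primaryKey
  query: String
  queryEmbedding: [Float] @indexed(type: "HNSW", distance: "cosine")
  answer: String
  sources: [String]
  originalIncidentId: String
  approvedByUserId: String
  approvedAt: String
  channelId: String
  useCount: Int
  lastUsedAt: String
  negativeCount: Int
}
```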
Step 2: Storing Verified Knowledge
When an engineer clicks "Helpful," we store the question-answer pair with a vector embedding. Future similar questions will match against it.
/**
* Store a verified answer in the knowledge base.
* Called when user clicks thumbs-up.
*/
export async function storeKnowledgeEntry({
query,
answer,
sources,
originalIncidentId,
approvedByUserId,
channelId,
}) {
const embedding = await generateEmbedding(query);
const entry = {
id: crypto.randomUUID(),
query,
queryEmbedding: embedding,
answer: typeof answer === 'string' ? answer : JSON.stringify(answer),
sources: Array.isArray(sources) ? sources : [],
originalIncidentId,
approvedByUserId,
approvedAt: new Date().toISOString(),
channelId,
useCount: 0,
lastUsedAt: null,
negativeCount: 0,
};
await tables.KnowledgeEntry.put(entry);
console.log(`KB entry stored: ${entry.id} for "${query.slice(0, 80)}"`);
return entry;
}
Once stored, the embedding is automatically added to the HNSW index. The next time someone asks a semantically similar question, searchKnowledgeBase() will find it.
Step 3: The Negative Feedback Loop
This is where the system self-heals. Negative feedback does two things:
- Warns Claude about past failures so it doesn't repeat the same mistake
- Degrades bad KB entries so they stop being returned as cached answers
/**
* Search for past negative feedback on similar queries.
* Used by the orchestrator to warn Claude about past failures.
*/
export async function searchNegativeFeedback(queryText, precomputedEmbedding = null) {
let embedding;
try {
embedding = precomputedEmbedding ?? await generateEmbedding(queryText);
} catch (err) {
return [];
}
try {
const results = tables.NegativeFeedback.search({
sort: { attribute: 'queryEmbedding', target: embedding },
limit: 3,
});
const matches = [];
for await (const entry of results) {
const stored = entry.queryEmbedding;
if (!stored?.length) continue;
const score = cosineSimilarity(embedding, stored);
if (score >= NEGATIVE_FEEDBACK_THRESHOLD) {
matches.push({
category: entry.category,
details: entry.details,
originalQuery: entry.originalQuery,
score,
});
}
}
return matches;
} catch (err) {
console.error('Negative feedback search failed:', err.message);
return [];
}
}
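What does the orchestrator do with these matches? One option is to fold them into Claude's prompt as an explicit warning block. This helper is illustrative — the exact wording is up to you, but the match shape (`category`, `details`, `originalQuery`, `score`) mirrors what `searchNegativeFeedback()` returns above:

```javascript
// Illustrative: format negative-feedback matches into a prompt warning.
function formatFeedbackWarnings(matches) {
  if (!matches.length) return '';
  const lines = matches.map(
    (m) =>
      `- A similar query ("${m.originalQuery}") previously got a bad answer` +
      ` (category: ${m.category}${m.details ? `, details: ${m.details}` : ''})`
  );
  return [
    'WARNING: past answers to similar questions received negative feedback:',
    ...lines,
    'Avoid repeating these mistakes.',
  ].join('\n');
}
```

The orchestrator can append this string to the system prompt only when it's non-empty, so clean queries pay no prompt-token cost.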
/**
* Store negative feedback with embedding.
* If linked to a KB entry, degrades that entry.
*/
export async function storeNegativeFeedback({
queryId,
originalQuery,
category,
details,
userId,
channelId,
knowledgeEntryId,
}) {
const embedding = await generateEmbedding(originalQuery);
const record = {
id: crypto.randomUUID(),
queryId,
originalQuery,
queryEmbedding: embedding,
category,
details: details || null,
userId,
channelId,
knowledgeEntryId: knowledgeEntryId || null,
createdAt: new Date().toISOString(),
};
await tables.NegativeFeedback.put(record);
// Degrade the linked KB entry if one exists
if (knowledgeEntryId) {
await degradeKnowledgeEntry(knowledgeEntryId);
}
return record;
}
The degradation logic is where it gets interesting:
async function degradeKnowledgeEntry(entryId) {
try {
const entry = await tables.KnowledgeEntry.get(entryId);
if (!entry) return;
const newNegativeCount = (entry.negativeCount ?? 0) + 1;
const total = (entry.useCount ?? 0) + newNegativeCount;
const ratio = newNegativeCount / total;
if (newNegativeCount >= MIN_NEGATIVES_TO_DELETE && ratio >= NEGATIVE_RATIO_DELETE) {
// This answer is doing more harm than good — remove it
await tables.KnowledgeEntry.delete(entryId);
console.log(`KB entry ${entryId} DELETED (ratio: ${ratio.toFixed(2)})`);
} else {
// Just increment the counter
await tables.KnowledgeEntry.put({
...entry,
negativeCount: newNegativeCount,
});
console.log(`KB entry ${entryId} negative count → ${newNegativeCount} (ratio: ${ratio.toFixed(2)})`);
}
} catch (err) {
console.error('Failed to degrade KB entry:', err.message);
}
}
The thresholds are tuned for safety:
- 30% negative ratio → the "exact" match gets downgraded to "partial," so Claude regenerates a fresh answer using the old one as reference
- 50% negative ratio AND at least 2 negatives → the entry is deleted entirely
- This means a single bad vote can't kill an entry that's been helpful 10 times
The result: bad knowledge expires naturally. Good knowledge compounds. No human curation required.
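To make the arithmetic concrete, here's the decision logic extracted as a pure function. This is just a restatement of the thresholds above (combining the search-time downgrade and the delete check), not new behavior:

```javascript
// Pure restatement of the degradation thresholds, for illustration.
const NEGATIVE_RATIO_DOWNGRADE = 0.3;
const NEGATIVE_RATIO_DELETE = 0.5;
const MIN_NEGATIVES_TO_DELETE = 2;

function degradationDecision(useCount, negativeCount) {
  const total = useCount + negativeCount;
  if (negativeCount === 0 || total === 0) return 'keep';
  const ratio = negativeCount / total;
  if (negativeCount >= MIN_NEGATIVES_TO_DELETE && ratio >= NEGATIVE_RATIO_DELETE) {
    return 'delete';
  }
  if (ratio >= NEGATIVE_RATIO_DOWNGRADE) return 'downgrade';
  return 'keep';
}

// An entry used 10 times with one complaint survives untouched: 1/11 ≈ 0.09.
// An entry with 2 uses and 2 complaints is deleted: 2/4 = 0.5.
// An entry with 5 uses and 3 complaints is downgraded: 3/8 ≈ 0.38.
```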
Step 4: Cosine Similarity
The math behind the matching:
function cosineSimilarity(a, b) {
let dot = 0;
let magA = 0;
let magB = 0;
for (let i = 0; i < a.length; i++) {
dot += a[i] * b[i];
magA += a[i] * a[i];
magB += b[i] * b[i];
}
const denom = Math.sqrt(magA) * Math.sqrt(magB);
// Guard against zero-magnitude vectors (e.g., a failed or empty embedding)
return denom === 0 ? 0 : dot / denom;
}
We compute this manually after Harper's HNSW search because Harper returns the nearest vectors but doesn't expose the similarity score directly. The HNSW index does the heavy lifting (finding the right candidates from potentially thousands of entries); we just compute the final score on the top result.
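A quick sanity check of the math: identical vectors score 1, orthogonal vectors score 0, and vectors at 45° land in between. (The function is repeated here, with the zero-vector guard, so the example runs standalone.)

```javascript
// Cosine similarity, self-contained for this example.
function cosineSimilarity(a, b) {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  return denom === 0 ? 0 : dot / denom;
}

console.log(cosineSimilarity([1, 0], [1, 0])); // → 1 (identical direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // → 0 (orthogonal)
console.log(cosineSimilarity([1, 1], [1, 0])); // ≈ 0.707 (45° apart)
```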
Step 5: The Use Count Tracker
Every time a cached answer is returned, we increment its use count. This serves two purposes: analytics (which answers are most valuable?) and feedback ratio calculation (a bad answer that's been used 100 times needs more than 2 complaints to be deleted).
async function updateUseCount(entry) {
try {
const existing = await tables.KnowledgeEntry.get(entry.id);
if (!existing) return;
await tables.KnowledgeEntry.put({
...existing,
useCount: (existing.useCount ?? 0) + 1,
lastUsedAt: new Date().toISOString(),
});
} catch (err) {
console.error('Failed to update KB use count:', err.message);
}
}
function serializeEntry(entry) {
return {
id: entry.id,
query: entry.query,
answer: entry.answer,
sources: entry.sources,
approvedByUserId: entry.approvedByUserId,
approvedAt: entry.approvedAt,
useCount: entry.useCount,
negativeCount: entry.negativeCount ?? 0,
};
}
Notice serializeEntry strips the embedding vector before returning. Those are 768-element float arrays; there is no need to send them downstream or serialize them in responses.
The Full Picture
Here's what your Harper Eye knowledge pipeline looks like now:
| Scenario | What Happens | Speed |
|---|---|---|
| New question, no KB match | Full orchestration: embed → parallel search → Claude → respond | 5-15s |
| Similar question asked before (verified) | KB exact match → return cached answer | <1s |
| Similar question but answer was bad | KB match downgraded → Claude regenerates with old answer as context | 5-15s |
| Same mistake pattern detected | Negative feedback injected into Claude prompt → avoids repeating error | 5-15s |
| Terrible answer accumulates complaints | KB entry auto-deleted → system reverts to fresh orchestration | Automatic |
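In code, the dispatch on `searchKnowledgeBase()`'s result boils down to something like this. The branch labels are illustrative — your orchestrator from the earlier parts decides what each branch actually does:

```javascript
// Illustrative dispatch on searchKnowledgeBase()'s result shape:
// { match: 'exact' | 'partial' | 'none', entry?, score?, degraded? }
function routeQuery(kbResult) {
  if (kbResult.match === 'exact') {
    return 'cached'; // <1s: return the verified answer directly
  }
  if (kbResult.match === 'partial') {
    // Degraded exact matches also land here — Claude regenerates,
    // with the old answer injected as context.
    return 'regenerate-with-context';
  }
  return 'full-orchestration'; // embed → parallel search → Claude → respond
}
```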
Every interaction makes the system better. Every thumbs-up builds institutional knowledge. Every thumbs-down teaches the system what to avoid. This is the compound interest of an AI assistant that works for your company, not a vendor's.
What's Next
In Part 6, we deploy to production, walk through the actual cost breakdown with real numbers, and explore extensions — PagerDuty webhooks for automatic incident analysis, a web dashboard for knowledge base management, and expert routing that knows who on your team to page.

