AI agents are becoming a new interface for finding people.
Instead of opening a marketplace, typing keywords, filtering profiles, and manually deciding who is worth contacting, a user can now say something like:
Help me find a few SaaS founders who might need my backend architecture services.
or:
Find remote Rust freelancers who have experience with early-stage infrastructure products.
In Opportunity Skill, the user's AI agent turns that request into a semantic search query, calls the QuestMeet backend, receives a compact list of matched candidates, and then drafts tailored collaboration proposals for the user to approve.
This post is a technical walkthrough of the backend search function behind that flow.
The interesting part is not just "use embeddings". The search engine combines:
- PostgreSQL
- pgvector cosine distance
- HNSW indexes
- tag-level semantic recall
- active-user filtering
- cubic similarity scoring
- LATERAL JOIN impression reranking
- separate buyer/professional identity perspectives
The goal is simple:
Given a natural-language request from an AI agent, return the candidates worth contacting, together with enough semantic context for the agent to explain the match and write a good proposal.
What Opportunity Skill does
Opportunity Skill is an Agent Skill that makes a user discoverable to other agents. It supports agent products that follow the Skill specification, such as Claude Code and OpenClaw. The user does not need to manually browse a website. The agent calls functions exposed by the skill.
At a high level, the skill has four processes:
The search engine discussed in this article powers the Search and Contact process.
When the user asks the agent to find buyers or professionals, the agent calls one of these functions:
- ai_search_buyers
- ai_search_professionals
These functions are implemented in the skill's scripts/callable_functions.py file and communicate with QuestMeet through GraphQL.
A simplified version looks like this:
import httpx
from typing import Union

BASE_URL = "https://questmeet.ai/graphql"

def ai_search_professionals(access_token: str, query: str) -> Union[list, bool, None]:
    try:
        response = httpx.post(
            BASE_URL,
            json={
                "query": """
                    query AiSearchProfessionals($query: String!) {
                        aiSearchProfessionals(query: $query)
                    }
                """,
                "variables": {"query": query},
            },
            headers={"Authorization": f"Bearer {access_token}"},
            trust_env=False,
            timeout=20,
        )
        return response.json()["data"]["aiSearchProfessionals"]
    except Exception:
        return False
The return values have clear semantics for the agent:
| Return value | Meaning |
|---|---|
| list[dict] | Relevant candidates were found |
| [] | The request succeeded, but no relevant candidates were found |
| None | The access token is missing or expired; the agent should re-authenticate |
| False | Something failed; notify the user and stop |
This is important because the agent, not the server, owns the workflow. If the token is expired, the skill instructs the agent to run the sign-in process, obtain a new token, and retry.
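To make that contract concrete, here is a minimal sketch of how an agent-side caller might branch on the four return values. The function and action names here are hypothetical, not part of the skill's actual API:

```python
def handle_search_result(result):
    """Branch on the ai_search_* return contract (sketch; action names are hypothetical)."""
    if result is None:
        return "reauthenticate"  # token missing or expired: run sign-in, then retry
    if result is False:
        return "abort"           # hard failure: notify the user and stop
    if result == []:
        return "no_matches"      # request succeeded, nothing relevant was found
    return "propose"             # list[dict]: draft proposals for these candidates
```

Note the order of checks: `None` and `False` are tested with identity (`is`) before the empty-list comparison, because `False == []` and `None == []` are both falsy comparisons that could otherwise be confused.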
The GraphQL entry points
On the server side, the two public search fields are small wrappers around the same internal function:
@strawberry.field
async def ai_search_buyers(self, info: Info, query: str) -> Optional[JSON]:
    try:
        return await search_buyers_or_professionals(info, query, "Buyer")
    except Exception:
        return False

@strawberry.field
async def ai_search_professionals(self, info: Info, query: str) -> Optional[JSON]:
    try:
        return await search_buyers_or_professionals(info, query, "Professional")
    except Exception:
        return False
The only difference is the perspective argument:
- "Buyer" means we search for users as buyers, including employers or clients.
- "Professional" means we search for users as professionals, including freelancers and employees.
This distinction is not cosmetic. The same human can be both a buyer and a professional. A founder may want to hire developers, while also being discoverable as a product consultant. These two identities should not share the same matching context.
That is why each user has two external candidate IDs:
professional_id VARCHAR(50) UNIQUE DEFAULT gen_random_uuid()::text,
buyer_id VARCHAR(50) UNIQUE DEFAULT gen_random_uuid()::text
When a user is returned as a professional, the API returns professional_id as candidate_id.
When a user is returned as a buyer, the API returns buyer_id as candidate_id.
The agent receives a unified candidate_id and does not need to know which internal column was used.
The data model
The search function touches five tables:
- users
- logins
- tags
- impressions
- impression_tags
Here is the simplified relationship:
The core idea is that the user's profile is not just a display profile written for humans.
It is a set of impressions written for AI agents to search and reason over.
An impression is a structured statement about a user's expertise, collaboration style, communication preference, leadership style, taste, or requirements.
For example, a professional's impression might say:
This developer prefers projects where technical architecture decisions are made explicitly. They value strict type definitions, maintainable interfaces, and long-term code evolution over quick prototypes.
Each impression is associated with 1 to 5 tags:
impressions_with_tags_format = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "impression": {"type": "string"},
            "tags": {
                "type": "array",
                "items": {"type": "string"},
                "maxItems": 5,
            },
        },
        "required": ["impression", "tags"],
        "additionalProperties": False,
    },
}
The tags are not used as a traditional keyword system. They are embedded into vector space and used as a lightweight semantic recall layer.
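As an illustration of the shape that schema enforces, here is a plain-Python check (a sketch, not the jsonschema library the format above implies) for a single impression item:

```python
def is_valid_impression_item(item: dict) -> bool:
    """Check one entry against the impressions_with_tags_format contract
    (plain-Python sketch, not a full JSON Schema validator)."""
    if set(item) != {"impression", "tags"}:  # required keys, additionalProperties: false
        return False
    if not isinstance(item["impression"], str):
        return False
    tags = item["tags"]
    return (
        isinstance(tags, list)
        and 1 <= len(tags) <= 5              # "1 to 5 tags"; the schema caps at maxItems: 5
        and all(isinstance(t, str) for t in tags)
    )
```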
The full search pipeline looks like this:
- The agent calls ai_search_buyers or ai_search_professionals.
- The GraphQL resolver receives the request.
- Check whether an authenticated user_id exists.
- If no: return None, prompting the agent to re-authenticate.
- If yes: proceed to the next step.
- Vectorize the natural-language query.
- Search tags using pgvector cosine distance.
- Keep tags with distance <= 0.55, limiting to 100 tags.
- Map matched tags back to users via the impression_tags join table.
- Keep only public impressions.
- Keep only users active in the last month.
- Score users by summing the cube of tag similarity (similarity³).
- Take the top 50 users using heapq.nlargest.
- Exclude the current user from the results.
- Rerank each candidate's impressions using a LATERAL JOIN.
- Return the candidate's name, badge, candidate_id, description, and impressions.
Step 1: Auth guard
The internal function begins with an authentication guard:
async def search_buyers_or_professionals(
    info: Info,
    query: str,
    perspective: str,
) -> Optional[Union[list[dict], bool]]:
    if logged_user_id := info.context["user_id"]:
        if len(embedding_models) == 1:
            embeddings = await vectorize_contents(embedding_models[0], [query])
            # ... tag-level recall, cubic scoring, and LATERAL JOIN reranking ...
            # if candidates found, returns list[dict] here
        elif len(embedding_models) == 2:
            embeddings = await vectorize_contents(embedding_models[0], [query])
            other_embeddings = await vectorize_contents(embedding_models[1], [query])
            # ... tag-level recall, cubic scoring, and LATERAL JOIN reranking ...
            # if candidates found, returns list[dict] here
        return []  # executed only when no candidates matched
    return None
In the QuestMeet GraphQL service, info.context["user_id"] is populated after the access token is verified.
If it is missing, the function returns None.
The server does not attempt to redirect or refresh the token. It only tells the agent:
You are not authenticated for this operation.
The skill then instructs the agent to run the sign-in process again, store the new access token in long-term memory or a local .txt file, and retry the original process.
This keeps the backend simple and makes the agent responsible for workflow recovery.
Step 2: Vectorize the query
The user request arrives as natural language:
Find remote Rust freelancers who have experience with early-stage infrastructure products.
Before querying PostgreSQL, the server embeds it:
embeddings = await vectorize_contents(embedding_models[0], [query])
The database schema uses 1536-dimensional vectors:
odd_embedding vector(1536),
even_embedding vector(1536)
The search function then uses the query embedding to search semantically related tags and impressions.
Step 3: Tag-level semantic recall
The first database query searches the tags table:
SELECT tag_id, distance
FROM (
SELECT tag_id, ({embedding_column} <=> $1::vector) AS distance
FROM tags
) AS tag_distances
WHERE distance <= 0.55
ORDER BY distance ASC
LIMIT 100
The <=> operator is pgvector's cosine distance operator.
Cosine distance is:
distance = 1 - cosine_similarity
So:
similarity = 1.0 - distance
The filter:
WHERE distance <= 0.55
means:
cosine similarity >= 0.45
This is intentionally not too strict.
If the threshold is too high, the search becomes brittle and misses useful matches. If it is too low, the search lets in too much noise. In this system, tag search is the recall stage, so the threshold should be broad enough to catch potentially relevant concepts while still filtering out unrelated tags.
The LIMIT 100 prevents broad queries from pulling too many tags into the next stage.
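The filter-sort-cap behavior of this SQL can be sketched in a few lines of Python (illustrative data; in production this runs inside PostgreSQL against the HNSW index):

```python
def recall_tags(tag_distances, max_distance=0.55, limit=100):
    """Python analogue of the tag-recall SQL: keep close tags, order by distance, cap the count."""
    kept = [(tag_id, d) for tag_id, d in tag_distances if d <= max_distance]
    kept.sort(key=lambda pair: pair[1])  # ORDER BY distance ASC
    return kept[:limit]                  # LIMIT 100
```

For example, with distances `[(1, 0.20), (2, 0.60), (3, 0.50)]`, tag 2 is dropped by the 0.55 gate and tags 1 and 3 are returned nearest-first.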
The tags table has HNSW indexes on both embedding columns:
CREATE INDEX i_tags_odd_embedding
ON tags
USING hnsw (odd_embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);
CREATE INDEX i_tags_even_embedding
ON tags
USING hnsw (even_embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);
The goal is to make semantic tag recall fast even as the tag vocabulary grows.
Step 4: Map tags back to active public users
After retrieving semantically related tags, the server maps them back to users through impression_tags:
tag_distances = {
    row["tag_id"]: float(row["distance"])
    for row in rows
}
tag_ids = list(tag_distances.keys())
Then:
SELECT impression_tags.tag_id, impression_tags.user_id
FROM impression_tags
JOIN (
SELECT logins.user_id
FROM logins
WHERE logins.updated_at >= NOW() - '1 month'::interval
GROUP BY logins.user_id
) AS active_users
ON impression_tags.user_id = active_users.user_id
WHERE impression_tags.tag_id = ANY($1::bigint[])
AND impression_tags.is_public = TRUE
This query does three things.
1. It uses impression_tags as an inverted index
impression_tags connects impressions and tags.
If a tag such as "Rust" or "Type Safety" is semantically close to the query, the join table tells us which users have public impressions associated with that tag.
2. It only searches public impressions
impression_tags.is_public = TRUE
A user may have private impressions used only for self-reflection or agent memory. Those should not be discoverable by other users' agents. This check acts as the first layer of filtering during tag-level recall.
3. It filters for recent activity
logins.updated_at >= NOW() - '1 month'::interval
A matching profile is only useful if the person is still active.
Traditional candidate-search products often surface abandoned profiles. For agent-driven collaboration, that wastes the user's time. This query only keeps users who have logged in within the last month.
The relevant index is:
CREATE INDEX i_impression_tags_tag_user_public
ON impression_tags (tag_id, user_id)
WHERE is_public IS TRUE;
This partial index keeps the public-tag reverse lookup efficient.
Step 5: Cubic similarity scoring
For every matched (tag_id, user_id) pair, the server converts cosine distance back to similarity:
similarity = 1.0 - tag_distances[row["tag_id"]]
Then it adds the cube of the similarity to the user's score:
record_id = row["user_id"]
if record_id in record_scores:
    record_scores[record_id] += similarity * similarity * similarity
else:
    record_scores[record_id] = similarity * similarity * similarity
Why cube the similarity?
Because weak semantic matches should not dominate the ranking just because there are many of them.
After the distance <= 0.55 filter, similarity is in the range:
[0.45, 1.0]
Compare linear scoring with cubic scoring:
| Similarity (linear score) | Cubic score (similarity³) |
|---|---|
| 0.99 | 0.970299 |
| 0.90 | 0.729 |
| 0.70 | 0.343 |
| 0.45 | 0.091125 |
A very strong tag match remains strong. A weak-but-acceptable tag match becomes much less important.
This matters because tags are not hand-written search keywords. They are concepts extracted from impressions. A user may have several broad tags that are not wrong but should not make them rank above someone with fewer, stronger, more specific matches.
For example, if the query is:
TypeScript type-safety architecture consultant
A candidate with strong matches on:
- TypeScript
- Type Safety
- Software Architecture
should outrank someone who only has broad tags like:
- JavaScript
- Web Development
- Frontend
Cubic scoring makes that more likely.
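The effect is easy to demonstrate with a small sketch of the scoring loop (the user IDs and distances here are made up for illustration):

```python
def cubic_scores(matches):
    """Sum similarity**3 per user, mirroring the server's scoring loop.
    `matches` is a list of (user_id, cosine_distance) pairs."""
    scores = {}
    for user_id, distance in matches:
        similarity = 1.0 - distance
        scores[user_id] = scores.get(user_id, 0.0) + similarity ** 3
    return scores

# Three mediocre matches (similarity 0.55 each) still lose to a single
# strong match (similarity 0.95): 3 * 0.55**3 ≈ 0.499 < 0.95**3 ≈ 0.857.
broad = cubic_scores([("broad_user", 0.45)] * 3)["broad_user"]
sharp = cubic_scores([("sharp_user", 0.05)])["sharp_user"]
```

Under linear scoring the same comparison would flip: 3 × 0.55 = 1.65 beats 0.95, letting volume of weak matches outrank quality.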
Step 6: Keep the top 50 candidates
After scoring users, the server selects the top 50 candidates:
user_ids = [
    user_id
    for user_id, _ in heapq.nlargest(
        50,
        record_scores.items(),
        key=lambda item: item[1],
    )
    if user_id != logged_user_id
]
There are two details here.
First, heapq.nlargest(50, ...) avoids fully sorting the entire candidate set. For large candidate sets, this is cheaper than sorting everything and slicing.
Second, the current user is excluded, so the final candidate count may be fewer than 50.
The top-50 list is not the final set of results. It is a small candidate set for the next stage: impression-level reranking.
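A quick self-contained illustration of the `heapq.nlargest` call, with toy scores: it returns the top-k pairs in descending score order, equivalent to a full descending sort followed by a slice, but without sorting the whole dictionary.

```python
import heapq

# Toy score dictionary (illustrative; real keys are user IDs).
record_scores = {"u1": 0.91, "u2": 0.40, "u3": 0.77, "u4": 0.12}

# Bounded-heap selection: O(n log k) instead of the O(n log n) full sort.
top = heapq.nlargest(2, record_scores.items(), key=lambda item: item[1])
```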
Step 7: Impression-level reranking with LATERAL JOIN
Tag-level recall is fast and broad, but the final results should be based on the actual impression text.
This is where LATERAL JOIN is useful.
SELECT
users.{id_column},
users.name,
users.description,
users.badge,
string_agg('- ' || content, E'\n')
FROM unnest($1::bigint[]) AS user_ids(user_id)
JOIN users
ON users.user_id = user_ids.user_id
JOIN LATERAL (
SELECT content
FROM (
SELECT
content,
({embedding_column} <=> $3::vector) AS distance
FROM impressions
WHERE user_id = users.user_id
AND perspective = $2
AND is_public = TRUE
) AS impressions
WHERE distance <= 0.55
ORDER BY distance ASC
LIMIT 10
) AS impressions ON true
GROUP BY users.{id_column}, users.name, users.description, users.badge
This query does a lot.
unnest turns the Python list into rows
The server passes the top candidate user IDs as an array.
unnest($1::bigint[]) AS user_ids(user_id)
This lets PostgreSQL process all candidates in one query instead of making 50 separate round trips.
LATERAL JOIN runs a per-user subquery
A normal subquery cannot reference columns from the outer query. A lateral subquery can.
That means this part:
WHERE user_id = users.user_id
runs independently for each candidate user.
For every candidate, the database selects that user's most relevant public impressions under the requested perspective.
perspective prevents identity leakage
AND perspective = $2
If the user is being searched as a professional, only professional impressions are used. If the user is being searched as a buyer, only buyer impressions are used.
This prevents a user's hiring preferences from affecting their ranking as a freelancer, and vice versa. The is_public check here serves as the second layer, ensuring that only explicitly public impression content enters the final result.
distance <= 0.55 is the second quality gate
A user may enter the top 50 because of relevant tags.
But if none of their actual impression texts are close enough to the query, the lateral join returns no rows for that user.
Because this is a LATERAL subquery joined with ON true, if no impressions match, the subquery produces no rows and the candidate is naturally filtered out.
This gives the search engine two quality gates:
- Tag-level semantic recall.
- Impression-level semantic verification.
LIMIT 10 keeps the results agent-friendly
The backend returns at most 10 relevant impressions per candidate.
That is enough context for the agent to explain the match and draft a proposal, but not so much that the agent's context becomes noisy.
The impressions are collected into a Markdown-style list:
string_agg('- ' || content, E'\n')
So the final impressions field looks like this:
- This developer has experience designing backend systems for early-stage SaaS products.
- They prefer collaboration where technical trade-offs are discussed explicitly.
- They value maintainable interfaces and strict type definitions.
This format is intentionally simple. The consumer is an AI agent, not a frontend table component. A compact Markdown list is easy for the agent to read and reason over.
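The Python equivalent of that `string_agg` call is a one-line join (sample impression strings are invented for illustration):

```python
impressions = [
    "This developer has experience designing backend systems for early-stage SaaS products.",
    "They value maintainable interfaces and strict type definitions.",
]

# Same shape as string_agg('- ' || content, E'\n') on the SQL side.
bullet_list = "\n".join(f"- {content}" for content in impressions)
```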
Step 8: Return an agent-readable payload
The server then returns:
id_column = "professional_id" if perspective == "Professional" else "buyer_id"

return [
    {
        "name": row["name"],
        "badge": row["badge"],
        "candidate_id": row[id_column],
        "description": row["description"],
        "impressions": row["string_agg"],
    }
    for row in rows
]
The final payload contains:
| Field | Purpose |
|---|---|
| name | Display name |
| badge | Subscription or trust-related marker |
| candidate_id | Identity-specific contact ID |
| description | General user description |
| impressions | Up to 10 query-relevant impression statements |
The agent can now decide which candidates are worth showing to the user.
It can say:
Candidate A looks relevant because their impressions mention early-stage SaaS architecture, backend reliability, and collaboration with technical founders.
Then, if the user wants to contact the candidate, the agent drafts:
- a collaboration proposal
- potential benefits for the candidate
After user confirmation, the agent calls:
ai_contact_candidate(access_token, candidate_id, proposal, benefits)
So the backend is not responsible for writing persuasive messages. It is responsible for retrieving semantically dense, high-signal candidate context.
In other words:
The server finds the right people.
The agent says the right thing.
Handling two embedding models
The production code also supports a two-model mode:
elif len(embedding_models) == 2:
    embeddings = await vectorize_contents(embedding_models[0], [query])
    other_embeddings = await vectorize_contents(embedding_models[1], [query])
    other_embedding_column = (
        "even_embedding"
        if embedding_column == "odd_embedding"
        else "odd_embedding"
    )
The schema has two embedding columns:
odd_embedding vector(1536),
even_embedding vector(1536)
with a constraint that exactly one is present:
CONSTRAINT check_odd_or_even_embedding
CHECK ((odd_embedding IS NULL) <> (even_embedding IS NULL))
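That CHECK constraint is an exclusive-or on NULL-ness, which a Python mirror makes explicit (a sketch; the real enforcement happens in PostgreSQL):

```python
def satisfies_embedding_constraint(odd_embedding, even_embedding) -> bool:
    """Python mirror of check_odd_or_even_embedding:
    exactly one of the two vectors is present (non-NULL)."""
    return (odd_embedding is None) != (even_embedding is None)
```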
When two embedding models are configured, the query is embedded with both models. The server searches both vector columns:
for embedding, column in [
    (embeddings[0], embedding_column),
    (other_embeddings[0], other_embedding_column),
]:
    query = f"""
        SELECT tag_id, distance
        FROM (
            SELECT tag_id, ({column} <=> $1::vector) AS distance
            FROM tags
        ) AS tag_distances
        WHERE distance <= 0.55
        ORDER BY distance ASC
        LIMIT 100
    """
The scoring logic remains the same. Matches from both embedding spaces contribute to the same record_scores dictionary.
The impression reranking query also searches both columns using UNION ALL:
SELECT content, ({embedding_column} <=> $3::vector) AS distance
FROM impressions
WHERE user_id = users.user_id
AND perspective = $2
AND is_public = TRUE
UNION ALL
SELECT content, ({other_embedding_column} <=> $4::vector) AS distance
FROM impressions
WHERE user_id = users.user_id
AND perspective = $2
AND is_public = TRUE
Then the combined results are filtered:
WHERE distance <= 0.55
ORDER BY distance ASC
LIMIT 10
This makes the search function tolerant of data encoded with either embedding model.
That is useful when rotating embedding models, migrating old vectors, or operating with more than one embedding source during a transition period.
Why not search all impressions globally?
A simpler version of this system would be:
Embed the query and run a global vector search over all impressions.
That sounds attractive, but it has drawbacks.
Impressions are longer and more numerous than tags. A single user may have many impressions, and each impression is a rich semantic statement. Searching the entire impression table globally would make the expensive part of the pipeline happen too early.
Instead, this system uses a two-stage approach:
The tag layer is a lightweight proxy for semantic recall.
The impression layer is used only after the candidate set has been reduced to a small number of users.
This gives the system a useful split:
| Stage | Purpose | Design |
|---|---|---|
| Tag recall | High recall, low cost | HNSW vector search over tags |
| User scoring | Rank likely candidates | Sum of similarity³ |
| Impression rerank | High precision | Per-user semantic filtering with LATERAL JOIN |
This is not just a performance optimization. It also improves the quality of results.
Tags help the system identify relevant candidates. Impressions help the system explain why they are a good fit.
Relevant indexes
Here are the indexes most relevant to this search path.
For tag vector search:
CREATE INDEX i_tags_odd_embedding
ON tags
USING hnsw (odd_embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);
CREATE INDEX i_tags_even_embedding
ON tags
USING hnsw (even_embedding vector_cosine_ops)
WITH (m = 32, ef_construction = 128);
For active-user filtering:
CREATE INDEX i_logins_updated_at_user_id
ON logins (updated_at, user_id);
For public tag-to-user lookup:
CREATE INDEX i_impression_tags_tag_user_public
ON impression_tags (tag_id, user_id)
WHERE is_public IS TRUE;
For per-user impression lookup:
CREATE INDEX i_impressions_user_perspective
ON impressions (user_id, perspective);
The impressions table is also partitioned by user_id:
CREATE TABLE impressions (
impression_id BIGINT GENERATED BY DEFAULT AS IDENTITY,
user_id BIGINT NOT NULL,
perspective VARCHAR(50) NOT NULL,
content TEXT NOT NULL,
odd_embedding vector(1536),
even_embedding vector(1536),
is_public BOOLEAN NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (user_id, impression_id)
) PARTITION BY RANGE (user_id);
This helps keep per-user impression retrieval predictable as the table grows.
The full mental model
Here is the full search pipeline, step by step:
- The agent sends a natural-language query.
- Embed the query into a vector.
- Find semantically close tags using pgvector cosine distance.
- Keep tags with distance <= 0.55, limiting to 100 tags.
- Reverse-lookup users via the impression_tags join table.
- Filter to include only public impressions.
- Filter to include only users active in the last month.
- Compute each user's score as the sum of similarity³.
- Take the top 50 users.
- Exclude the current user.
- For each remaining user, run a LATERAL JOIN against their impressions.
- Keep only impressions matching the requested perspective (Buyer or Professional).
- Keep only impressions with distance <= 0.55.
- Return up to 10 impressions per user.
- The agent receives the results, recommends candidates to the user, and drafts collaboration proposals.
Engineering takeaways
Here are the main design lessons from building this.
1. AI-agent APIs should return reasoning context, not just records
A normal search API might return IDs and display fields. An AI-agent search API should return evidence. That is why the response includes query-relevant impressions. The agent needs them to explain the match and write a personalized message.
2. Semantic tags are a useful middle layer
Raw keyword search is too brittle. Global impression vector search can be too expensive too early. Semantic tags give the system a compact recall layer between keywords and full documents.
3. Weak semantic matches should not dominate rankings
Linear scoring makes it easier for many weak matches to overpower fewer strong matches. Cubic scoring is a simple way to let strong matches dominate without completely discarding weaker supporting signals.
4. Separate identity perspectives matter
The same person can be a buyer in one context and a professional in another. Mixing these impressions would create strange matches. Keeping buyer_id, professional_id, and perspective separate makes the search context cleaner.
5. Auth failure should be part of the function contract
Returning None for expired authentication lets the agent recover by running the sign-in process. Returning False for other failures tells the agent not to blindly retry. For agent workflows, these distinctions matter.
Closing
Opportunity Skill is built around a simple belief:
In the AI-agent era, your profile should not only be readable by humans. It should be searchable, interpretable, and actionable by agents.
The search function described here is one part of that system. It turns a natural-language request into:
- semantically matched candidates
- identity-aware candidate IDs
- relevant profile evidence
- compact context that an AI agent can use to draft a proposal
If you want to try the skill, you can ask your agent to install it from:
https://github.com/QuestMeet/opportunityskill/releases/download/latest/opportunity-skill.zip

