If you’re building a modern application that relies on semantic search, recommendations, or matching systems—for instance, a job portal or talent platform—you’ll need a way to store and search high-dimensional vectors (embeddings). Pinecone is a powerful vector database built precisely for this task.
In this post, I’ll show you how to:
✅ Generate embeddings from candidate profiles using Google’s Gemini API
✅ Store those embeddings in Pinecone for fast retrieval
✅ Integrate this process into an Express.js backend
We’ll walk through the real working code behind a Candidate Registration API.
Why Store Embeddings?
Let’s say you’re building a platform where:
- Candidates upload their resumes
- Employers post job descriptions
- You want to recommend matching candidates for each job posting
Instead of relying purely on keyword matching, you convert text data into embeddings—a mathematical representation of text. These embeddings allow you to find semantically similar documents quickly.
Pinecone is a specialized database that stores and searches these vectors efficiently at scale.
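Under the hood, "semantically similar" means two vectors point in nearly the same direction, which is usually measured with cosine similarity. As a rough illustration (plain TypeScript, independent of the code we build below):
// Cosine similarity between two embedding vectors: ~1 = very similar, ~0 = unrelated
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
Pinecone does this kind of comparison for you, across millions of vectors, via an approximate nearest-neighbor index.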
Tech Stack
- Node.js + Express — Our backend server
- Prisma — Database ORM for PostgreSQL/MySQL
- Google Gemini API — To generate text embeddings
- Pinecone — Vector database to store and query embeddings
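The code below reads its credentials from environment variables. A minimal .env might look like this (the variable names match the code; the values and the index name are placeholders, and DATABASE_URL is whatever your Prisma setup already uses):
GEMINI_API_KEY=your-gemini-api-key
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_INDEX_NAME=candidate-profiles
DATABASE_URL=postgresql://user:password@localhost:5432/yourdb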
The Flow
Here’s how the full flow works in our code:
- Candidate completes registration
- You save their info in your relational DB (via Prisma)
- You generate embeddings for their skills & responsibilities
- You upsert those vectors into Pinecone
Let’s look at the code that makes it happen.
1. Setting up Pinecone
Install the Pinecone client:
npm install @pinecone-database/pinecone
Then initialize Pinecone like this:
// services/pinecone.ts
import { Pinecone } from "@pinecone-database/pinecone";
export const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!,
});
export const pineconeIndex = pinecone.index(process.env.PINECONE_INDEX_NAME!);
We also define a type to represent vectors:
export interface MyVector {
  id: string;
  values: number[];
  metadata?: Record<string, any>;
}
2. Generating Embeddings with Gemini
Next, let’s generate embeddings from text. For example, we might embed this:
"JavaScript, Node.js, Express, REST API, Agile development"
We use Google Gemini’s embedding model for this.
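If you haven’t already, install the SDK (it’s the package imported in the next snippet):
npm install @google/genai
The embedding helper then looks like this: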
import { GoogleGenAI } from "@google/genai";
const gemini = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY!,
});

export async function getEmbedding(text: string): Promise<number[]> {
  const result = await gemini.models.embedContent({
    model: "text-embedding-004",
    contents: text,
  });

  const emb = result.embeddings;
  if (!emb || emb.length === 0 || !emb[0].values) {
    throw new Error("No embeddings returned from Gemini API");
  }

  console.log(`Embedding dimension: ${emb[0].values.length}`);
  return emb[0].values;
}
This returns a high-dimensional vector representation of your text.
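A quick sanity check (inside any async function), using the sample text from above:
const vector = await getEmbedding(
  "JavaScript, Node.js, Express, REST API, Agile development"
);
console.log(vector.length); // 768 for text-embedding-004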
3. Candidate Registration Flow
When a user completes registration, we:
- Save their info to the database
- Generate embeddings for their profile
- Upsert those embeddings into Pinecone
Here’s the relevant controller:
export class CandidateController {
  static async completeRegistration(req: AuthRequest, res: Response) {
    const validatedData = candidateRegistrationSchema.parse(req.body);

    const user = await prisma.user.findUnique({
      where: { id: req.user?.userId },
      include: { candidate: true },
    });

    if (user?.candidate) {
      return res.status(400).json({
        success: false,
        message: "User has already filled the details",
      });
    }

    // Save candidate + profile data in a single transaction
    const result = await prisma.$transaction(async (prisma) => {
      const candidateResult = await prisma.candidate.create({
        data: {
          userId: req.user?.userId!,
          about: validatedData.about,
          location: validatedData.location,
        },
      });

      await prisma.education.createMany({
        data: validatedData.education.map((ed) => ({
          educationId: candidateResult.userId,
          name: ed.name,
          degree: ed.degree,
          startYear: ed.startYear,
          endYear: ed.endYear,
        })),
      });

      const profileEntries = [];
      for (const prf of validatedData.profile) {
        const entry = await prisma.profile.create({
          data: {
            profileid: candidateResult.userId,
            title: prf.title,
            skills: prf.skills,
            experienceYear: prf.experienceYear,
            responsibilitiesHandled: prf.responsibilitiesHandled,
          },
        });
        profileEntries.push(entry);
      }

      await prisma.user.update({
        where: { id: req.user?.userId! },
        data: { isRoleDefined: true },
      });

      return { candidateResult, profileEntries };
    });

    const { candidateResult, profileEntries } = result;

    // Generate embeddings for each profile entry
    const upsertPayload: MyVector[] = await Promise.all(
      profileEntries.map(async (pf) => {
        const skillsText = pf.skills.join(", ");
        const responsibilitiesText = pf.responsibilitiesHandled.join(", ");
        const fullText = skillsText + responsibilitiesText;

        const embeddings = await getEmbedding(fullText);

        return {
          id: pf.id,
          values: embeddings,
          metadata: {
            userId: candidateResult.userId,
          },
        };
      })
    );

    // Upsert to Pinecone
    await upsertVectors(upsertPayload);

    return res.status(200).json({
      success: true,
      message: "Candidate Registration completed Successfully",
    });
  }
}
4. Upserting Embeddings into Pinecone
Finally, we store embeddings in Pinecone:
export async function upsertVectors(
  vectors: MyVector[],
  namespace?: string
) {
  console.log(vectors);

  const pineconeRecords = vectors.map((vector) => ({
    id: vector.id,
    values: vector.values,
    metadata: vector.metadata,
  }));

  // A namespace is targeted on the index handle, not on individual records
  const target = namespace ? pineconeIndex.namespace(namespace) : pineconeIndex;
  const response = await target.upsert(pineconeRecords);
  return response;
}
This makes your vectors instantly searchable for future queries like:
- Finding similar profiles
- Matching candidates to job descriptions
- Building recommendation engines
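As a taste of what querying looks like, here’s a sketch of a matching helper (findMatchingProfiles is hypothetical and not part of the registration flow above; it reuses getEmbedding and pineconeIndex):
// Hypothetical helper: embed a job description and return the closest candidate profiles
export async function findMatchingProfiles(jobDescription: string, topK = 5) {
  const queryVector = await getEmbedding(jobDescription);

  const results = await pineconeIndex.query({
    vector: queryVector,
    topK,
    includeMetadata: true,
  });

  // Each match includes the profile id, a similarity score, and the userId we stored as metadata
  return results.matches;
}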
How It Looks in Practice
When your candidate registers, your logs might show:
Embedding dimension: 768
[
  {
    id: 'profile123',
    values: [0.123, 0.445, …],
    metadata: { userId: 'user456' }
  }
]
Upsert completed!
Boom. You now have a vector database of candidate profiles.
Why Use Pinecone?
✅ Scalable storage of millions of vectors
✅ Millisecond search latency
✅ Easy integration via official SDKs
✅ Hosted and managed
Compared to storing vectors directly in your relational DB (e.g. Postgres), Pinecone is designed for fast vector similarity search and high-dimensional data at scale.
Wrapping Up
Congrats! You now have:
- Candidate profiles stored in your relational DB
- Embeddings generated from text via Gemini
- Vectors upserted into Pinecone for blazing-fast similarity search
This architecture can power:
- Job/candidate matching
- Personalized recommendations
- Smart search features
Feel free to grab the code above as a starting point for your own project.
Happy building! 🚀