Have you ever wanted to ask questions about a PDF document without uploading it to a third-party service? Whether it's a dense research paper, a lengthy legal document, or a company report, interacting with your files through a conversational AI can be a game-changer.
Today, we'll build a command-line application that lets you do just that. We will combine the power of Google's Genkit, an open-source AI framework, with Gaia, a platform for running local, privacy-preserving AI models.
By the end of this tutorial, you'll have a script that can ingest any PDF and let you chat with it directly from your terminal, ensuring your data never leaves your machine.
This tutorial is inspired by the "Chat with a PDF" tutorial in the official Genkit documentation.
The Tech Stack: Genkit + Gaia
Before we dive into the code, let's briefly introduce our main tools.
What is Genkit?
Genkit is an open-source framework from Google designed to simplify the development of production-ready AI applications. Think of it as the "Express.js" or "Next.js" for the AI world. It provides the tools to build, test, and run AI flows, manage prompts, and switch between different models (like OpenAI's, Google's Gemini, or, in our case, a local one). Its modularity makes it incredibly powerful for creating robust AI-powered features.
What is Gaia?
Gaia is a decentralized network that lets anyone run powerful AI models (LLMs) on their own hardware. The Gaia Node provides an OpenAI-compatible API, which means you can connect to your local, private models as easily as you would connect to ChatGPT. This is perfect for privacy-sensitive tasks, offline usage, and avoiding high API costs.
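Because the API is OpenAI-compatible, the request shape is the familiar chat-completions payload. Here is a minimal sketch of what that means in practice; the URL and model name are placeholders for your own node's values, and nothing is actually sent:

```typescript
// Sketch of what "OpenAI-compatible" means: the node accepts the same
// /chat/completions request shape as the OpenAI API. No request is sent
// here; we only build it.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function buildChatRequest(
  baseURL: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: `${baseURL}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const { url, init } = buildChatRequest(
  'https://your-node.gaia.domains/v1', // placeholder node URL
  'gaia',
  'Qwen3-4B-Q5_K_M',
  [{ role: 'user', content: 'Hello!' }]
);
// Once your node is running: const res = await fetch(url, init);
```

Any OpenAI client library can be pointed at such an endpoint by overriding its base URL, which is exactly what Genkit's compatibility plugin does for us below.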
For this project, Genkit will orchestrate our AI flow, and Gaia will provide the private, local brainpower.
Step 1: Project Setup
First, let's set up our Node.js project. Open your terminal and run the following commands.
- Create a project directory:

mkdir genkit-gaia-pdf-chat
cd genkit-gaia-pdf-chat
- Initialize a Node.js project:

npm init -y
- Install the necessary dependencies:

We need genkit and its OpenAI compatibility plugin @genkit-ai/compat-oai, pdf-parse to read the PDF, and dotenv to manage our configuration.

npm install genkit @genkit-ai/compat-oai pdf-parse dotenv
- Install development dependencies for TypeScript:

We'll use tsx to run our TypeScript code directly.

npm install -D typescript tsx @types/node
- Create a source directory and our main file:

mkdir src
touch src/index.ts
Step 2: Configure Your Environment
To keep our code clean and secure, we'll store our Gaia Node configuration in an environment file.
Create a file named .env in the root of your project:
# .env
# Gaia Node Configuration
GAIA_BASE_URL="https://0x34d996c47b2c00e39b37dfd70aac5beaaf25d2c8.gaia.domains/v1"
GAIA_API_KEY="gaia"
GAIA_MODEL_NAME="Qwen3-4B-Q5_K_M"
- GAIA_BASE_URL: The URL of your running Gaia Node.
- GAIA_API_KEY: The API key for your node (usually "gaia" by default).
- GAIA_MODEL_NAME: The name of the model you have downloaded and are running on your node.
If you don't have a Gaia Node running yet, follow the simple instructions in the official Gaia documentation.
Step 3: Writing the Chat Application Code
Now for the fun part! Open src/index.ts and paste the following code. We'll break down what each part does below.
// src/index.ts
import { genkit, modelRef } from 'genkit/beta';
import { openAICompatible } from '@genkit-ai/compat-oai';
import { PDFParse } from 'pdf-parse';
import fs from 'fs';
import { createInterface } from 'node:readline/promises';
import * as dotenv from 'dotenv';
// Load environment variables from .env file
dotenv.config();
// 1. Initialize Genkit and configure the Gaia Node plugin
const ai = genkit({
plugins: [
openAICompatible({
name: 'my-custom-llm',
apiKey: process.env.GAIA_API_KEY || 'gaia',
baseURL: process.env.GAIA_BASE_URL,
}),
],
});
// 2. Create a reference to the local model
export const myLocalModel = modelRef({
name: `my-custom-llm/${process.env.GAIA_MODEL_NAME}`,
});
// Main application logic wrapped in an async function
(async () => {
let parser: PDFParse | undefined;
try {
// 3. Get the PDF filename from command-line arguments
const filename = process.argv[2];
if (!filename) {
console.error('Please provide a filename as a command line argument.');
process.exit(1);
}
// 4. Load and parse the PDF file to extract text
let dataBuffer = fs.readFileSync(filename);
parser = new PDFParse({ data: dataBuffer });
const data = await parser.getText();
const text = data.text;
// 5. Construct the system prompt for the model
const prefix = "You are an expert assistant. Answer the user's questions based on the content of the provided PDF file.";
const prompt = `
${prefix}
Context:
${text}
`;
// 6. Start the chat session with Genkit
const chat = ai.chat({ system: prompt, model: myLocalModel });
const readline = createInterface({ input: process.stdin, output: process.stdout });
// 7. Add a graceful exit handler for Ctrl+C
readline.on('SIGINT', () => {
console.log('\nGoodbye!');
readline.close();
process.exit(0);
});
console.log("You're chatting with Qwen (via Gaia). Ctrl-C to quit.\n");
// 8. Start the interactive chat loop
while (true) {
const userInput = await readline.question('> ');
const { text: reply } = await chat.send(userInput);
console.log(reply);
}
} catch (error: any) {
if (error?.code !== 'ABORT_ERR') {
console.error('An error occurred:', error);
}
} finally {
// Free up memory used by the PDF parser
if (parser) {
await parser.destroy();
}
}
})();
Code Breakdown
- Initialize Genkit: We configure genkit to use the openAICompatible plugin, feeding it the URL and API key for our Gaia Node from the .env file.
- Model Reference: We create a reference to the specific model we want to use on our node.
- Get Filename: The script grabs the PDF file path you provide as a command-line argument.
- Parse PDF: We use the fs module to read the PDF into a buffer and the pdf-parse library to extract its raw text content.
- Construct Prompt: This is the key to retrieval-augmented generation (RAG). We create a "system prompt" that instructs the LLM on its role and provides the entire text of the PDF as Context.
- Start Chat: ai.chat() initializes a new conversational session with our context-loaded prompt.
- Graceful Exit: This small but important piece of code listens for a CTRL+C signal to close the application cleanly.
- Chat Loop: An infinite while loop waits for your input, sends it to the chat session via chat.send(), and prints the model's response.
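One caveat with this stuff-the-whole-PDF approach: a very long document can exceed your local model's context window. A crude, optional guard is to cap the extracted text before building the prompt. This is a sketch; the 4-characters-per-token ratio is a rough heuristic, not a real tokenizer, and the right limit depends on how your node is configured:

```typescript
// Rough guard against overflowing the model's context window. The
// 4-characters-per-token ratio is a crude heuristic; adjust maxTokens
// to whatever context size your Gaia Node actually runs with.
function truncateForContext(text: string, maxTokens: number): string {
  const approxChars = maxTokens * 4;
  if (text.length <= approxChars) return text;
  return text.slice(0, approxChars) + '\n\n[...document truncated...]';
}

// Usage before building the system prompt:
// const safeText = truncateForContext(data.text, 8000);
```

A more robust approach would chunk the document and retrieve only relevant passages per question, but simple truncation keeps this tutorial's single-prompt design intact.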
Step 4: Chat with Your PDF!
You are now ready to run the application. Grab a PDF file, place it in your project directory (or provide a path to it), and run the following command in your terminal:
# Replace 'presentation.pdf' with the name of your file
npx tsx src/index.ts presentation.pdf
Your terminal will spring to life, and you can start asking questions.
This project perfectly illustrates the power of a modern AI development stack:
- Genkit provided the high-level framework to structure our AI logic cleanly.
- Gaia gave us instant, private access to a powerful LLM with a simple, standard API.
From here, the possibilities are endless. You could adapt this script into a web application, add support for more document types, or experiment with different models on your Gaia Node. Happy building!