I built the missing UI for Gemini's File Search (managed RAG) API

Retrieval Augmented Generation (RAG) has become the standard architecture for building AI apps that know about your specific data.

Usually, building a RAG pipeline involves a lot of moving parts: spinning up a vector database (like Pinecone or Weaviate), writing Python scripts to chunk your PDFs, generating embeddings, and managing the retrieval logic.

Google recently released a feature called Gemini File Search that simplifies this drastically.

It is a fully managed RAG pipeline built directly into the Gemini API. It handles the chunking, embedding, and storage for you. On top of that, the pricing model is arguably its most compelling feature.

Unlike traditional vector databases where you often pay for hourly pod usage or storage size, Gemini File Search offers free storage and free embedding generation at query time.

You only pay a one-time fee when you first index the documents (currently $0.15 per 1 million tokens) and the standard input/output token costs for the model's response. To put that in perspective, indexing a 300-page PDF (roughly 150,000 tokens) costs about two cents, once. This makes it incredibly cost-effective for side projects and scaling applications alike, as you are not bleeding money on idle vector storage.

But there is a catch.

It is completely "headless". There is no console to view your files, no drag-and-drop uploader, and no way to test a knowledge base without writing a script to do it. If you want to delete a file or check if your chunking strategy is working, you have to write code.
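To make that concrete: before the UI, even a simple "what is in my store?" check meant a throwaway script like the one below. This is a minimal sketch against the v1beta REST endpoints as I read the docs; double-check the paths against the current API reference.

```typescript
// list-stores.ts — the kind of throwaway script this tool replaces.
// Assumes GEMINI_API_KEY is set in the environment.
const BASE = "https://generativelanguage.googleapis.com/v1beta";
const headers = { "x-goog-api-key": process.env.GEMINI_API_KEY! };

async function main() {
  // List every File Search store on the account.
  const { fileSearchStores = [] } = await fetch(`${BASE}/fileSearchStores`, { headers })
    .then((r) => r.json());

  for (const store of fileSearchStores) {
    console.log(store.displayName ?? store.name);
    // List the documents inside the store to inspect ingestion state.
    const { documents = [] } = await fetch(`${BASE}/${store.name}/documents`, { headers })
      .then((r) => r.json());
    for (const doc of documents) {
      console.log(`  - ${doc.displayName} (${doc.state})`);
    }
  }
}

main().catch(console.error);
```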

I got tired of writing throwaway scripts just to manage my knowledge bases, so I built the Gemini File Search Manager.

Gemini File Search Manager Dashboard

What is Gemini File Search Manager?

This is an open-source, local-first web interface that acts as a control plane for the Gemini File Search API. It is built with Next.js and allows you to manage your RAG pipeline visually.

You can check out the code and run it locally here:
👉 https://github.com/prashantrohilla-max/gemini-file-search-manager

Here is why I built it and the problems it solves:

1. Visualizing the "Black Box"

When you use the File Search API programmatically, you are often flying blind. You create a "Store," upload a file, and hope it processed correctly.

I wanted a dashboard where I could see exactly what Stores I have active, how many documents are in them, and their ingestion status. The UI lists all your stores and provides a clear view of your active knowledge bases—including counts for active, pending, and failed documents.

2. Drag-and-Drop Ingestion (No more scripts)

Uploading a file via the API is a multi-step process. You have to upload the file bytes, wait for the operation to complete, and then link it to a store.

I built a drag-and-drop interface that handles this orchestration for you. It supports PDF, TXT, MD, CSV, JSON, and the 100+ other file formats Gemini accepts.

Upload Interface
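Under the hood, that orchestration looks roughly like the following with the @google/genai SDK. This is a sketch based on the documented upload flow; the API is new, so treat the exact method names as subject to change.

```typescript
import { GoogleGenAI } from "@google/genai";

// The SDK picks up GEMINI_API_KEY from the environment.
const ai = new GoogleGenAI({});

async function uploadToStore(storeName: string, filePath: string) {
  // Step 1: push the file bytes into the store. This kicks off
  // chunking and embedding and returns a long-running Operation.
  let operation = await ai.fileSearchStores.uploadToFileSearchStore({
    file: filePath,
    fileSearchStoreName: storeName,
    config: { displayName: filePath },
  });

  // Step 2: poll until the document is fully indexed.
  while (!operation.done) {
    await new Promise((resolve) => setTimeout(resolve, 3000));
    operation = await ai.operations.get({ operation });
  }

  return operation;
}
```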

One of the most powerful features of the Gemini API is Custom Chunking and Metadata. Usually, configuring these requires constructing complex JSON objects in your code. I added a UI for this directly in the upload flow. You can now easily experiment with different maxTokensPerChunk and maxOverlapTokens settings, or add metadata tags (like author or year) to filter your searches later.
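For reference, the upload config the UI builds from those form fields looks roughly like this. Field names follow the File Search docs as I read them; verify against the current reference, and the values here are just illustrative.

```typescript
// Hypothetical values — what the upload form serializes for one file.
const uploadConfig = {
  displayName: "paper.pdf",
  chunkingConfig: {
    whiteSpaceConfig: {
      maxTokensPerChunk: 200, // upper bound on each chunk
      maxOverlapTokens: 20,   // overlap between consecutive chunks
    },
  },
  customMetadata: [
    { key: "author", stringValue: "Smith" },
    { key: "year", numericValue: 2020 },
  ],
};
```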

3. The RAG Playground

This is the most important feature. Once your data is uploaded, you need to know if the model can actually find it.

I built a dedicated "Playground" view for every Store. It allows you to:

  • Chat with your specific documents using conversational history.
  • Select different models (Gemini 3 Pro Preview, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.5 Flash, or Gemini 2.0 Flash).
  • Filter by metadata using AIP-160 syntax (e.g., author = "Smith" AND year > 2020).
  • View Citations: The UI parses the groundingMetadata from the API response and displays exactly which document chunks the AI used to generate the answer.

RAG Playground
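Behind that form, a playground query boils down to a single generateContent call with the fileSearch tool attached. Here is a sketch of the documented tool shape; the exact response fields may differ slightly.

```typescript
const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: "What does the paper conclude?",
  config: {
    tools: [
      {
        fileSearch: {
          fileSearchStoreNames: [storeName],
          // Same AIP-160 syntax the Playground filter box accepts.
          metadataFilter: 'author = "Smith" AND year > 2020',
        },
      },
    ],
  },
});

// The citations the UI renders come from here.
const grounding = response.candidates?.[0]?.groundingMetadata;
console.log(grounding?.groundingChunks);
```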

Under the Hood: The Tech Stack

For those interested in how this is built, I kept the stack modern and lightweight:

  • Framework: Next.js 16 (App Router with React 19)
  • Styling: Tailwind CSS 4 + shadcn/ui
  • State Management: TanStack Query
  • SDK: @google/genai

Solving the Async Polling Challenge

One interesting technical challenge was handling the file ingestion status. When you upload a file to Gemini, it doesn't become active immediately. The API returns an Operation object, and the file enters a PROCESSING state.

To handle this in the UI without freezing the browser, I implemented a polling mechanism. After the initial upload completes, the app polls the operations endpoint every 3 seconds in the background. Once the embeddings are ready, it automatically invalidates the cache and updates the UI—the document status changes from a spinner to a green checkmark.
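In TanStack Query terms, the poller looks something like this. It is a simplified sketch: the /api/operations route name is illustrative, not necessarily the app's exact internal API.

```typescript
import { useQuery, useQueryClient } from "@tanstack/react-query";

// Polls a pending ingestion operation until it reports done,
// then invalidates the document list so the UI refreshes.
function useIngestionStatus(operationName: string) {
  const queryClient = useQueryClient();

  return useQuery({
    queryKey: ["operation", operationName],
    queryFn: async () => {
      const res = await fetch(`/api/operations/${encodeURIComponent(operationName)}`);
      const op = await res.json();
      if (op.done) {
        // Embeddings are ready: spinner flips to a green checkmark.
        queryClient.invalidateQueries({ queryKey: ["documents"] });
      }
      return op;
    },
    // Keep polling every 3s while the operation is still running.
    refetchInterval: (query) => (query.state.data?.done ? false : 3000),
  });
}
```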

For the document list itself, TanStack Query handles background refetching every 5 seconds to catch any status changes.
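That part is nearly a one-liner with TanStack Query (again with an illustrative route name):

```typescript
const documents = useQuery({
  queryKey: ["documents", storeName],
  queryFn: () =>
    fetch(`/api/stores/${storeName}/documents`).then((r) => r.json()),
  // Background refetch catches status changes the poller might miss.
  refetchInterval: 5000,
});
```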

Streaming Chat Responses

The chat playground uses Server-Sent Events (SSE) to stream responses in real-time. As the model generates text, it appears character by character in the UI. When the stream completes, the grounding metadata (citations) is extracted and displayed below the response.
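A simplified version of that route handler is below. The citation extraction at stream end is omitted here, so treat it as a sketch rather than the app's exact code.

```typescript
// app/api/chat/route.ts
import { GoogleGenAI } from "@google/genai";

export async function POST(req: Request) {
  const { message, storeName } = await req.json();
  const ai = new GoogleGenAI({});

  const stream = await ai.models.generateContentStream({
    model: "gemini-2.5-flash",
    contents: message,
    config: {
      tools: [{ fileSearch: { fileSearchStoreNames: [storeName] } }],
    },
  });

  const encoder = new TextEncoder();
  const body = new ReadableStream({
    async start(controller) {
      // Forward each text delta to the client as one SSE event.
      for await (const chunk of stream) {
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify({ text: chunk.text })}\n\n`)
        );
      }
      controller.close();
    },
  });

  return new Response(body, {
    headers: { "Content-Type": "text/event-stream" },
  });
}
```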

Security

Since this is a tool for developers, I didn't want to deal with user accounts or databases. The app runs locally and uses your environment variables.

You simply create a .env.local file with your GEMINI_API_KEY, and the app uses that on the server side. Your key never leaves your machine and is never exposed to the client browser.
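Concretely, it is the standard Next.js pattern: the SDK client is only ever constructed in server code, so the key stays in process.env.

```typescript
// lib/gemini.ts — only imported from route handlers, never client components.
import { GoogleGenAI } from "@google/genai";

export function getClient() {
  // Read at runtime on the server; never bundled for the browser.
  return new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
}
```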

Quick Start

If you want to try this out, you can have it running in about 2 minutes.

1. Clone the repo

```bash
git clone https://github.com/prashantrohilla-max/gemini-file-search-manager
cd gemini-file-search-manager
npm install
```

2. Add your API Key

Create a .env.local file:

```bash
GEMINI_API_KEY=your_key_here
```

3. Run it

```bash
npm run dev
```

Open http://localhost:3000, and you are ready to go.

What's Next?

I am planning to add support for Structured Outputs to test data extraction workflows and potentially a feature to import content directly from URLs.

Currently, chat sessions are lost if you navigate away from the playground or stop the server; persistent chat sessions are also on the roadmap.

This project is open source and MIT licensed. If you find it useful, please give it a star on GitHub or submit a PR if you want to improve it!

Repo: https://github.com/prashantrohilla-max/gemini-file-search-manager
