From Text File to Smart Search: Building RAG with Gemini in Minutes

#webdev #ai #gemini #tutorial

Retrieval-Augmented Generation (RAG) has a reputation for being powerful once it works but intimidating to set up. Typically, you find yourself juggling vector stores, embedding models, chunking strategies, and a host of other moving parts. Google’s File Search tool in the Gemini API removes most of that work: it offers a managed RAG pipeline, so you do not have to build and maintain the underlying infrastructure yourself.

This guide walks you through the full journey—from installation to executing your first query—using simple steps and a complete working script.

Before You Begin

To follow along smoothly, you will need:

Python 3.9 or later (google-genai does not support 3.8; check the package's PyPI page for the current minimum)
A Gemini API key from Google AI Studio
Comfort with running Python files from the terminal

Once these pieces are ready, the entire process becomes remarkably straightforward.

Step 1: Install the Required Package

Google bundles everything you need inside a single Python package. Install it by running:

pip install google-genai

This gives you direct access to the Gemini models as well as the File Search tooling.

Step 2: Generate Your API Key

Visit Google AI Studio and create a new API key. This key authenticates your requests, so store it securely. The script in Step 4 contains a YOUR_API_KEY_HERE placeholder that you replace with your key; avoid committing the real key to version control.
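
Hard-coding the key is convenient for a quick test, but reading it from an environment variable keeps it out of your source files. Here is a minimal sketch of that approach. The variable name GEMINI_API_KEY and the load_api_key helper are assumptions made for illustration; neither appears in the script in this guide.

```python
import os


def load_api_key(var_name: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    `load_api_key` is a hypothetical helper for this guide, not part of the SDK.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running the script."
        )
    return key


# Possible usage in the script, instead of a hard-coded string:
# client = genai.Client(api_key=load_api_key())
```

Failing early with a clear message is usually easier to debug than an authentication error that surfaces on the first API call.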

Step 3: Prepare a Sample Document

For demonstration purposes, create a file named sample.txt and place it in the same directory as your Python script. Here is an example based on Stephen Hawking’s well-known popular-science book:

A Brief History of Time is a popular-science book written by Stephen Hawking and first 
published in 1988. The book explores fundamental questions about the universe, including 
the nature of space and time, the origin of the cosmos, and the behavior of black holes.

Hawking presents complex scientific ideas in a way that general readers can follow, 
introducing concepts such as the Big Bang, singularities, and quantum mechanics. 
The narrative blends scientific explanation with Hawking's reflections on humanity’s 
attempt to understand the structure of the universe.

This serves as the source document for the RAG example.

Step 4: Full Working Script

Below is the complete Python code that showcases the File Search workflow end-to-end. It creates a File Search store, uploads your document, waits for processing, sends a question, prints the results, and finally deletes the temporary store.

Create a file named file_search_demo.py:

from google import genai
from google.genai import types
import time

client = genai.Client(api_key="YOUR_API_KEY_HERE")

print("Starting File Search test...\n")

print("1. Creating File Search store...")
file_search_store = client.file_search_stores.create(
    config={'display_name': 'test-file-search-store'}
)
print(f"   Created store: {file_search_store.name}\n")

print("2. Uploading and importing sample.txt file...")
operation = client.file_search_stores.upload_to_file_search_store(
    file='sample.txt',
    file_search_store_name=file_search_store.name,
    config={
        'display_name': 'A Brief History of Time - Sample',
    }
)

print("   Waiting for import to complete...")
while not operation.done:
    time.sleep(2)
    operation = client.operations.get(operation)

print("   Import completed!\n")

print("3. Querying the file with a question about Stephen Hawking...")
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Tell me about Stephen Hawking and what he discusses in A Brief History of Time.",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(
                file_search=types.FileSearch(
                    file_search_store_names=[file_search_store.name]
                )
            )
        ]
    )
)

print("Response from Gemini API:")
print("-" * 50)
print(response.text)
print("-" * 50)

print("\nCitation metadata:")
if response.candidates and response.candidates[0].grounding_metadata:
    print(response.candidates[0].grounding_metadata)
else:
    print("No grounding metadata available")

print("\n4. Cleaning up - deleting File Search store...")
client.file_search_stores.delete(
    name=file_search_store.name,
    config={'force': True}
)
print(f"   Deleted store: {file_search_store.name}")

print("\nTest completed successfully!")


Step 5: Run the Script

Once your API key is in place and the sample file is saved:

python file_search_demo.py

The script will create the store, upload your document, run the query, and return a grounded response based on your file’s contents. You will also see metadata and confirmation messages indicating each step of the workflow.
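
The wait loop in file_search_demo.py polls until the import reports done, with no upper bound on how long it waits. If you would rather fail after a fixed time, a bounded variant could look like the sketch below. The wait_until_done helper and its default timeout are assumptions for illustration, not part of the SDK; it only relies on the operation exposing a done attribute, as the script already does.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until_done(
    operation: T,
    refresh: Callable[[T], T],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
) -> T:
    """Poll `operation` until its `done` attribute is truthy or the timeout expires.

    `refresh` takes the current operation and returns an updated one; in the
    demo script that role is played by `client.operations.get`.
    """
    deadline = time.monotonic() + timeout_s
    while not operation.done:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Operation not finished after {timeout_s} seconds")
        time.sleep(interval_s)
        operation = refresh(operation)
    return operation


# Possible usage in the script, replacing the bare while loop:
# operation = wait_until_done(operation, client.operations.get)
```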

What Happens Inside File Search

Although the developer experience is simple, the system behind the scenes is doing a lot of heavy lifting. When you upload a document, it is automatically divided into manageable segments. These pieces are then transformed into embeddings that capture semantic meaning. The File Search Store acts as a managed vector database, allowing the system to identify which segments best match a query.

When you ask a question, your prompt undergoes the same embedding process. The API retrieves the most relevant chunks based on similarity and feeds them into the model as context. This grounds the generated answer in your documents rather than leaving the model to rely on general knowledge alone.
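
To make the retrieval step concrete, here is a toy illustration of "embed, compare, pick the best chunk". It is not how File Search is implemented: the service uses learned embedding models, while this sketch substitutes simple word counts and cosine similarity so that it runs with the standard library alone. The function names embed, cosine, and retrieve are made up for the example.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (real systems use learned vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Return the `top_k` chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


chunks = [
    "A Brief History of Time was first published in 1988.",
    "The book explores the behavior of black holes.",
    "Hawking introduces the Big Bang and quantum mechanics.",
]
best = retrieve("What does the book say about black holes?", chunks)
print(best[0])  # the chunk about black holes
```

In a real pipeline the selected chunks would then be passed to the model as context alongside the question.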

Supported File Types

The tool works with a wide range of formats: text, PDF, Word documents, spreadsheets, presentations, and even code files in languages such as Python, JavaScript, Java, and C++. Individual files can go up to 100 MB, which is more than enough for most document-centric applications.
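
Checking the size locally before uploading lets you catch oversized files without a round trip to the API. This is a small sketch, assuming the 100 MB limit mentioned above means 100 * 1024 * 1024 bytes; the check_upload_size helper is hypothetical and not part of the SDK.

```python
from pathlib import Path

# 100 MB limit as stated in this guide; assumed here to mean 100 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 100 * 1024 * 1024


def check_upload_size(path: str, limit: int = MAX_FILE_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds `limit`."""
    size = Path(path).stat().st_size
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, above the {limit}-byte limit")
    return size


# Possible usage before the upload call:
# check_upload_size("sample.txt")
```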

Going Further: Chunking and Metadata

If you want more influence over how documents are processed, File Search provides options for custom chunking. Here is an example:

operation = client.file_search_stores.upload_to_file_search_store(
    file_search_store_name=file_search_store.name,
    file='sample.txt',
    config={
        'chunking_config': {
            'white_space_config': {
                'max_tokens_per_chunk': 200,
                'max_overlap_tokens': 20
            }
        }
    }
)
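
To build intuition for what a chunk-size limit and an overlap do, the sketch below splits text on whitespace and repeats the last few words of each chunk at the start of the next. It is only an illustration: it counts words rather than model tokens, and it is not the algorithm the service runs. The chunk_words function is a made-up name for this example.

```python
from typing import List


def chunk_words(text: str, max_tokens: int, overlap: int) -> List[str]:
    """Split text into whitespace-delimited chunks of at most `max_tokens` words,
    where each chunk repeats the last `overlap` words of the previous one."""
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


print(chunk_words("a b c d e f g h", max_tokens=4, overlap=1))
# ['a b c d', 'd e f g', 'g h']
```

Smaller chunks give more precise matches but less surrounding context per match; the overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.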

You can also attach metadata to documents and filter results based on those attributes:

custom_metadata=[
    {"key": "author", "string_value": "Stephen Hawking"},
    {"key": "year", "numeric_value": 1988}
]

You can later restrict a query to documents whose metadata matches a filter on those attributes.
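
If your document attributes already live in a plain dictionary, a small converter can produce the key/value entry list in the shape shown above. This is a sketch under assumptions: the to_custom_metadata helper is hypothetical, and it only handles the two value kinds that appear in the example (strings and numbers).

```python
from typing import Any, Dict, List


def to_custom_metadata(attrs: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert {"author": "...", "year": 1988} into a list of key/value entries."""
    entries = []
    for key, value in attrs.items():
        if isinstance(value, bool):
            # bool is a subclass of int in Python, so reject it explicitly.
            raise TypeError(f"{key!r}: booleans are not handled by this sketch")
        if isinstance(value, (int, float)):
            entries.append({"key": key, "numeric_value": value})
        elif isinstance(value, str):
            entries.append({"key": key, "string_value": value})
        else:
            raise TypeError(f"{key!r}: unsupported value type {type(value).__name__}")
    return entries


print(to_custom_metadata({"author": "Stephen Hawking", "year": 1988}))
```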

Where File Search Shines

This setup is ideal whenever information lives across multiple files: support documentation, legal resources, research articles, engineering notes, or educational materials. File Search allows you to build systems that can understand and respond to questions by locating exactly the relevant parts of your content.

Conclusion

What makes File Search compelling is its minimal setup. Developers can get a functioning RAG system running in minutes without assembling a backend stack of vector databases or embedding systems. By abstracting away infrastructure concerns, the Gemini API lets you focus on solving real problems—whether you are building a search assistant, a research companion, or an internal tool for your team.

If you want to explore the full code example and sample files, you can find them at the repository linked in the original content. Once you understand the basics here, you can experiment with different documents and create more advanced applications on top of this managed RAG pipeline.

This approach marks a shift toward simpler, more accessible document intelligence. With File Search, constructing a grounded, context-aware system becomes an achievable task for any developer familiar with Python.