
Evan Lin for Google Developer Experts


[Gemini 3.0][Google Search] Using the Google Search Grounding API with Gemini 3.0 Pro to Build a News and Information Assistant


Background

When developing my LINE Bot, I wanted to improve the plain-text search feature: let users type any question and have the AI automatically search the internet, organize an answer, and support follow-up questions. The traditional approach chained multiple APIs (Gemini to extract keywords → Google Custom Search → Gemini to summarize), which was not only slow (3 API calls) but also had no conversation memory.

Fortunately, Google launched the Grounding with Google Search feature in 2024. It is the official RAG (Retrieval-Augmented Generation) solution: the Gemini model automatically searches the internet and cites its sources, with native Chat Session support. The feature is provided through Vertex AI, so AI responses are grounded in real web information rather than the model's imagination.

Screen Display

[Screenshot: LINE conversation, 2025-12-11 09:29]

(For comparison: the results using the old Google Custom Search.)

You will see that the answers are now grounded in Google Search results.

Main repo: https://github.com/kkdai/linebot-helper-python

Problems Encountered During Development

Problem 1: Bottlenecks in the Old Implementation

When implementing loader/searchtool.py, I used the traditional search process:

# ❌ Old method - 3 API calls
async def handle_text_message(event, user_id):
    msg = event.message.text

    # 1st: Extract keywords
    keywords = extract_keywords_with_gemini(msg, api_key)

    # 2nd: Google Custom Search
    results = search_with_google_custom_search(keywords, search_api_key, cx)

    # 3rd: Summarize results
    summary = summarize_text(result_text, 300)

    # Return results...


This method has several obvious problems:

❌ No conversation memory - Each time is a new conversation, unable to ask questions continuously

User: "What is Python?"
Bot: [Search results + Summary]

User: "What are its advantages?" # ❌ Bot doesn't know "it" refers to Python


❌ Shallow search results - Only using snippets, unable to deeply read the content of the webpage

❌ Slow and costly - 3 API calls (~6-8 seconds) plus Google Custom Search fees ($0.005 per query)

Problem 2: Client Closed Error

When I switched to Vertex AI Grounding, I encountered this error:

ERROR:loader.chat_session:Grounding search failed: Cannot send a request, as the client has been closed.


The reason was that I created a local client variable in the function:

# ❌ Incorrect method - client will be garbage collected
def get_or_create_session(self, user_id):
    client = self._create_client() # Local variable
    chat = client.chats.create(...)
    return chat # The client is closed after the function ends!


After the function ends, client is garbage collected and closed, causing the chat session created based on it to be unusable.

Correct Solution

1. Using Vertex AI Grounding with Google Search

Google Search Grounding is the official RAG solution provided by Vertex AI, compared to the old Custom Search:

| Feature | Old Version (Custom Search) | New Version (Grounding) |
| --- | --- | --- |
| Number of API calls | 3 | 1 |
| Response speed | ~6-8 seconds | ~2-3 seconds |
| Conversation memory | ❌ No | ✅ Native support |
| Search quality | ⭐⭐⭐ (snippets) | ⭐⭐⭐⭐⭐ (complete webpage) |
| Source citation | Links only | Complete citations |
| Cost | Gemini + Custom Search | Vertex AI only |

2. Create a Chat Session Manager

First, I created loader/chat_session.py to manage chat sessions:

import logging
import os

from google import genai
from google.genai import types
from datetime import datetime, timedelta
from typing import Dict, Tuple, List

logger = logging.getLogger(__name__)

class ChatSessionManager:
    def __init__(self, session_timeout_minutes: int = 30):
        self.sessions: Dict[str, dict] = {}
        self.session_timeout = timedelta(minutes=session_timeout_minutes)

        # ✅ Key: Create a shared client instance (avoids the client closed error)
        self.client = self._create_client()

    def _create_client(self) -> genai.Client:
        """Create a Vertex AI client"""
        return genai.Client(
            vertexai=True,  # Enable Vertex AI
            project=os.getenv('GOOGLE_CLOUD_PROJECT'),
            location=os.getenv('GOOGLE_CLOUD_LOCATION', 'us-central1'),
            http_options=types.HttpOptions(api_version="v1")
        )

    def get_or_create_session(self, user_id: str) -> Tuple[object, List[dict]]:
        """Get or create the user's chat session"""
        now = datetime.now()

        # Check existing session
        if user_id in self.sessions:
            session_data = self.sessions[user_id]
            if not self._is_session_expired(session_data):
                session_data['last_active'] = now
                return session_data['chat'], session_data['history']

        # Create a new session with Google Search Grounding
        config = types.GenerateContentConfig(
            temperature=0.7,
            max_output_tokens=2048,
            # ✅ Enable Google Search
            tools=[types.Tool(google_search=types.GoogleSearch())],
        )

        # Use the shared self.client (will not be closed)
        chat = self.client.chats.create(
            model="gemini-2.0-flash",
            config=config
        )

        self.sessions[user_id] = {
            'chat': chat,
            'last_active': now,
            'history': [],
            'created_at': now
        }

        return chat, []


Key points of the fix:

  1. Shared client - self.client is created in __init__(), so its lifecycle matches the ChatSessionManager
  2. Automatic expiration - sessions expire after 30 minutes of inactivity
  3. Conversation isolation - each user's session is completely independent
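The snippet above calls self._is_session_expired() without showing its body. A minimal sketch of how such a check might look; the function name and the session layout (a last_active datetime field) are assumptions based on the fields used above, not code from the repo:

```python
from datetime import datetime, timedelta

def is_session_expired(session_data: dict, timeout: timedelta) -> bool:
    """Return True when the session has been idle longer than the timeout."""
    return datetime.now() - session_data['last_active'] > timeout

# Hypothetical sessions: one idle for 45 minutes, one just used
stale = {'last_active': datetime.now() - timedelta(minutes=45)}
fresh = {'last_active': datetime.now()}
timeout = timedelta(minutes=30)
print(is_session_expired(stale, timeout))  # True
print(is_session_expired(fresh, timeout))  # False
```

Comparing timestamps on read, rather than running a background timer, keeps the manager simple: an expired entry is simply replaced the next time the user sends a message.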

3. Implement Search and Answer Functions

Then implement the core function to search and answer using Grounding:

async def search_and_answer_with_grounding(
    query: str,
    user_id: str,
    session_manager: ChatSessionManager
) -> dict:
    """Search and answer questions using Vertex AI Grounding"""
    try:
        # Get or create chat session
        chat, history = session_manager.get_or_create_session(user_id)

        # Build prompt (Traditional Chinese + not using markdown)
        prompt = f"""Please answer the following questions in Traditional Chinese using Taiwanese terminology.
If you need the latest information, please search the internet and provide accurate answers.
Please provide detailed and useful answers, and ensure that the information sources are reliable.
Please do not use markdown format (do not use symbols such as **, ##, -, etc.). Use plain text to answer.

Question: {query}"""

        # Send message (Gemini will automatically decide whether to search)
        response = chat.send_message(prompt)

        # Record to history
        session_manager.add_to_history(user_id, "user", query)
        session_manager.add_to_history(user_id, "assistant", response.text)

        # Extract source citations
        sources = []
        if hasattr(response, 'candidates') and response.candidates:
            candidate = response.candidates[0]
            if hasattr(candidate, 'grounding_metadata'):
                metadata = candidate.grounding_metadata
                if hasattr(metadata, 'grounding_chunks'):
                    for chunk in metadata.grounding_chunks:
                        if hasattr(chunk, 'web'):
                            sources.append({
                                'title': chunk.web.title,
                                'uri': chunk.web.uri
                            })

        return {
            'answer': response.text,
            'sources': sources,
            'has_history': len(history) > 0
        }

    except Exception as e:
        logger.error(f"Grounding search failed: {e}")
        raise


Key features:

  • ✅ Gemini automatically decides when to search
  • ✅ Reads the complete webpage content (not just snippets)
  • ✅ Automatically extracts source citations
  • ✅ Supports continuous conversations (remembers context)
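One detail worth handling: grounding metadata can return several chunks that point at the same page, so the extracted sources list may contain duplicates. A small, hypothetical helper (not from the repo) that de-duplicates by URI before display, using the same {'title', 'uri'} dict shape built in the function above:

```python
def dedupe_sources(sources: list[dict]) -> list[dict]:
    """Drop duplicate citations, keeping the first occurrence of each URI."""
    seen = set()
    unique = []
    for src in sources:
        if src['uri'] not in seen:
            seen.add(src['uri'])
            unique.append(src)
    return unique

# Example: two chunks cite the same page
sources = [
    {'title': 'Python.org', 'uri': 'https://www.python.org/'},
    {'title': 'Python.org', 'uri': 'https://www.python.org/'},
    {'title': 'Wikipedia', 'uri': 'https://en.wikipedia.org/wiki/Python'},
]
print(dedupe_sources(sources))  # two entries remain
```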

4. Integrate into main.py

Integrate the Grounding function in main.py:

from loader.chat_session import (
    ChatSessionManager,
    search_and_answer_with_grounding,
    format_grounding_response,
    get_session_status_message
)

# Initialize Session Manager
chat_session_manager = ChatSessionManager(session_timeout_minutes=30)

async def handle_text_message(event: MessageEvent, user_id: str):
    """Handle plain text messages - using Grounding"""
    msg = event.message.text.strip()

    # Special instructions
    if msg.lower() in ['/clear', '/清除']:
        chat_session_manager.clear_session(user_id)
        reply_msg = TextSendMessage(text="✅ Conversation has been reset")
        await line_bot_api.reply_message(event.reply_token, [reply_msg])
        return

    if msg.lower() in ['/status', '/狀態']:
        status_text = get_session_status_message(chat_session_manager, user_id)
        reply_msg = TextSendMessage(text=status_text)
        await line_bot_api.reply_message(event.reply_token, [reply_msg])
        return

    # Use Grounding to search and answer
    try:
        result = await search_and_answer_with_grounding(
            query=msg,
            user_id=user_id,
            session_manager=chat_session_manager
        )

        response_text = format_grounding_response(result, include_sources=True)
        reply_msg = TextSendMessage(text=response_text)
        await line_bot_api.reply_message(event.reply_token, [reply_msg])

    except Exception as e:
        logger.error(f"Error in Grounding search: {e}", exc_info=True)
        error_text = "❌ Sorry, an error occurred while processing your question. Please try again later."
        reply_msg = TextSendMessage(text=error_text)
        await line_bot_api.reply_message(event.reply_token, [reply_msg])

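format_grounding_response is imported in main.py but not shown in this post. A minimal sketch of what it might do, assuming the result dict returned by search_and_answer_with_grounding; the exact formatting in the repo may differ:

```python
def format_grounding_response(result: dict, include_sources: bool = True) -> str:
    """Format a Grounding result into a plain-text LINE reply."""
    parts = []
    if result.get('has_history'):
        parts.append("💬 [In conversation]\n")  # mark multi-turn replies
    parts.append(result['answer'])
    if include_sources and result.get('sources'):
        parts.append("\n📚 Reference source:")
        for i, src in enumerate(result['sources'], 1):
            parts.append(f"{i}. {src['title']}\n   {src['uri']}")
    return "\n".join(parts)

result = {
    'answer': 'Python is a high-level programming language...',
    'sources': [{'title': 'Python official website',
                 'uri': 'https://www.python.org/'}],
    'has_history': False,
}
print(format_grounding_response(result))
```

Keeping this formatting in one pure function makes it easy to tweak the LINE display (e.g. hiding sources) without touching the Grounding logic.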

Practical Application Examples

The finished feature supports intelligent, multi-turn conversations:

Example 1: Basic Question and Answer

User: What is Python?
Bot: Python is a high-level, interpreted programming language created by Guido van Rossum in 1991...

     📚 Reference source:
     1. Python official website
        https://www.python.org/


Example 2: Continuous Conversation (Conversation Memory)

User: What is Python?
Bot: [Answer...]

User: What are its advantages? ✅ Bot knows "it" = Python
Bot: 💬 [In conversation]

     The main advantages of Python include:
     1. Concise and readable syntax
     2. Rich standard library
     ...


Example 3: Latest Information Search

User: Latest earthquake news in Japan
Bot: According to the latest information, in December 2025 Japan...
     [Gemini automatically searches the internet and organizes the latest information]

     📚 Reference source:
     1. Central Weather Bureau
     2. NHK News


Usage Scenarios

This approach is particularly well suited for:

  • 💬 Intelligent Customer Service - Automatically search for the latest product information
  • 📰 News Assistant - Track the latest current affairs
  • 🎓 Learning Assistant - Answer questions and provide reliable sources
  • 🔍 Research Assistant - Quickly search and organize information

Environment Setup

Required Environment Variables

# Vertex AI settings (required)
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1" # Optional, defaults to us-central1

# Authentication method (choose one)
# Method 1: Use ADC (development environment)
gcloud auth application-default login

# Method 2: Use Service Account (production environment)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"

# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com


Environment Variables No Longer Needed

Due to the switch to Grounding, the following environment variables are no longer needed:

# ❌ No longer needed
# SEARCH_API_KEY=...
# SEARCH_ENGINE_ID=...


This simplifies the configuration and also saves the cost of the Google Custom Search API!

Code Cleanup

Remove Old searchtool Code

Since Grounding is already in use, I performed code cleanup:

  1. main.py - Remove searchtool import
# ❌ Removed
# from loader.searchtool import search_from_text
# search_api_key = os.getenv('SEARCH_API_KEY')
# search_engine_id = os.getenv('SEARCH_ENGINE_ID')

# ✅ Added
logger.info('Text search using Vertex AI Grounding with Google Search')

  2. loader/searchtool.py - Marked as DEPRECATED
"""
⚠️ DEPRECATED: This module is no longer used in the main application.

The text search functionality has been replaced by Vertex AI Grounding
with Google Search, which provides better quality results and native
conversation memory.

This file is kept for reference or as a fallback option.
"""

  3. .env.example and README.md - Remove the Custom Search environment variable descriptions

Cleanup Results

| Item | Before Cleanup | After Cleanup |
| --- | --- | --- |
| Required environment variables | 4 | 2 |
| API calls | 3 | 1 |
| Code complexity | High | Low |
| Maintenance cost | High | Low |

Supported Model List

Currently supported Gemini models for Google Search Grounding:

  • ✅ Gemini 3.0 Pro (Preview) (Powerful)
  • ✅ Gemini 2.5 Pro
  • ✅ Gemini 2.5 Flash
  • ✅ Gemini 2.0 Flash (Recommended)
  • ✅ Gemini 2.5 Flash with Live API
  • ❌ Gemini 2.0 Flash-Lite (Does not support Grounding)

Performance Improvement

Speed Comparison

| Metric | Old Version | New Version | Improvement |
| --- | --- | --- | --- |
| Number of API calls | 3 | 1 | ⬇️ 66% |
| Response time | ~6-8 seconds | ~2-3 seconds | ⬇️ 60% |
| Search quality | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⬆️ Significantly improved |

Cost Analysis

Old Version Cost (per question and answer):

1. extract_keywords_with_gemini() → Gemini API
2. Google Custom Search → $0.005
3. summarize_text() → Gemini API
                                    ─────────
                                    Total: Gemini + $0.005


New Version Cost (per question and answer):

1. Grounding with Google Search → Vertex AI
                                    ─────────
                                    Total: Only Vertex AI


✅ Saves the Custom Search API cost
✅ Faster response speed
✅ Higher search quality

Things to Note Currently

1. Must Use Vertex AI

The Google Search Grounding feature does not support the general Gemini Developer API and must be accessed through Vertex AI.

2. Authentication Settings

  • Development environment: Use gcloud auth application-default login
  • Production environment: Use Service Account and set GOOGLE_APPLICATION_CREDENTIALS

3. Supported Models

Make sure to use a model that supports Grounding (such as gemini-2.0-flash or above), and avoid using the -lite version.
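A small defensive check at startup can catch a mis-configured model early. The list of unsupported variants here simply mirrors the table above and will change over time, so treat it as an assumption to keep current rather than an authoritative list:

```python
# Models known (at the time of writing) NOT to support Grounding
UNSUPPORTED_SUFFIXES = ("-lite",)

def check_grounding_support(model: str) -> None:
    """Fail fast if the configured model cannot use Google Search Grounding."""
    if model.endswith(UNSUPPORTED_SUFFIXES):
        raise ValueError(
            f"Model '{model}' does not support Google Search Grounding; "
            "use e.g. gemini-2.0-flash instead."
        )

check_grounding_support("gemini-2.0-flash")          # OK, no error
# check_grounding_support("gemini-2.0-flash-lite")   # would raise ValueError
```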

4. Client Lifecycle

Be sure to create a shared client instance in __init__() to avoid the "client closed" error.

5. Prompt Optimization

In the prompt, clearly indicate:

  • Use Traditional Chinese
  • Do not use markdown format (if plain text is needed)
  • Provide reliable sources

Development Experience

1. Grounding is a Game Changer

Moving from the traditional "keyword extraction → API search → result summary" pipeline to Grounding's single API call brings not only technical simplification but a qualitative change in user experience:

Technical Level:

  • ✅ Code volume reduced by 70% (from 3 functions to 1)
  • ✅ API calls reduced by 66% (from 3 times to 1 time)
  • ✅ Response time shortened by 60% (from 6-8 seconds to 2-3 seconds)

User Experience:

  • ✅ Support continuous conversations (finally able to understand what "it" refers to!)
  • ✅ Automatically cite sources (increase credibility)
  • ✅ More in-depth information (complete webpage vs. short snippet)

2. Client Lifecycle Management is Important

The "client closed" error I encountered initially taught me: When using the google-genai SDK, the client should be a long-lived object, not a new one created every time.

# ❌ Error: client will be garbage collected
def create_session():
    client = genai.Client(...)
    chat = client.chats.create(...)
    return chat # client is closed, chat cannot be used

# ✅ Correct: Shared client instance
class Manager:
    def __init__(self):
        self.client = genai.Client(...) # Create only once

    def create_session(self):
        return self.client.chats.create(...) # Reuse


This lesson applies to all SDKs that need to manage long connections.
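The failure mode is easy to reproduce with a toy stand-in for an SDK client. This is purely illustrative (FakeClient, Chat, and Manager are made-up names, and the real google-genai client's teardown differs in detail), but it shows why a session that borrows a short-lived client breaks, while one backed by a shared client keeps working:

```python
class FakeClient:
    """Stand-in for an SDK client that owns a connection."""
    def __init__(self):
        self.closed = False
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        self.closed = True  # connection torn down when the with-block exits

class Chat:
    """Stand-in for a chat session that borrows the client's connection."""
    def __init__(self, client):
        self.client = client
    def send(self, msg):
        if self.client.closed:
            raise RuntimeError("client has been closed")
        return f"echo: {msg}"

# ❌ Short-lived client: the returned chat outlives its transport
def bad_create():
    with FakeClient() as client:
        return Chat(client)

# ✅ Shared client: lifecycle tied to the long-lived manager object
class Manager:
    def __init__(self):
        self.client = FakeClient()  # created once, never torn down early
    def create_session(self):
        return Chat(self.client)

broken = bad_create()
try:
    broken.send("hi")
except RuntimeError as err:
    print("broken session:", err)

print(Manager().create_session().send("hi"))  # echo: hi
```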

3. RAG Doesn't Have to Be Implemented by Yourself

In the past, we needed to implement RAG (Retrieval-Augmented Generation) ourselves:

  1. Use embedding to create a vector database
  2. Implement similarity search
  3. Inject the search results into the prompt
  4. Manage the context window

But Google Search Grounding has already done all of this for us! It:

  • ✅ Automatically determines when to search
  • ✅ Uses Google's search engine (much better than doing it ourselves)
  • ✅ Reads the complete webpage and extracts important information
  • ✅ Automatically cites sources

Conclusion: If your RAG requirement is "search for internet information", just use Grounding, don't reinvent the wheel.

4. Session Management is Simpler Than You Think

When implementing conversation memory, I originally thought I needed:

  • Redis persistence
  • Complex context management
  • Manually maintain conversation history

But in reality, the Gemini Chat API natively supports multi-turn conversations! Just need:

chat = client.chats.create(...)
chat.send_message("Question 1") # Round 1
chat.send_message("Question 2") # Round 2 (automatically remembers Round 1)


I only need to do:

  • Store the chat object in memory
  • Regularly clear expired sessions
  • Provide the /clear command

Simple, efficient, and reliable!
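"Regularly clear expired sessions" can be as simple as a sweep over the session dict. A sketch assuming the last_active field and timeout used by the ChatSessionManager above (the function name is mine, not the repo's):

```python
from datetime import datetime, timedelta

def cleanup_expired_sessions(sessions: dict, timeout: timedelta) -> int:
    """Remove sessions idle longer than the timeout; return how many were removed."""
    now = datetime.now()
    expired = [uid for uid, s in sessions.items()
               if now - s['last_active'] > timeout]
    for uid in expired:
        del sessions[uid]
    return len(expired)

# Hypothetical state: one stale user, one active user
sessions = {
    'user_a': {'last_active': datetime.now() - timedelta(minutes=45)},
    'user_b': {'last_active': datetime.now()},
}
removed = cleanup_expired_sessions(sessions, timedelta(minutes=30))
print(removed, list(sessions))  # 1 ['user_b']
```

Calling this periodically (or on each incoming message) keeps memory usage bounded without Redis or any external store.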

5. The Importance of Prompt Optimization

The initial responses contained a lot of markdown format (**bold**, ## title), which is not aesthetically pleasing when displayed on LINE. Just add a line in the prompt:

prompt = f"""...
Please do not use markdown format (do not use symbols such as **, ##, -, etc.). Use plain text to answer.
Question: {query}"""


The problem is solved! This made me realize: Good prompt design is as important as good code.

6. Learn from Failure

During this development process, I experienced:

  1. ❌ Using Custom Search → Found it too slow and shallow
  2. ✅ Switching to Grounding → But encountered the client closed error
  3. ✅ Fixing the client lifecycle → Found the markdown format problem
  4. ✅ Optimizing the prompt → Perfect!

Each problem is an opportunity to learn. If I had succeeded from the start, I wouldn't have learned so much about SDK design, lifecycle management, and prompt engineering.

Summary

If you are developing an AI application that requires search functionality:

  • Prioritize Grounding - Much simpler than implementing RAG yourself
  • Pay attention to Client Lifecycle - Avoid unnecessary repeated creation
  • Make good use of Chat Session - Native conversation memory is very powerful
  • Invest in Prompt Optimization - Small changes bring big improvements

Google Search Grounding is definitely worth a try!

Test Steps

1. Start the Application

# Confirm that the environment variables have been set
export GOOGLE_CLOUD_PROJECT=your-project-id

# Restart the application
uvicorn main:app --reload


2. Test Basic Functions

Test in LINE:

Send: What is Python?
Expected: ✅ Receive detailed answer + source

Send: What are its advantages?
Expected: ✅ See "💬 [In conversation]" mark, Bot knows "it" = Python

Send: /status
Expected: ✅ Display conversation status

Send: /clear
Expected: ✅ Display "Conversation has been reset"


3. Check the Logs

Should see:

INFO:main:Text search using Vertex AI Grounding with Google Search
INFO:loader.chat_session:Creating new session for user ...
INFO:loader.chat_session:Sending message to Grounding API ...


Should not see:

ERROR:loader.chat_session:Grounding search failed: Cannot send a request, as the client has been closed.


Related Documents

Detailed technical documentation in the project:

  • TEXT_SEARCH_IMPROVEMENT.md - Complete solution analysis and comparison
  • GROUNDING_IMPLEMENTATION.md - Implementation guide and acceptance checklist
  • CLIENT_CLOSED_FIX.md - Client lifecycle error repair
  • SEARCHTOOL_CLEANUP.md - Code cleanup summary
