<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ursula Kümpel</title>
    <description>The latest articles on DEV Community by Ursula Kümpel (@ursula_kmpel_a5f788a283c).</description>
    <link>https://dev.to/ursula_kmpel_a5f788a283c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3067585%2F4448784b-6897-46ba-90c6-83e5feb69ba5.png</url>
      <title>DEV Community: Ursula Kümpel</title>
      <link>https://dev.to/ursula_kmpel_a5f788a283c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ursula_kmpel_a5f788a283c"/>
    <language>en</language>
    <item>
      <title>Building a Conversational Sightseeing Agent with LangGraph, Gemini, and Google Maps</title>
      <dc:creator>Ursula Kümpel</dc:creator>
      <pubDate>Sun, 20 Apr 2025 15:36:34 +0000</pubDate>
      <link>https://dev.to/ursula_kmpel_a5f788a283c/building-a-conversational-sightseeing-agent-with-langgraph-gemini-and-google-maps-ede</link>
      <guid>https://dev.to/ursula_kmpel_a5f788a283c/building-a-conversational-sightseeing-agent-with-langgraph-gemini-and-google-maps-ede</guid>
      <description>&lt;p&gt;Planning a trip to a new city can be exciting, but often overwhelming. Traditional guidebooks offer static information, and online searches can drown you in generic recommendations. What if you had a personal travel assistant you could chat with, one that understands your interests, helps you build a custom itinerary, remembers your preferences, and even provides directions?&lt;/p&gt;

&lt;p&gt;That's exactly the idea behind the Sightseeing Agent I developed for my Capstone project. The goal was to move beyond static travel planning and create a dynamic, personalized experience using the power of Generative AI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem: Static Plans vs. Dynamic Exploration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Standard travel planning often involves:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Generic Information:&lt;/strong&gt; Guidebooks or websites list popular spots but don't tailor suggestions to your specific interests (e.g., "I love baroque architecture" or "I need wheelchair-accessible options").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual Itinerary Building:&lt;/strong&gt; Juggling maps, opening hours, and personal preferences across different sources is tedious.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lack of Adaptability:&lt;/strong&gt; Plans are often rigid. What if a place is unexpectedly closed, or you suddenly feel like visiting a park instead of a museum? Static plans don't adapt easily.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Memory:&lt;/strong&gt; Standard tools don't remember your preferences or the conversation context from one planning session (or even one query) to the next.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The Solution: A Conversational AI Agent&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I aimed to solve these problems by building an AI agent that acts as an interactive sightseeing planner, specifically for Erlangen, Germany (though the concept is adaptable). Users can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chat Naturally:&lt;/strong&gt; Ask questions about available attractions, history, or interests in plain English.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get Personalized Suggestions:&lt;/strong&gt; The agent leverages Google's Gemini LLM to understand requests and can use tools to fetch relevant, up-to-date information.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build an Itinerary:&lt;/strong&gt; Users can ask the agent to add specific places to their plan.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specify Preferences:&lt;/strong&gt; Tell the agent about interests (history, parks, etc.) or needs (accessibility), which it remembers for future suggestions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review and Finalize:&lt;/strong&gt; The agent helps confirm the plan before finishing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get Directions:&lt;/strong&gt; Once the plan is finalized, the agent uses the Google Maps Directions API to provide walking directions between the chosen locations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How Gen AI Makes It Happen: The Tech Stack&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent is built using Python and relies on a few key technologies:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Google Gemini:&lt;/strong&gt; The core Large Language Model (LLM) provides the natural language understanding and generation capabilities, allowing for fluid conversation. It also acts as the "brain" deciding when to call external tools.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangGraph:&lt;/strong&gt; This library from LangChain is crucial for building reliable, stateful AI applications. It allows us to define the agent's workflow as a graph, explicitly managing the flow of information and actions between different steps (nodes).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangChain:&lt;/strong&gt; Provides the foundational components, including tool definitions, message types, and LLM integrations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google Maps APIs (Places &amp;amp; Directions):&lt;/strong&gt; External tools connected to the agent provide real-world data:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Places API:&lt;/strong&gt; Fetches details about tourist attractions (summary, opening hours, accessibility hints).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Directions API:&lt;/strong&gt; Calculates routes between locations in the finalized itinerary.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Implementation Highlights: State, Tools, and Graphs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's look at a few code concepts that make this work:&lt;/p&gt;

&lt;h3&gt;1. Managing State with TypedDict and MemorySaver:&lt;/h3&gt;

&lt;p&gt;The agent needs to remember things. LangGraph manages this through a state object. We define its structure using Python's TypedDict and use LangGraph's built-in message handling and a checkpointer for persistence.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from typing import Annotated, Any, Dict
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages
from langchain_core.messages import BaseMessage
from langgraph.checkpoint.memory import MemorySaver # For persistence

class TravelPlanState(TypedDict):
    """State representing the user's sightseeing plan conversation."""
    # Conversation history (appended automatically)
    messages: Annotated[list[BaseMessage], add_messages]
    # User's plan details
    itinerary: list[str]
    preferences: Dict[str, Any]
    # Flag to control workflow end
    finished: bool

# Checkpointer setup (done during graph compilation)
# memory = MemorySaver()
# graph = graph_builder.compile(checkpointer=memory)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



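&lt;p&gt;With the MemorySaver checkpointer, the agent can pick up a conversation where it left off: each conversation is identified by a unique thread_id, and the saved state includes the itinerary, preferences, and messages. Because MemorySaver keeps its checkpoints in memory, this persistence lasts only as long as the Python process is running.&lt;/p&gt;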
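
&lt;p&gt;To make the idea of per-thread state concrete, here is a minimal plain-Python sketch. It is not LangGraph's implementation: InMemoryCheckpointer, append_messages, and run_turn are hypothetical names used only for illustration, and messages are plain strings rather than BaseMessage objects.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from copy import deepcopy


def append_messages(existing, new):
    """Reducer stand-in: append new messages instead of overwriting."""
    return list(existing) + list(new)


class InMemoryCheckpointer:
    """Hypothetical stand-in: keeps one state snapshot per thread_id."""

    def __init__(self):
        self._snapshots = {}

    def load(self, thread_id):
        # Unknown threads start from an empty plan.
        default = {"messages": [], "itinerary": [], "preferences": {}, "finished": False}
        return deepcopy(self._snapshots.get(thread_id, default))

    def save(self, thread_id, state):
        self._snapshots[thread_id] = deepcopy(state)


def run_turn(checkpointer, thread_id, user_message):
    """Load the thread's state, record one user message, and save it again."""
    state = checkpointer.load(thread_id)
    state["messages"] = append_messages(state["messages"], [user_message])
    checkpointer.save(thread_id, state)
    return state


memory = InMemoryCheckpointer()
run_turn(memory, "trip-1", "I love baroque architecture")
state = run_turn(memory, "trip-1", "Add the Botanical Garden to my plan")
other = run_turn(memory, "trip-2", "Hello")
print(len(state["messages"]), len(other["messages"]))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second turn on "trip-1" sees the first message, while "trip-2" starts fresh. With a compiled LangGraph graph, the thread is selected by passing a config such as {"configurable": {"thread_id": "trip-1"}} when invoking the graph.&lt;/p&gt;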

&lt;h3&gt;2. Defining Tools with @tool:&lt;/h3&gt;

&lt;p&gt;We give the LLM capabilities by defining Python functions as tools. The @tool decorator tells LangChain what each function is for (taken from its docstring) and which arguments it accepts (taken from its signature).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain_core.tools import tool

# Example Tool: Getting available places (stateless info retrieval)
@tool
def get_available_places() -&amp;gt; str:
    """
    Retrieves a list of tourist attractions 
    in Erlangen from Google Maps Places API...
    Returns the list as a JSON string.
    """
    # ... (Code using googlemaps client to call Places API) ...
    # Returns JSON string
    pass  # Implementation omitted for brevity

# Example Tool: Adding to itinerary (modifies state)
@tool
def add_place_to_itinerary(place: str) -&amp;gt; str:
    """Adds the specified place to the user's itinerary."""
    # NOTE: The actual state update happens in 'plan_node',
    # this definition is just for the LLM to know the tool exists.
    return f"Placeholder: Added {place}"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
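
&lt;p&gt;As a rough illustration of the kind of information a tool definition exposes to the LLM, the sketch below builds a description from a function's name, docstring, and parameters using only the standard library. It is a conceptual stand-in, not how LangChain implements @tool; describe_tool is a hypothetical helper.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import inspect


def describe_tool(func):
    """Build a simple description dict from name, docstring, and parameters."""
    signature = inspect.signature(func)
    params = {}
    for name, param in signature.parameters.items():
        annotation = param.annotation
        # Fall back to "any" when a parameter has no type annotation.
        if annotation is inspect.Parameter.empty:
            params[name] = "any"
        else:
            params[name] = getattr(annotation, "__name__", str(annotation))
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
    }


def add_place_to_itinerary(place: str):
    """Adds the specified place to the user's itinerary."""
    return f"Placeholder: Added {place}"


print(describe_tool(add_place_to_itinerary))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This is why clear docstrings and typed arguments matter: they are the main signal the model has when deciding which tool to call and with what input.&lt;/p&gt;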



&lt;h3&gt;3. Orchestrating with LangGraph Nodes:&lt;/h3&gt;

&lt;p&gt;The workflow is a graph where nodes perform actions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;agent&lt;/strong&gt; node calls the LLM (llm_with_tools, which knows about the defined tools).&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;route_agent&lt;/strong&gt; router function (attached to the agent node through conditional edges) inspects the agent's output. If the agent decided to call get_available_places, the router directs flow to a built-in ToolNode. If it called add_place_to_itinerary, it routes to our custom plan_node. If the plan is finished, it routes to get_directions_node or END. Otherwise, it routes to the user_input node to wait for the next message.&lt;/li&gt;
&lt;li&gt;The custom &lt;strong&gt;plan_node&lt;/strong&gt; safely executes the state-changing logic (like appending to the itinerary list) based on the tool call requested by the agent.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;get_directions_node&lt;/strong&gt; runs only after finalization to call the Directions API.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Simplified conceptual flow in the graph builder:
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode

graph_builder = StateGraph(TravelPlanState)

# Add nodes for agent, user input, executing tools, updating plan, getting directions
graph_builder.add_node("agent", sightseeing_agent_with_tools)
graph_builder.add_node("user_input", user_input_node)
graph_builder.add_node("tools", ToolNode(info_retrieval_tools)) # Executes get_available_places
graph_builder.add_node("plan", plan_node)                  # Executes add_place_to_itinerary etc.
graph_builder.add_node("directions", get_directions_node)  # Executes Directions API call

# Define entry point and edges (some conditional based on router output)
graph_builder.add_edge(START, "agent")
graph_builder.add_conditional_edges("agent", route_agent, {...}) # Routes to tools, plan, user_input, directions, or END
graph_builder.add_conditional_edges("user_input", should_exit, {...}) # Routes to agent or END
graph_builder.add_edge("tools", "agent") # Return to agent after tool execution
graph_builder.add_edge("plan", "agent")  # Return to agent after plan update
graph_builder.add_edge("directions", END) # End after showing directions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
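
&lt;p&gt;The routing and plan-update logic is not shown above, so here is a minimal sketch of how it could look. It uses plain dictionaries instead of LangChain message objects so that it runs on its own. The message shape (a "tool_calls" list of entries with "name" and "args") and the function bodies are assumptions for illustration; only the node names come from the graph builder above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INFO_TOOLS = {"get_available_places"}
PLAN_TOOLS = {"add_place_to_itinerary"}


def route_agent(state):
    """Pick the next node name from the last message's tool calls."""
    last = state["messages"][-1]
    names = {call["name"] for call in last.get("tool_calls", [])}
    if names.intersection(PLAN_TOOLS):
        return "plan"
    if names.intersection(INFO_TOOLS):
        return "tools"
    if state.get("finished"):
        return "directions"
    return "user_input"


def plan_node(state):
    """Apply itinerary-changing tool calls and return only the changed keys."""
    itinerary = list(state["itinerary"])
    results = []
    for call in state["messages"][-1].get("tool_calls", []):
        if call["name"] == "add_place_to_itinerary":
            place = call["args"]["place"]
            if place not in itinerary:  # Skip duplicates.
                itinerary.append(place)
            results.append({"role": "tool", "content": f"Added {place}"})
    return {"itinerary": itinerary, "messages": results}


state = {
    "messages": [
        {"role": "ai", "tool_calls": [
            {"name": "add_place_to_itinerary", "args": {"place": "Botanical Garden"}},
        ]},
    ],
    "itinerary": [],
    "preferences": {},
    "finished": False,
}
next_node = route_agent(state)
update = plan_node(state)
print(next_node, update["itinerary"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the state change in plan_node rather than in the tool itself means the LLM only requests a change, while ordinary Python code decides how it is applied.&lt;/p&gt;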



&lt;p&gt;Making every step and transition explicit keeps the agent's behaviour predictable and much easier to debug.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitations and Challenges&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;While powerful, this approach isn't without limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLM Imperfections:&lt;/strong&gt; Gemini, like any LLM, might occasionally misunderstand a nuanced request, misinterpret context, or fail to call the correct tool. More sophisticated prompting or fine-tuning could help but adds complexity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Reliance &amp;amp; Costs:&lt;/strong&gt; The agent heavily relies on external APIs (Gemini, Maps). This incurs costs and is dependent on API availability and quotas. Error handling for API failures is essential.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context Window:&lt;/strong&gt; While we use windowed memory, very long conversations might still lose early context not captured in the preferences or itinerary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-Interactive Finalization:&lt;/strong&gt; The current simulation uses input() within the finalize_plan logic inside plan_node. This works interactively but fails in the non-interactive test setup. A real deployment would need a proper UI callback mechanism.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Visuals:&lt;/strong&gt; The agent currently only describes places and directions textually; it doesn't display maps or images directly in the chat.&lt;/li&gt;
&lt;/ul&gt;
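
&lt;p&gt;One possible way around the input() limitation is to pass the confirmation step in as a callable, so that a test or a UI layer can supply its own answer. The sketch below assumes a simplified finalize_plan signature and illustrative place names; the real function in the project may look different.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def finalize_plan(itinerary, confirm=input):
    """Ask for confirmation through an injectable callable (defaults to input)."""
    if not itinerary:
        return False
    summary = ", ".join(itinerary)
    answer = confirm(f"Finalize this plan: {summary}? (yes/no) ")
    return answer.strip().lower() in {"y", "yes"}


# Non-interactive use: tests or a UI layer pass their own callable.
accepted = finalize_plan(["Palace Garden", "City Museum"], confirm=lambda prompt: "yes")
rejected = finalize_plan(["Palace Garden"], confirm=lambda prompt: "no")
print(accepted, rejected)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Run interactively, the default still falls back to input(), so the console behaviour stays the same.&lt;/p&gt;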

&lt;p&gt;&lt;strong&gt;Future Possibilities: The Art of the Possible&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This Capstone project lays a foundation. Exciting future enhancements could include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multimodality:&lt;/strong&gt; Allow users to upload images of places they like or receive map snapshots/photos directly in the chat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Booking Integration:&lt;/strong&gt; Connect to hotel or event booking APIs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proactive Suggestions:&lt;/strong&gt; Have the agent suggest activities based on the time of day, weather (via another tool!), or user location.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deeper Preference Learning:&lt;/strong&gt; Store and analyze user preferences over multiple trips.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Voice Interaction:&lt;/strong&gt; Enable users to talk to the agent instead of typing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enhanced Error Handling:&lt;/strong&gt; Implement more robust loops for clarifying ambiguous user requests or handling tool errors gracefully.&lt;/li&gt;
&lt;/ul&gt;
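
&lt;p&gt;As a starting point for the error-handling idea, a tool call could be wrapped in a small retry helper that backs off between attempts and, if all attempts fail, returns a message the agent can relay instead of crashing. This is only a sketch: call_with_retries and flaky_places_lookup are hypothetical names, and the simulated failure stands in for a real API error.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import time


def call_with_retries(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func, retrying on exceptions with exponential backoff.

    Returns (ok, value): the result on success, or an error message
    the agent can relay after the final failed attempt.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return True, func()
        except Exception as exc:  # Broad on purpose for this sketch.
            last_error = exc
            if attempt != max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    return False, f"Tool failed after {max_attempts} attempts: {last_error}"


attempts = {"count": 0}


def flaky_places_lookup():
    """Hypothetical tool call that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] != 3:
        raise ConnectionError("temporary API error")
    return '["Botanical Garden"]'


# The sleep function is replaced here so the example runs instantly.
ok, value = call_with_retries(flaky_places_lookup, sleep=lambda seconds: None)
print(ok, value)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;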

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Building this Sightseeing Agent demonstrated the incredible potential of combining large language models like Gemini with structured workflow tools like LangGraph and real-world data from external APIs. By managing state effectively and defining clear operational steps, we can create AI applications that move beyond simple Q&amp;amp;A to become genuinely helpful, personalized, and dynamic assistants for complex tasks like travel planning. While challenges remain, the possibilities for more intuitive and intelligent interactions are vast.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
