Build a Practical AI Agent with Gemma 4, Real Tools, and a Local LLM

How to combine Tavily, OpenWeatherMap, and LangGraph into a clean local-first workflow

Most local LLM demos are either too simple to be useful or too complex to reuse.

I wanted something in the middle: a practical AI agent that can search the web, check live weather, and still keep the reasoning layer local. So I built a workflow using LangGraph, Tavily, OpenWeatherMap, and a local LLM running through Ollama.

The result is a clean local-first architecture where tools provide real-time facts and the model turns them into useful answers.

Why this matters

Local LLMs are great for privacy, experimentation, and cost control. But on their own, they have no access to live weather or fresh web results.

That is where external tools become valuable.

In this setup:

  • Tavily provides current web search results
  • OpenWeatherMap provides live weather data
  • the local LLM handles reasoning and phrasing
  • LangGraph orchestrates the workflow

This combination creates something much more useful than a plain local chatbot.
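
A quick note on setup before the code: the examples below assume the langgraph, langchain-ollama, tavily-python, and requests packages are installed (pip install langgraph langchain-ollama tavily-python requests should cover them, though package names can drift over time), plus TAVILY_API_KEY and OPENWEATHER_API_KEY set in your environment.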

Architecture at a Glance

Before looking at the code, here is the core workflow.

Rather than asking the model to do everything in one opaque step, I split the system into clear responsibilities:

  • a router node decides what kind of request it is
  • a Tavily node fetches fresh web results
  • an OpenWeatherMap node fetches live weather data
  • a local LLM node combines those results into the final answer

That makes the system easier to debug, easier to extend, and much easier to reason about.

The key design principle is simple: tools provide facts, the local LLM provides reasoning.

Why this architecture works

This pattern works because it separates responsibility cleanly.

The model is no longer expected to hallucinate current weather or pretend it knows the latest web information. Instead, tools provide live facts and the LLM focuses on synthesis, reasoning, and response generation.

That separation makes the system:

  • more reliable
  • easier to debug
  • easier to evolve over time
  • more practical for real applications

Example use cases

This kind of agent can answer questions like:

  • “What is LangGraph and why is it useful?”
  • “What is the weather in Amsterdam today?”
  • “Summarize today’s AI news and tell me whether I need a jacket this evening.”
  • “Give me a quick weather summary and travel advice for Rotterdam.”

That is where the stack becomes more interesting than a typical chatbot.

State schema

A simple state schema can look like this:

from typing_extensions import TypedDict

class AgentState(TypedDict):
    user_input: str
    intent: str
    city: str
    search_results: str
    weather_data: str
    response: str

This keeps the workflow explicit and makes it easier to inspect what each node is contributing.
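
One detail worth knowing: each node returns only the keys it wants to update, and LangGraph merges that partial dictionary back into the shared state. A minimal hypothetical node shows the pattern:

def set_city_node(state: AgentState):
    # Return only the keys being updated. LangGraph merges this partial
    # dict into AgentState and leaves every other key untouched.
    return {"city": "Amsterdam,NL"}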

Router node

The router decides whether the request needs:

  • web search
  • weather lookup
  • both

Here is a simple keyword-based version:

def router_node(state: AgentState):
    text = state["user_input"].lower()

    has_weather = any(
        word in text for word in ["weather", "temperature", "rain", "forecast", "jacket"]
    )
    has_search = any(
        word in text for word in ["what is", "latest", "news", "search", "explain", "why"]
    )

    if has_weather and has_search:
        return {"intent": "both"}
    elif has_weather:
        return {"intent": "weather"}
    else:
        return {"intent": "search"}

This is intentionally simple. A more advanced version could use an LLM-based classifier or structured intent routing.
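
For example, a rough LLM-based router might look like this (a sketch that reuses the ChatOllama setup shown later in this post; the function name is mine):

from langchain_ollama import ChatOllama

router_llm = ChatOllama(model="gemma4:e4b", temperature=0)

def llm_router_node(state: AgentState):
    # Ask the local model to classify the request into one of three intents.
    prompt = (
        "Classify this request as exactly one of: search, weather, both. "
        "Reply with only that single word.\n\n"
        f"Request: {state['user_input']}"
    )
    label = router_llm.invoke(prompt).content.strip().lower()
    # Fall back to plain search if the model replies with anything unexpected.
    return {"intent": label if label in {"search", "weather", "both"} else "search"}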

Tavily search node

Tavily is a clean option for search in AI workflows because it returns structured results that are easy to pass into the next node.

from tavily import TavilyClient
import os

tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

def tavily_search_node(state: AgentState):
    result = tavily.search(
        query=state["user_input"],
        topic="general",
        max_results=3,
    )

    lines = []
    for item in result.get("results", []):
        title = item.get("title", "")
        url = item.get("url", "")
        snippet = item.get("content", "")
        lines.append(f"Title: {title}\nURL: {url}\nSnippet: {snippet}")

    return {"search_results": "\n\n".join(lines)}

This node is responsible only for search. It does not try to answer the question yet.
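
You can sanity-check the node on its own before wiring the graph (TypedDicts are not enforced at runtime, so a partial dict is enough here):

print(tavily_search_node({"user_input": "What is LangGraph?"})["search_results"])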

OpenWeatherMap node

The OpenWeatherMap node handles live weather data. In this version, the city is resolved to coordinates first, and then current weather is fetched.

import os
import requests

OPENWEATHER_API_KEY = os.environ["OPENWEATHER_API_KEY"]

def get_coordinates(city: str):
    response = requests.get(
        "http://api.openweathermap.org/geo/1.0/direct",
        params={
            "q": city,
            "limit": 1,
            "appid": OPENWEATHER_API_KEY,
        },
        timeout=20,
    )
    response.raise_for_status()
    data = response.json()

    if not data:
        raise ValueError(f"No coordinates found for city: {city}")

    return data[0]["lat"], data[0]["lon"], data[0]["name"], data[0].get("country", "")

def openweather_node(state: AgentState):
    city = state["city"] or "Amsterdam,NL"
    lat, lon, name, country = get_coordinates(city)

    response = requests.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={
            "lat": lat,
            "lon": lon,
            "appid": OPENWEATHER_API_KEY,
            "units": "metric",
        },
        timeout=20,
    )
    response.raise_for_status()  # fail loudly on a bad key or request
    weather = response.json()

    summary = f'''
Location: {name}, {country}
Temperature: {weather["main"]["temp"]} °C
Feels like: {weather["main"]["feels_like"]} °C
Condition: {weather["weather"][0]["description"]}
Humidity: {weather["main"]["humidity"]}%
Wind speed: {weather["wind"]["speed"]} m/s
'''.strip()

    return {"weather_data": summary}

This node only retrieves and formats facts. It does not do interpretation.
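
As with the search node, you can try it in isolation:

print(openweather_node({"city": "Rotterdam,NL"})["weather_data"])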

Local LLM answer node

Once the tool results are available, the local LLM can turn them into a useful final answer.

from langchain_ollama import ChatOllama

llm = ChatOllama(
    model="gemma4:e4b",  # replace with your local gemma4 tag if available
    temperature=0,
)

def answer_node(state: AgentState):
    prompt = f'''
You are a helpful AI assistant.

Use the available tool results below to answer the user's request clearly.

User request:
{state["user_input"]}

Web search results:
{state["search_results"]}

Weather data:
{state["weather_data"]}
'''

    result = llm.invoke(prompt)
    return {"response": result.content}

This is the point where the local model adds value. It is not fetching data. It is turning tool outputs into a coherent response.

Build the LangGraph workflow

Here is how the graph can be wired together:

from langgraph.graph import StateGraph, START, END

graph = StateGraph(AgentState)

graph.add_node("router", router_node)
graph.add_node("tavily_search", tavily_search_node)
graph.add_node("openweather", openweather_node)
graph.add_node("answer", answer_node)

graph.add_edge(START, "router")

def route_after_router(state: AgentState):
    return state["intent"]

graph.add_conditional_edges(
    "router",
    route_after_router,
    {
        "search": "tavily_search",
        "weather": "openweather",
        "both": "tavily_search",
    },
)

graph.add_edge("tavily_search", "openweather")
graph.add_edge("openweather", "answer")
graph.add_edge("answer", END)

app = graph.compile()

This version keeps the orchestration explicit. You can easily expand it later with retries, memory, more tools, or better routing.
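
As one concrete example of such an extension, LangGraph's checkpointer adds conversation memory with a small change (a sketch; MemorySaver keeps state in process memory only, so it suits experimentation rather than production):

from langgraph.checkpoint.memory import MemorySaver

# Compile with a checkpointer so state persists across invocations.
app = graph.compile(checkpointer=MemorySaver())

# Calls that share a thread_id continue the same conversation state.
config = {"configurable": {"thread_id": "demo"}}
result = app.invoke(initial_state, config=config)  # initial_state as in the run example below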

Run the agent

result = app.invoke({
    "user_input": "Summarize today's AI news and tell me if I need a jacket in Amsterdam",
    "intent": "",
    "city": "Amsterdam,NL",
    "search_results": "",
    "weather_data": "",
    "response": "",
})

print(result["response"])

Example output

For a weather-related prompt, the agent produced this response:

The weather in Amsterdam today is expected to be a clear sky.

  • Temperature: 9.22 °C
  • Feels like: 4.85 °C
  • Wind speed: 11.62 m/s
  • Humidity: 71%

Should you carry a jacket?

Yes. The wind makes it feel significantly colder than the actual temperature, so a jacket is definitely recommended.

This is exactly the kind of output I was aiming for: not just raw tool data, but a useful, human-readable answer grounded in live information.

At that point, the agent can combine fresh web information, live weather data, and local reasoning in one workflow.

Why I like this pattern

For me, this is where local LLMs become much more interesting.

Not as isolated chatbots.
Not as black-box “agents.”
But as reasoning layers connected to real tools.

This pattern gives me:

  • local-first reasoning
  • explicit orchestration
  • practical utility
  • easier debugging
  • a strong path for extension

That makes it useful both as a learning project and as a foundation for more advanced AI applications.

Final thoughts

This project reminded me that useful AI agents do not need to be huge or mysterious.

A small graph, a couple of well-chosen tools, and a local LLM are enough to build something genuinely practical.

LangGraph gives the structure. Tavily and OpenWeatherMap provide live facts. The local model turns those facts into useful answers.

That feels like a strong foundation for local-first AI systems.
