AI agents are everywhere right now. We've all heard the pitch: They reason about problems, use tools autonomously, and chain together multiple steps to accomplish goals without constant hand-holding. They're being deployed to book flights, analyze data pipelines, and handle customer support—taking on tasks that previously required either rigid automation or human intervention.
This tutorial is a hands-on introduction to building your first AI agent with LangChain4j. We'll create a movie recommendation agent that:
- Understands natural language descriptions of plots ("a sci-fi movie about rebels fighting an empire").
- Searches semantically through a database using vector embeddings.
- Calls external APIs to fetch real-time streaming availability.
- Orchestrates these steps autonomously based on the query.
- Returns clean, conversational answers instead of raw JSON.
This agent uses MongoDB Atlas for vector search, LangChain4j's agentic framework for orchestration, and the Watchmode API for streaming data. By the end, you'll understand not just how to wire up these components, but how agentic systems actually work: how LLMs plan multi-step workflows, how tools share state, and how to give your agent instructions without hardcoding every possible scenario.
Vector search is what will allow us to query our data in natural language. If you'd like to learn more and earn a badge along the way, check out our Vector Search Fundamentals skills badge.
If you just want the code, it's available in this GitHub repo.
What is an AI agent?
An AI agent is a system that can take in information about its environment, decide what to do next, and work towards a goal, with minimum human intervention. Instead of behaving like a static chatbot that only answers questions, an agent can plan, use tools (like databases or APIs), and adapt based on results.
While the defining traits of an AI agent are the ability to reason and act, there’s no single paradigm for how an agent must function. One popular approach is the ReAct framework (Reasoning + Acting), where the agent:
- Thinks about the problem.
- Takes an action (e.g., querying a search engine).
- Observes the result.
- Repeats until it can deliver a complete answer.
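To make that loop concrete, here's a conceptual sketch in Java. This is not LangChain4j API; the Llm and Tool interfaces are hypothetical, here purely to illustrate the think/act/observe cycle:

// Conceptual sketch of a ReAct loop. Llm and Tool are hypothetical
// interfaces for illustration only; this is not LangChain4j API.
interface Llm { String generate(String prompt); }
interface Tool { String run(String input); }

class ReActLoop {
    static String solve(Llm llm, java.util.Map<String, Tool> tools, String goal) {
        StringBuilder history = new StringBuilder();
        for (int step = 0; step < 10; step++) { // cap iterations so we never spin forever
            // Think: ask the model for the next move, given everything observed so far
            String thought = llm.generate("Goal: " + goal + "\nHistory:" + history
                    + "\nReply 'FINAL: <answer>' or '<tool>: <input>'.");
            if (thought.startsWith("FINAL:")) {
                return thought.substring("FINAL:".length()).trim(); // goal reached
            }
            // Act: run the tool the model picked ("<tool>: <input>")
            int colon = thought.indexOf(':');
            String observation = tools.get(thought.substring(0, colon).trim())
                    .run(thought.substring(colon + 1).trim());
            // Observe: record the result and loop back to Think
            history.append('\n').append(thought).append(" -> ").append(observation);
        }
        return "No answer after 10 steps.";
    }
}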
In short:
- Chatbot: Answers questions based on what it already knows
- AI agent: Reasons about what it needs to learn, uses tools to gather that information, and works toward a goal
For our movie recommendation agent, we'll use a supervisor pattern, where a planning model orchestrates multiple specialized tools to find movies by plot description, look up streaming availability, and return a clean answer to the user.
What is LangChain4j?
LangChain4j is an open-source Java library designed to simplify building LLM-powered applications. It provides a unified interface for working with multiple LLM providers (OpenAI, Anthropic, etc.) and vector stores like MongoDB Atlas.
While the name invites comparison to the Python-based LangChain project, LangChain4j is more of a fusion, drawing inspiration from Haystack, LlamaIndex, and the wider AI community, all while staying laser-focused on the needs of Java developers.
Development kicked off in early 2023 during the ChatGPT boom, and while the project is still evolving, its core functionality is stable and production-ready. If you're exploring LLM-powered applications in Java, LangChain4j is a pragmatic and actively maintained option.
For this tutorial, we're specifically using the langchain4j-agentic module, which provides abstractions for building agentic systems, including the supervisor pattern that will orchestrate our movie search workflow.
Prerequisites
Before we get started, here's what you'll need:
- Java 21 or later installed and ready to go (I'm using Java 24). The code below uses text blocks and List.getFirst(), so older JDKs won't compile it.
- Maven for building the project (version 3.9.10 or later recommended)
- A MongoDB Atlas cluster—a free M0 tier is perfect for this tutorial. If you need help setting one up, check out the Get Started with Atlas guide.
- API keys:
  - Voyage AI for generating embeddings (sign up here). If you prefer, you can use the OpenAI API for both the embeddings and the chat model.
  - OpenAI for the planning model (get your key here)
  - Watchmode for streaming availability data (register here)
Make sure your IP address is whitelisted in your MongoDB Atlas cluster's network access settings, and create a database user with read/write permissions.
Our dataset
For this tutorial, we're using the IMDB Top 1000 Movies dataset from Kaggle. Download the CSV file and place it in your project's src/main/resources directory, naming it imdb_top_1000.csv.
This dataset includes movie titles, plot overviews, genres, directors, and IMDB ratings. We'll be embedding the overview field (the plot description) so users can search for movies semantically—e.g., "Find me a sci-fi movie about rebels fighting an empire"—instead of needing to remember the exact title.
Creating our app
Let's scaffold a new Maven project. Create a pom.xml file with the following dependencies:
<dependencies>
<!-- LangChain4j core -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j</artifactId>
<version>1.5.0</version>
</dependency>
<!-- LangChain4j agentic module -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-agentic</artifactId>
<version>1.5.0-beta11</version>
</dependency>
<!-- OpenAI integration for the planning model -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai</artifactId>
<version>1.4.0</version>
</dependency>
<!-- MongoDB Atlas integration -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-mongodb-atlas</artifactId>
<version>1.5.0-beta11</version>
</dependency>
<!-- Voyage AI for embeddings -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-voyage-ai</artifactId>
<version>1.5.0-beta11</version>
</dependency>
<!-- MongoDB Java Driver -->
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver-sync</artifactId>
<version>5.5.1</version>
</dependency>
<!-- OpenCSV for parsing the IMDB dataset -->
<dependency>
<groupId>com.opencsv</groupId>
<artifactId>opencsv</artifactId>
<version>5.8</version>
</dependency>
<!-- Jackson for parsing JSON responses from Watchmode -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.17.2</version>
</dependency>
</dependencies>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-bom</artifactId>
<version>1.5.0-beta11</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
These dependencies give us everything we need: the LangChain4j agentic framework, MongoDB and OpenAI integrations, embedding support via Voyage AI, and utilities for parsing CSV and JSON data.
Connecting to MongoDB
Now, let's create the main application class that will handle connecting to MongoDB, loading our movie data, and setting up our agent.
Create a file at MovieAgentApp.java:
package com.mongodb.movieagent;
import com.mongodb.client.*;
import dev.langchain4j.model.voyageai.VoyageAiEmbeddingModel;
import dev.langchain4j.store.embedding.mongodb.IndexMapping;
import dev.langchain4j.store.embedding.mongodb.MongoDbEmbeddingStore;
import org.bson.Document;
import java.util.HashSet;
public class MovieAgentApp {
public static final String databaseName = "movie_search";
public static final String collectionName = "movies";
public static final String indexName = "vector_index";
public static void main(String[] args) throws InterruptedException {
String embeddingApiKey = System.getenv("VOYAGE_AI_KEY");
String mongodbUri = System.getenv("MONGODB_URI");
String watchmodeKey = System.getenv("WATCHMODE_KEY");
String openAiKey = System.getenv("OPENAI_KEY");
MongoClient mongoClient = MongoClients.create(mongodbUri);
VoyageAiEmbeddingModel embeddingModel = VoyageAiEmbeddingModel.builder()
.apiKey(embeddingApiKey)
.modelName("voyage-3")
.build();
}
}
This sets up our connection to MongoDB using the connection string from our environment variables. We're also instantiating the Voyage AI embedding model, which we'll use to convert movie plot descriptions into vector embeddings.
The voyage-3 model generates 1024-dimensional embeddings, which we'll need to specify when creating our vector search index. To learn more about Voyage AI's models, check out their blog post about voyage-3.
We'll use the Voyage AI model to generate our embeddings and a separate OpenAI model for planning and orchestrating our AI agent. If you prefer, you can use OpenAI for both embedding and planning: Just swap your VoyageAiEmbeddingModel for the OpenAiEmbeddingModel. This doesn't go both ways, as Voyage AI is not supported as a chat model in LangChain4j.
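For reference, the swap is a small change. A sketch—the model name here is one reasonable choice, and note it produces 1536-dimensional vectors, so you'd need a fresh collection and index if you've already ingested 1024-dimensional Voyage embeddings:

// Sketch: OpenAI embeddings instead of Voyage AI. Add this import at the top:
// import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
OpenAiEmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
        .apiKey(openAiKey)                    // reuses the OPENAI_KEY env var
        .modelName("text-embedding-3-small")  // 1536 dimensions, vs. 1024 for voyage-3
        .build();

Because the index mapping below reads embeddingModel.dimension() rather than hardcoding 1024, the rest of the setup adapts automatically; you'd also want to widen the VoyageAiEmbeddingModel parameter in loadDataFromCSV() to the EmbeddingModel interface.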
Next, we'll configure our embedding store and automatically create a vector search index if one doesn't already exist:
IndexMapping indexMapping = IndexMapping.builder()
.dimension(embeddingModel.dimension())
.metadataFieldNames(new HashSet<>())
.build();
MongoDbEmbeddingStore embeddingStore = MongoDbEmbeddingStore.builder()
.databaseName(databaseName)
.collectionName(collectionName)
.createIndex(checkIndexExists(mongoClient))
.indexName(indexName)
.indexMapping(indexMapping)
.fromClient(mongoClient)
.build();
if(checkDataExists(mongoClient)) {
loadDataFromCSV(embeddingStore, embeddingModel);
}
Let's break down what's happening here:
The IndexMapping tells LangChain4j how to configure the vector search index. We're setting the dimension to match our embedding model (1024 for voyage-3) and leaving metadataFieldNames empty since we don't need to filter on metadata fields for this example.
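As an aside, if you did want to pre-filter results (say, by genre or year), you'd list those metadata fields here so they're included in the index definition as filter fields. A minimal sketch, assuming the metadata keys we attach during ingestion later:

// Sketch: declare metadata fields as filterable in the vector index.
// The names must match the metadata keys attached during ingestion.
// (Requires import java.util.Set.)
IndexMapping filterableMapping = IndexMapping.builder()
        .dimension(embeddingModel.dimension())
        .metadataFieldNames(Set.of("genre", "year"))
        .build();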
The MongoDbEmbeddingStore builder does several things:
- Points to our movie_search.movies collection.
- Checks whether an index already exists, using our helper method checkIndexExists().
- If no index exists, automatically creates one with the name vector_index.
- Uses our IndexMapping to define the index structure.
The resulting vector search index looks like this:
{
"fields": [
{
"type": "vector",
"path": "embedding",
"numDimensions": 1024,
"similarity": "cosine"
}
]
}
Finally, we check whether the collection already has data using checkDataExists(), and if it's empty, we load and embed the movie dataset from our CSV file.
Now, let's add those helper methods at the bottom of the class:
public static void loadDataFromCSV(
MongoDbEmbeddingStore embeddingStore,
VoyageAiEmbeddingModel embeddingModel
) throws InterruptedException {
System.out.println("Loading data...");
MovieEmbeddingService embeddingService = new MovieEmbeddingService(embeddingStore, embeddingModel);
embeddingService.ingestMoviesFromCsv();
System.out.println("Movie data loaded successfully!");
System.out.println("Waiting 5 seconds for indexing to complete...");
Thread.sleep(5000);
}
public static boolean checkDataExists(MongoClient mongoClient) {
MongoCollection<Document> collection = mongoClient
.getDatabase(databaseName)
.getCollection(collectionName);
// Returns true when the collection is EMPTY, i.e., when we still need to load data
return collection.find().first() == null;
}
public static boolean checkIndexExists(MongoClient mongoClient) {
MongoCollection<Document> collection = mongoClient
.getDatabase(databaseName)
.getCollection(collectionName);
// Vector search indexes are Atlas Search indexes, so we list them with
// listSearchIndexes() rather than listIndexes()
try(MongoCursor<Document> indexes = collection.listSearchIndexes().iterator()) {
while (indexes.hasNext()) {
Document index = indexes.next();
if (indexName.equals(index.getString("name"))) {
return false; // index already exists, so tell LangChain4j not to create it
}
}
}
return true; // no index found; LangChain4j should create one
}
}
The checkIndexExists() method iterates through the collection's Atlas Search indexes and returns false if it finds one named vector_index, telling LangChain4j to skip index creation. The checkDataExists() method, despite its name, returns true when the collection is empty, which is what triggers the CSV load.
The five-second sleep after loading data gives MongoDB Atlas time to build the vector search index. Atlas indexing is eventually consistent, so this wait ensures our index is queryable before we start searching. In production, you'd want more robust index readiness checking, but for a tutorial, a brief sleep does the job.
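One sturdier option is to poll the index state instead of sleeping. A sketch, assuming the driver's listSearchIndexes() output includes the queryable flag Atlas reports for search indexes:

// Sketch: poll Atlas until the vector index reports itself queryable,
// instead of sleeping a fixed five seconds. Assumes listSearchIndexes()
// documents include a boolean "queryable" field.
public static void waitForIndexReady(MongoClient mongoClient) throws InterruptedException {
    MongoCollection<Document> collection = mongoClient
            .getDatabase(databaseName)
            .getCollection(collectionName);
    for (int attempt = 0; attempt < 30; attempt++) {
        for (Document index : collection.listSearchIndexes()) {
            if (indexName.equals(index.getString("name"))
                    && Boolean.TRUE.equals(index.getBoolean("queryable"))) {
                return; // index is ready to serve queries
            }
        }
        Thread.sleep(1000); // not ready yet; check again in a second
    }
    throw new IllegalStateException("Vector index not queryable after 30s");
}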
Importing our data
Now, we need to actually load the IMDB dataset, parse it, and convert the plot descriptions into vector embeddings. We'll create two classes: Movie to represent each row in the CSV, and MovieEmbeddingService to handle the embedding and storage logic.
Creating the Movie model
Create Movie.java:
package com.mongodb.movieagent;
import com.opencsv.bean.CsvBindByPosition;
public class Movie {
@CsvBindByPosition(position = 0)
private String posterLink;
@CsvBindByPosition(position = 1)
private String title;
@CsvBindByPosition(position = 2)
private String year;
@CsvBindByPosition(position = 3)
private String certificate;
@CsvBindByPosition(position = 4)
private String runtime;
@CsvBindByPosition(position = 5)
private String genre;
@CsvBindByPosition(position = 6)
private String imdbRating;
@CsvBindByPosition(position = 7)
private String overview;
@CsvBindByPosition(position = 8)
private String metaScore;
@CsvBindByPosition(position = 9)
private String director;
@CsvBindByPosition(position = 10)
private String star1;
@CsvBindByPosition(position = 11)
private String star2;
@CsvBindByPosition(position = 12)
private String star3;
@CsvBindByPosition(position = 13)
private String star4;
@CsvBindByPosition(position = 14)
private String numberOfVotes;
@CsvBindByPosition(position = 15)
private String gross;
public Movie() {}
public String getPosterLink() { return posterLink; }
public void setPosterLink(String posterLink) { this.posterLink = posterLink; }
public String getTitle() { return title; }
public void setTitle(String title) { this.title = title; }
public String getYear() { return year; }
public void setYear(String year) { this.year = year; }
public String getCertificate() { return certificate; }
public void setCertificate(String certificate) { this.certificate = certificate; }
public String getRuntime() { return runtime; }
public void setRuntime(String runtime) { this.runtime = runtime; }
public String getGenre() { return genre; }
public void setGenre(String genre) { this.genre = genre; }
public String getImdbRating() { return imdbRating; }
public void setImdbRating(String imdbRating) { this.imdbRating = imdbRating; }
public String getOverview() { return overview; }
public void setOverview(String overview) { this.overview = overview; }
public String getMetaScore() { return metaScore; }
public void setMetaScore(String metaScore) { this.metaScore = metaScore; }
public String getDirector() { return director; }
public void setDirector(String director) { this.director = director; }
public String getStar1() { return star1; }
public void setStar1(String star1) { this.star1 = star1; }
public String getStar2() { return star2; }
public void setStar2(String star2) { this.star2 = star2; }
public String getStar3() { return star3; }
public void setStar3(String star3) { this.star3 = star3; }
public String getStar4() { return star4; }
public void setStar4(String star4) { this.star4 = star4; }
public String getNumberOfVotes() { return numberOfVotes; }
public void setNumberOfVotes(String numberOfVotes) { this.numberOfVotes = numberOfVotes; }
public String getGross() { return gross; }
public void setGross(String gross) { this.gross = gross; }
@Override
public String toString() {
// Note: the format string lists each field once (the original printed genre twice)
return String.format("%s (%s) - Rating: %s\nGenre: %s | Director: %s\n%s",
title, year, imdbRating, genre, director, overview);
}
}
The @CsvBindByPosition annotations from OpenCSV map each CSV column to a field in our Movie class. The most important field for our purposes is overview, which contains the plot description we'll embed for semantic search.
Creating embeddings for vector search
Now, let's create the service that reads the CSV, generates embeddings, and stores them in MongoDB. Create MovieEmbeddingService.java:
package com.mongodb.movieagent;
import com.opencsv.bean.CsvToBeanBuilder;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.document.Metadata;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingStore;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.*;
public class MovieEmbeddingService {
private final EmbeddingStore<TextSegment> embeddingStore;
private final EmbeddingModel embeddingModel;
public MovieEmbeddingService(EmbeddingStore<TextSegment> embeddingStore,
EmbeddingModel embeddingModel) {
this.embeddingStore = embeddingStore;
this.embeddingModel = embeddingModel;
}
public void ingestMoviesFromCsv() {
try (InputStream inputStream = getClass()
.getClassLoader()
.getResourceAsStream("imdb_top_1000.csv")) {
if (inputStream == null) {
throw new RuntimeException("imdb_top_1000.csv not found");
}
List<Movie> movies = new CsvToBeanBuilder<Movie>(new InputStreamReader(inputStream))
.withType(Movie.class)
.build()
.parse();
System.out.println("Processing " + movies.size() + " movies...");
for (Movie movie : movies) {
if (movie.getTitle() == null || movie.getOverview() == null) {
continue;
}
Metadata metadata = getMetadata(movie);
TextSegment segment = TextSegment.from(movie.getOverview(), metadata);
Embedding embedding = embeddingModel.embed(segment).content();
embeddingStore.add(embedding, segment);
System.out.println("Stored: " + movie.getTitle());
}
} catch (Exception e) {
System.err.println("Error processing CSV: " + e.getMessage());
e.printStackTrace();
}
}
private static Metadata getMetadata(Movie movie) {
Map<String, Object> metadataMap = new HashMap<>();
metadataMap.put("title", movie.getTitle());
metadataMap.put("year", movie.getYear());
metadataMap.put("genre", movie.getGenre());
metadataMap.put("director", movie.getDirector());
metadataMap.put("imdbRating", movie.getImdbRating());
metadataMap.put("star1", movie.getStar1());
metadataMap.put("star2", movie.getStar2());
return new Metadata(metadataMap);
}
}
Let's walk through what's happening here:
- CSV parsing: We use OpenCSV's CsvToBeanBuilder to read imdb_top_1000.csv from our resources directory and parse it into a list of Movie objects.
- Embedding generation: For each movie, we:
  - Extract the overview (plot description).
  - Create a TextSegment containing the overview text and metadata.
  - Pass the segment to embeddingModel.embed() (our Voyage AI model) to generate a 1024-dimensional vector.
  - Store both the embedding and the segment (with its metadata) in MongoDB.
- Metadata storage: We attach key movie information as metadata (title, year, genre, etc.). This gets stored alongside the embedding in MongoDB, so when we search, we get back not just vectors but all the movie details we need.
The beauty of this approach is that we're only embedding the overview field. When a user searches for "rebels fighting an empire," the semantic search matches that against movie plot descriptions, not titles or actor names. Users can describe what they remember about a movie without needing to know its exact name.
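One performance note before moving on: The loop above makes one embedding API call per movie, so the full dataset means a thousand sequential round trips. LangChain4j's EmbeddingModel also exposes embedAll(), and the store accepts addAll(), so batching is a straightforward optimization. A sketch (the batch size of 64 is an arbitrary choice; check your provider's rate limits):

// Sketch: batched variant of the ingestion loop. One embedding request per
// batch of 64 overviews instead of one request per movie.
List<TextSegment> segments = new ArrayList<>();
for (Movie movie : movies) {
    if (movie.getTitle() == null || movie.getOverview() == null) continue;
    segments.add(TextSegment.from(movie.getOverview(), getMetadata(movie)));
}
for (int i = 0; i < segments.size(); i += 64) {
    List<TextSegment> batch = segments.subList(i, Math.min(i + 64, segments.size()));
    List<Embedding> embeddings = embeddingModel.embedAll(batch).content();
    embeddingStore.addAll(embeddings, batch); // store the whole batch in one call
}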
If you want to learn more about how to use MongoDB with LangChain4j, check out our documentation.
Creating our agent
Now comes the fun part: building the agentic system. We'll create three specialized tools that our supervisor agent can use, then wire them together.
Understanding the architecture
Our agent follows the supervisor pattern, where a planning model (powered by GPT-4o-mini) decides which tools to call and in what order. Here's the flow:
- User query: "Find me a sci-fi movie about rebels fighting an empire and tell me where to stream it in GB"
- Supervisor analyzes the query and generates a plan
- Tool 1 (MongoDB search): Searches for movies matching the plot description
- Tool 2 (Watchmode search): Gets the Watchmode ID for that movie
- Tool 3 (Watchmode sources): Fetches streaming availability for that ID and region
- Supervisor synthesizes all the results into a clean, human-readable response
The supervisor doesn't hardcode this workflow—it figures out the steps based on the context we provide it. This is the key difference between a traditional workflow and an agentic system: The LLM is doing the orchestration.
Note: We use different models for embedding and for planning because different models have different strengths. We could use OpenAI for both embedding and chat/planning, but Voyage AI is not supported as a chat model.
Defining tools
In LangChain4j's agentic module, a tool is just a Java method annotated with @Agent. The annotation tells the supervisor what the tool does, and the framework handles the rest—automatically converting method parameters to tool inputs and outputs to shared state.
Let's create our three tools.
Tool 1: MongoDB Movie Search
Create MongoDbMovieSearchTool.java:
package com.mongodb.movieagent;
import dev.langchain4j.agentic.Agent;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.service.V;
import dev.langchain4j.store.embedding.EmbeddingSearchRequest;
import dev.langchain4j.store.embedding.EmbeddingSearchResult;
import dev.langchain4j.store.embedding.mongodb.MongoDbEmbeddingStore;
public class MongoDbMovieSearchTool {
private final MongoDbEmbeddingStore store;
private final EmbeddingModel embeddingModel;
public MongoDbMovieSearchTool(EmbeddingModel embeddingModel,
MongoDbEmbeddingStore store) {
this.store = store;
this.embeddingModel = embeddingModel;
}
@Agent(value = "Search movies in MongoDB by semantic description",
outputName = "movieTitle")
public String search(@V("query") String query) {
try {
Embedding queryEmbedding = embeddingModel.embed(query).content();
EmbeddingSearchResult<TextSegment> result = store.search(
EmbeddingSearchRequest.builder()
.queryEmbedding(queryEmbedding)
.maxResults(1)
.build()
);
if (!result.matches().isEmpty()) {
TextSegment doc = result.matches().getFirst().embedded();
System.out.println(doc.toString());
return doc.metadata().getString("title");
}
return "No matching movie found in MongoDB.";
} catch (Exception e) {
return "Error searching MongoDB: " + e.getMessage();
}
}
}
The @Agent annotation does two important things:
- value: Describes what this tool does. The supervisor uses this description to decide when to call it.
- outputName: Specifies that the result should be stored in a shared variable called "movieTitle", which other tools can access.
The @V("query") annotation on the parameter tells LangChain4j to look for a variable named "query" in the shared state (called the AgenticScope) and pass it to this method.
When the supervisor calls this tool with a plot description like "rebels fighting an empire," we:
- Convert the query to an embedding.
- Perform a vector search in MongoDB to find the most similar movie plot.
- Return the movie title, which gets stored as "movieTitle" in the shared state.
Tool 2: Watchmode Search
Create WatchmodeSearchTool.java:
package com.mongodb.movieagent;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import dev.langchain4j.agentic.Agent;
import dev.langchain4j.service.V;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
public class WatchmodeSearchTool {
private final HttpClient http = HttpClient.newHttpClient();
private final String apiKey;
public WatchmodeSearchTool(String apiKey) {
this.apiKey = apiKey;
}
@Agent(value = "Find Watchmode ID for a given movie title",
outputName = "watchmodeId")
public String getWatchmodeId(@V("title") String title) {
try {
String url = String.format(
"https://api.watchmode.com/v1/search/?apiKey=%s&search_field=name&search_value=%s&types=movie",
apiKey, URLEncoder.encode(title, StandardCharsets.UTF_8)
);
HttpRequest req = HttpRequest.newBuilder()
.uri(URI.create(url))
.GET()
.build();
HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
JsonNode root = new ObjectMapper().readTree(resp.body());
JsonNode results = root.get("title_results");
if (results != null && !results.isEmpty()) {
int id = results.get(0).get("id").asInt();
System.out.println("Watchmode ID: " + id);
return String.valueOf(id);
}
return "No Watchmode ID found for " + title;
} catch (Exception e) {
return "Error retrieving ID: " + e.getMessage();
}
}
}
This tool takes a movie title (like "Star Wars") and queries the Watchmode API to get its internal ID. That ID is then stored as "watchmodeId" in the shared state, ready for the next tool to use.
Watchmode's API documentation has more details on what's available, but the key thing to know is that the /search/ endpoint returns a list of potential matches, and we're grabbing the first one.
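If you want to be a little more careful than "first result wins," you could prefer an exact, case-insensitive name match among the candidates before falling back. A sketch, assuming each title_results entry carries a name field alongside id (worth verifying against Watchmode's docs):

// Sketch: prefer an exact title match among Watchmode's candidates,
// falling back to the first result. Assumes entries have a "name" field.
JsonNode best = results.get(0);
for (JsonNode candidate : results) {
    if (title.equalsIgnoreCase(candidate.path("name").asText())) {
        best = candidate;
        break;
    }
}
return String.valueOf(best.get("id").asInt());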
Tool 3: Watchmode Sources
Create WatchmodeSourcesTool.java:
package com.mongodb.movieagent;
import dev.langchain4j.agentic.Agent;
import dev.langchain4j.service.V;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class WatchmodeSourcesTool {
private final HttpClient http = HttpClient.newHttpClient();
private final String apiKey;
public WatchmodeSourcesTool(String apiKey) {
this.apiKey = apiKey;
}
@Agent(value = "Get streaming availability for a Watchmode ID",
outputName = "streamingSources")
public String getSources(@V("watchmodeId") String id,
@V("region") String region) {
try {
String url = String.format(
"https://api.watchmode.com/v1/title/%s/sources/?apiKey=%s®ions=%s",
id, apiKey, region
);
HttpRequest req = HttpRequest.newBuilder()
.uri(URI.create(url))
.GET()
.build();
HttpResponse<String> resp = http.send(req, HttpResponse.BodyHandlers.ofString());
return resp.body();
} catch (Exception e) {
return "Error retrieving sources: " + e.getMessage();
}
}
}
This tool takes the watchmodeId from the previous step and a region code (like "GB" for Great Britain) and fetches the streaming availability. It returns the raw JSON response, which the supervisor will interpret to give the user a clean answer.
Notice how this tool expects two inputs: watchmodeId (from the previous tool) and region (from the original user query). The supervisor figures out which variables to pass to each tool based on their parameter names.
Creating the Supervisor
Now, let's wire everything together. First, create a simple interface for the supervisor in MovieSupervisor.java:
package com.mongodb.movieagent;
import dev.langchain4j.agentic.Agent;
import dev.langchain4j.service.V;
public interface MovieSupervisor {
@Agent
String invoke(@V("request") String request);
}
This interface defines the entry point for our agent. Users will call invoke() with a natural language query, and the supervisor will orchestrate the tools to produce an answer.
Back in our MovieAgentApp.java, let's instantiate the planning model and build the supervisor (add this after the embedding store setup; you'll also need imports for OpenAiChatModel, AgenticServices, and SupervisorResponseStrategy from their respective LangChain4j packages):
ChatModel planningModel = OpenAiChatModel.builder()
.apiKey(openAiKey)
.modelName("gpt-4o-mini")
.build();
MongoDbMovieSearchTool mongoSearch =
new MongoDbMovieSearchTool(embeddingModel, embeddingStore);
WatchmodeSearchTool watchmodeSearch = new WatchmodeSearchTool(watchmodeKey);
WatchmodeSourcesTool watchmodeSources = new WatchmodeSourcesTool(watchmodeKey);
MovieSupervisor supervisor = AgenticServices
.supervisorBuilder(MovieSupervisor.class)
.subAgents(mongoSearch, watchmodeSearch, watchmodeSources)
.supervisorContext("""
You are a movie assistant.
1. If the user gives a plot/description, call MongoDbMovieSearchTool to find the movie.
2. Use WatchmodeSearchTool with the title to get a Watchmode ID.
3. Use WatchmodeSourcesTool with that ID and the user's region to get streaming info.
4. Return a clean human-readable response.
""")
.chatModel(planningModel)
.responseStrategy(SupervisorResponseStrategy.SUMMARY)
.build();
Let's break down what's happening in this supervisor configuration:
The planning model: We're using gpt-4o-mini as our planning model. This is the "brain" that decides which tools to call and when. It's cheaper and faster than GPT-4, but still capable enough to handle multi-step reasoning.
Sub-agents: We register our three tools with the supervisor. Each tool's @Agent annotation provides a description that the planning model uses to decide when to invoke it.
Supervisor context: This is where things get interesting. The context is essentially a set of instructions for the planning model, telling it how to orchestrate the tools. It's similar to a system prompt in a regular chat application, but here it's guiding planning rather than just responding.
Think of the supervisor context as a recipe: "If the user describes a plot, search MongoDB first. Then use that title to get a Watchmode ID. Finally, fetch streaming sources." The LLM follows these steps dynamically, adapting based on what it finds: If the user doesn't ask where a movie streams, the planner can skip the Watchmode steps entirely; if they already know the title, it can skip the MongoDB search.
Response strategy: We're using SupervisorResponseStrategy.SUMMARY, which tells the supervisor to return a synthesized summary of all the tool calls rather than just the last tool's raw output. This means instead of getting back raw JSON from Watchmode, we get a clean answer like "Star Wars is available on Disney+ in GB."
The alternative strategies are:
- LAST: Returns the output of the last tool called (useful when the final tool produces the answer).
- SCORED: Uses another LLM call to score both the summary and the last tool output, returning whichever is better.
For this use case, SUMMARY makes sense because we want the agent to interpret the streaming data and present it nicely to the user.
Testing our movie recommendations
Now, let's put it all together and test our agent. Add this to the end of your main() method in MovieAgentApp.java:
String query = "Find me a sci-fi movie about rebels fighting an empire in space and tell me where to stream it in GB.";
String result = supervisor.invoke(query);
System.out.println("Agent Response:\n" + result);
}
}
Run your application with:
MONGODB_URI=$MONGODB_URI \
VOYAGE_AI_KEY=$VOYAGE_AI_KEY \
WATCHMODE_KEY=$WATCHMODE_KEY \
OPENAI_KEY=$OPENAI_API_KEY \
mvn compile exec:java -Dexec.mainClass=com.mongodb.movieagent.MovieAgentApp
On first run, you'll see output like:
Loading data...
Processing 1000 movies...
Stored: The Shawshank Redemption
Stored: The Godfather
...
Movie data loaded successfully!
Waiting 5 seconds for indexing to complete...
Then the agent will process your query. You'll see trace output showing which tools are being called:
Watchmode ID: 12345
Agent Response:
Star Wars: Episode IV - A New Hope is available to stream on Disney+ in Great Britain.
You can also rent it on Amazon Prime Video for £3.49.
What just happened?
- The supervisor analyzed your query and identified that you were describing a plot.
- It called MongoDbMovieSearchTool, embedding "rebels fighting an empire in space" and searching MongoDB.
- MongoDB returned "Star Wars: Episode IV - A New Hope" as the best match.
- The supervisor then called WatchmodeSearchTool with that title to get the Watchmode ID.
- Finally, it called WatchmodeSourcesTool with the ID and region "GB."
- The planning model synthesized all three tool outputs into a clean, conversational response.
Try other queries to see how the agent handles different scenarios:
// Search by genre and mood
String query = "I want a dark crime thriller about an undercover cop, where can I watch it in the US?";
// Search by vague description
String query = "That movie with the kid who sees dead people, streaming options in US?";
// Mix of specifics
String query = "Christopher Nolan movie about dreams within dreams, available in GB?";
The beauty of this approach is that you never hardcoded the workflow. The supervisor figures out the steps based on your context instructions. If you wanted to add a fourth tool (say, fetching movie reviews), you'd just add it to the subAgents() list and update the supervisor context—no need to rewrite orchestration logic. A sketch of such a tool follows below.
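To make that concrete, here's what a hypothetical fourth tool could look like. The shape is all that matters: one @Agent-annotated method, with the actual reviews source left as an exercise:

// Hypothetical fourth tool, shown only to illustrate extensibility.
// Wire in a real reviews source where indicated.
public class MovieReviewsTool {

    @Agent(value = "Fetch review highlights for a given movie title",
           outputName = "reviewHighlights")
    public String getReviews(@V("movieTitle") String movieTitle) {
        // Call your reviews API of choice here.
        return "Reviews for " + movieTitle + " are not wired up yet.";
    }
}

Register it with .subAgents(mongoSearch, watchmodeSearch, watchmodeSources, new MovieReviewsTool()), mention it in the supervisor context, and the planner can start using it.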
Understanding the AgenticScope
Behind the scenes, LangChain4j manages something called the AgenticScope
, which is essentially a shared key-value store for the conversation. When MongoDbMovieSearchTool
returns a title, it's stored as "movieTitle"
in this scope. When WatchmodeSearchTool
needs a title, it reads from "movieTitle"
. The supervisor handles all this variable passing automatically based on the @V
annotations you defined on each tool's parameters.
If you wanted to inspect or debug what's in the scope at any point, you could implement the AgenticScopeAccess interface on your supervisor, but for this tutorial, the automatic variable passing is all you need.
Conclusion
We've built a genuine AI agent that can understand natural language movie queries, search semantically through plot descriptions, and fetch real-time streaming availability—all without hardcoding a single workflow step.
By combining MongoDB Atlas for vector search with LangChain4j's agentic framework, we created a system that:
- Understands vague descriptions like "rebels fighting an empire" and maps them to actual movies.
- Autonomously decides which tools to call and in what order.
- Handles multi-step reasoning (search → get ID → fetch sources).
- Returns clean, conversational responses instead of raw API data.
The supervisor pattern we used here is just one of many agentic architectures. You could extend this with additional tools (fetching reviews, checking rental prices, filtering by age rating), add memory so the agent remembers past conversations, or even create tool-specific agents that the supervisor coordinates.
If you're curious about building more complex agentic systems, check out the LangChain4j agentic documentation and explore patterns like loops, conditionals, and human-in-the-loop workflows.
For more MongoDB and AI tutorials, see my other articles on Building RAG Applications with Spring AI and MongoDB or dive into the MongoDB Atlas Vector Search documentation.