djibril mugisho

Getting Started with AI Agent Development with LangChain & LangGraph: Build an Autonomous Starbucks Agent

Back in 2023, when I started using ChatGPT, it was just another chatbot I could ask fancy questions or have point out what was wrong in my code snippets.

Everything was normal; the application had no memory of previous states or of what you said yesterday. Then in 2024, everything started to change. We went from a stateless chatbot to an AI agent that can call tools, search the internet, and generate download links. At this point, I started to get curious. How can an LLM search the internet? An infinite number of questions were flowing through my head. Can it create its own tools and programs, or execute its own code? It was clear that we were heading toward the Skynet revolution.

I was just ignorant 😅, and that's when I started my research and discovered LangChain, a tool that promises all those miracles without a billion-dollar budget.

So, what is an LLM agent?

By definition, an LLM agent is a software program capable of perceiving its environment, making decisions, and taking autonomous actions to achieve specific goals, often by interacting with tools and systems. Many rules and conventions were created to achieve this, and one of the most famous and widely used is the ReAct (Reason & Act) framework.

With this framework, the LLM receives a prompt → thinks → decides on the next action (this can be calling a specific tool) → receives the tool's data. Once the tool's response comes back, the model observes it, generates a response, and plans its next actions based on what the tool returned.

You can read more about this concept in the official white paper.

Diagram showing the ReAct (Reasoning & Act) framework

Please note that the workflow is not limited to a single tool invocation; it can proceed through several rounds before returning to the user.
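
To make the loop concrete, here is a minimal sketch of the ReAct control flow in TypeScript. The llm and tools signatures are hypothetical stand-ins, not a real library API; they just show the reason → act → observe cycle.

type ToolCall = { name: string; args: Record<string, unknown> };
type LlmReply = { text: string; toolCalls: ToolCall[] };

// Hypothetical signatures: `llm` is any chat-model call, `tools` maps tool
// names to their implementations.
async function reactLoop(
  prompt: string,
  llm: (input: string) => Promise<LlmReply>,
  tools: Record<string, (args: Record<string, unknown>) => Promise<string>>,
): Promise<string> {
  let input = prompt;
  for (let round = 0; round < 10; round++) {
    const reply = await llm(input); // Reason: the model thinks and picks an action.
    if (reply.toolCalls.length === 0) return reply.text; // Nothing left to call: final answer.
    // Act: run each requested tool, then feed the observations back to the model.
    const observations = await Promise.all(
      reply.toolCalls.map((call) => tools[call.name](call.args)),
    );
    input = `${reply.text}\nObservations:\n${observations.join('\n')}`;
  }
  throw new Error('Too many reasoning rounds');
}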

However, for an LLM agent to be truly human-like and act with knowledge of the past, it requires memory, enabling it to recall previous prompts and responses and thereby stay consistent within a given thread. There is no single source of truth for how to approach this. Most agents implement short-term memory: the agent appends each new exchange to the conversation history, and when a new prompt is submitted, it sends the previous messages along with the new prompt. This method is very efficient and gives the LLM strong knowledge of previous states. But it can also introduce problems: the longer the conversation grows, the more previous messages the LLM has to read through in order to understand what action to take next.

AI agent short term memory architecture
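
Here is a minimal sketch of that short-term memory idea, again with a hypothetical llm function that receives the whole message list on every turn:

type Message = { role: 'user' | 'assistant'; content: string };
const history: Message[] = [];

async function chat(
  userPrompt: string,
  llm: (messages: Message[]) => Promise<string>, // hypothetical model call
): Promise<string> {
  history.push({ role: 'user', content: userPrompt });
  const answer = await llm(history); // The model sees every previous turn.
  history.push({ role: 'assistant', content: answer });
  return answer;
}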

You don't have to implement this from scratch 😅; many tools and frameworks have been developed to make the implementation as easy as possible. Though nothing stops you from building everything yourself, that is not what we shall do in this article 😁.

In this article, we will build a Starbucks barista that collects order information and calls the create_order tool once the order meets the full criteria, a tool that we shall create and expose to the AI.

Let's start by initializing our project. In this article, we shall use Nest.js due to its efficiency and native TypeScript support. Please note that nothing here is tied to Nest.js; this is just a framework preference, and everything I'm doing here can be done with plain Node.js or Express.js.

Start by initializing your Nest.js project and installing all required dependencies

$ npm i -g @nestjs/cli   # If you don't have the Nest.js CLI installed on your machine
$ nest new project-name

"dependencies" : {
  "@langchain/community": "^0.3.53",
    "@langchain/core": "^0.3.75",
    "@langchain/google-genai": "^0.2.16",
    "@langchain/langgraph": "^0.4.8",
    "@langchain/langgraph-checkpoint-mongodb": "^0.1.1",
    "@langchain/mongodb": "^0.1.0",
    "@nestjs/mongoose": "^11.0.3",
    "langchain": "^0.3.33",
    "mongodb": "^6.19.0",
    "mongoose": "^8.18.1",
    "zod": "^4.1.8"
}

//The versions may not be same at the time you are reading this, i recommand checking
//The official documentation for each package.

Now that we have our project created and all the packages installed, let's see what we need in order to turn our vision into a project. Think about what you would need to create a Starbucks barista.

  • First, we need to define the structure of our data (creating schemas)
  • Create a menu list that our agent will be referring to
  • Add LLM interaction
  • And last but not least, the ability to save previous conversations for conversational context

Folder structure

You can modify this folder structure and adapt it based on your framework of choice. But the core implementation is the same across all frameworks.

Nest.js Folder structure

lib/util/schemas/drinks.ts

This file contains all our schema definitions regarding drinks and all modifications they can receive.

// Imports the 'z' object from the 'zod' library.
// Zod is a TypeScript-first schema declaration and validation library.
// 'z' is the primary object used to define schemas (e.g., z.object, z.string, z.boolean, z.array).
import z from 'zod';

// Imports the 'StructuredOutputParser' from 'langchain/output_parsers'.
// THIS IS CRUCIAL FOR AI INTEGRATION:
// Large Language Models (LLMs) DO NOT inherently understand TypeScript types or Zod schemas.
// They operate on text. The StructuredOutputParser bridges this gap by providing
// a mechanism to instruct the LLM on the *textual format* it should produce,
// and then to parse that textual output back into a type-safe TypeScript object
// defined by our Zod schemas.
import { StructuredOutputParser } from 'langchain/output_parsers';

/**
 * @description Zod schema defining the structure of a single drink item.
 * This schema specifies the required properties and their types for a beverage.
 */
export const DrinkSchema = z.object({
  /**
   * @description The name of the drink (e.g., "Espresso", "Latte").
   * It's a required string.
   */
  name: z.string(),
  /**
   * @description A brief description of the drink, explaining its characteristics.
   * It's a required string.
   */
  description: z.string(),
  /**
   * @description Indicates whether this drink supports different milk options (e.g., whole, almond, oat).
   * It's a required boolean.
   */
  supportMilk: z.boolean(),
  /**
   * @description Indicates whether this drink supports various sweetener options.
   * It's a required boolean.
   */
  supportSweeteners: z.boolean(),
  /**
   * @description Indicates whether this drink supports adding different syrup flavors.
   * It's a required boolean.
   */
  supportSyrup: z.boolean(),
  /**
   * @description Indicates whether this drink supports various toppings (e.g., whipped cream, chocolate shavings).
   * It's a required boolean.
   */
  supportTopping: z.boolean(),
  /**
   * @description Indicates whether this drink is available in different sizes (e.g., small, medium, large).
   * It's a required boolean.
   */
  supportSize: z.boolean(),
  /**
   * @description Optional URL to an image representing the drink.
   * 'z.string().url()' validates that if present, it must be a valid URL string.
   * '.optional()' means this property is not mandatory.
   */
  image: z.string().url().optional(),
});

/**
 * @description Zod schema defining the structure of a single sweetener option.
 * Examples: Sugar, Splenda, Stevia.
 */
export const SweetenerSchema = z.object({
  /**
   * @description The name of the sweetener. Required string.
   */
  name: z.string(),
  /**
   * @description A brief description of the sweetener. Required string.
   */
  description: z.string(),
  /**
   * @description Optional URL to an image representing the sweetener.
   */
  image: z.string().url().optional(),
});

/**
 * @description Zod schema defining the structure of a single syrup option.
 * Examples: Vanilla, Caramel, Hazelnut.
 */
export const SyrupSchema = z.object({
  /**
   * @description The name of the syrup. Required string.
   */
  name: z.string(),
  /**
   * @description A brief description of the syrup. Required string.
   */
  description: z.string(),
  /**
   * @description Optional URL to an image representing the syrup.
   */
  image: z.string().url().optional(),
});

/**
 * @description Zod schema defining the structure of a single topping option.
 * Examples: Whipped Cream, Chocolate Shavings, Cinnamon.
 */
export const ToppingSchema = z.object({
  /**
   * @description The name of the topping. Required string.
   */
  name: z.string(),
  /**
   * @description A brief description of the topping. Required string.
   */
  description: z.string(),
  /**
   * @description Optional URL to an image representing the topping.
   */
  image: z.string().url().optional(),
});

/**
 * @description Zod schema defining the structure of a single size option for drinks.
 * Examples: Small, Medium, Large.
 */
export const SizeSchema = z.object({
  /**
   * @description The name of the size. Required string.
   */
  name: z.string(),
  /**
   * @description A brief description of the size. Required string.
   */
  description: z.string(),
  /**
   * @description Optional URL to an image representing the size.
   */
  image: z.string().url().optional(),
});

/**
 * @description Zod schema defining the structure of a single milk option.
 * Examples: Whole Milk, Skim Milk, Almond Milk, Oat Milk.
 */
export const MilkSchema = z.object({
  /**
   * @description The name of the milk type. Required string.
   */
  name: z.string(),
  /**
   * @description A brief description of the milk type. Required string.
   */
  description: z.string(),
  /**
   * @description Optional URL to an image representing the milk.
   */
  image: z.string().url().optional(),
});

// --- Collections ---
// These schemas define arrays of the individual item schemas.

/**
 * @description Zod schema for an array (list) of ToppingSchema objects.
 * This would represent all available topping options.
 */
export const ToppingsSchema = z.array(ToppingSchema);
/**
 * @description Zod schema for an array (list) of SizeSchema objects.
 * This would represent all available drink sizes.
 */
export const SizesSchema = z.array(SizeSchema);
/**
 * @description Zod schema for an array (list) of MilkSchema objects.
 * This would represent all available milk options.
 */
export const MilksSchema = z.array(MilkSchema);
/**
 * @description Zod schema for an array (list) of SyrupSchema objects.
 * This would represent all available syrup flavors.
 */
export const SyrupsSchema = z.array(SyrupSchema);
/**
 * @description Zod schema for an array (list) of SweetenerSchema objects.
 * This would represent all available sweetener options.
 */
export const SweetenersSchema = z.array(SweetenerSchema);
/**
 * @description Zod schema for an array (list) of DrinkSchema objects.
 * This would represent the entire menu of drinks.
 */
export const DrinksSchema = z.array(DrinkSchema);

// --- Types (inferred from schemas) ---
// These lines use Zod's `z.infer` utility to automatically create TypeScript types
// based on the defined Zod schemas. This ensures type safety throughout the application
// and keeps the types in sync with the validation schemas.

/**
 * @description TypeScript type inferred from DrinkSchema.
 * Represents a single drink object.
 */
export type Drink = z.infer<typeof DrinkSchema>;
/**
 * @description TypeScript type inferred from SweetenerSchema.
 * Represents a single sweetener option.
 */
export type SupportSweetener = z.infer<typeof SweetenerSchema>;
/**
 * @description TypeScript type inferred from SyrupSchema.
 * Represents a single syrup option.
 */
export type Syrup = z.infer<typeof SyrupSchema>;
/**
 * @description TypeScript type inferred from ToppingSchema.
 * Represents a single topping option.
 */
export type Topping = z.infer<typeof ToppingSchema>;
/**
 * @description TypeScript type inferred from SizeSchema.
 * Represents a single size option.
 */
export type Size = z.infer<typeof SizeSchema>;
/**
 * @description TypeScript type inferred from MilkSchema.
 * Represents a single milk option.
 */
export type Milk = z.infer<typeof MilkSchema>;

/**
 * @description TypeScript type for an array of Topping objects.
 */
export type Toppings = z.infer<typeof ToppingsSchema>;
/**
 * @description TypeScript type for an array of Size objects.
 */
export type Sizes = z.infer<typeof SizesSchema>;
/**
 * @description TypeScript type for an array of Milk objects.
 */
export type Milks = z.infer<typeof MilksSchema>;
/**
 * @description TypeScript type for an array of Syrup objects.
 */
export type Syrups = z.infer<typeof SyrupsSchema>;
/**
 * @description TypeScript type for an array of Sweetener objects.
 */
export type Sweeteners = z.infer<typeof SweetenersSchema>;
/**
 * @description TypeScript type for an array of Drink objects.
 */
export type Drinks = z.infer<typeof DrinksSchema>;

// --- Structured Output Parsers ---
// THESE ARE ESSENTIAL FOR AI WORKFLOWS.
// An AI model (like an LLM) cannot directly understand TypeScript types or Zod schemas.
// It generates and understands plain text.
//
// The StructuredOutputParser serves two primary functions in this context:
// 1.  It generates a *textual instruction* (often in the form of a JSON schema string)
//     that can be included in the prompt given to the AI model. This instructs the AI
//     on the precise JSON format it should use for its output.
//     For example, it might tell the AI: "Your response MUST be a JSON object with 'name' (string)
//     and 'description' (string) properties, and a boolean 'supportMilk' property."
// 2.  It then takes the AI model's *raw text output* (which is hopefully a JSON string
//     following the instructions) and attempts to parse and validate it against
//     the underlying Zod schema. If successful, it transforms the text into a
//     fully typed and validated TypeScript object. If the AI's output doesn't match
//     the expected format, the parser will throw an error, preventing malformed data
//     from entering the application.
//
// In essence, these parsers are the bridge that allows us to leverage the type safety
// and validation of Zod for data generated by text-based AI models.

/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into a single, validated Drink object conforming to DrinkSchema.
 */
export const DrinkParser = StructuredOutputParser.fromZodSchema(
  DrinkSchema as any, // 'as any' is a TypeScript cast, sometimes used with libraries for flexibility.
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into a single, validated Topping object conforming to ToppingSchema.
 */
export const ToppingParser = StructuredOutputParser.fromZodSchema(
  ToppingSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into a single, validated Size object conforming to SizeSchema.
 */
export const SizeParser = StructuredOutputParser.fromZodSchema(
  SizeSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into a single, validated Milk object conforming to MilkSchema.
 */
export const MilkParser = StructuredOutputParser.fromZodSchema(
  MilkSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into a single, validated Syrup object conforming to SyrupSchema.
 */
export const SyrupParser = StructuredOutputParser.fromZodSchema(
  SyrupSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into a single, validated Sweetener object conforming to SweetenerSchema.
 */
export const SweetenerParser = StructuredOutputParser.fromZodSchema(
  SweetenerSchema as any,
);

// Parsers for arrays
// These are used when the AI model is expected to output a list of items,
// which the parser will then transform into a TypeScript array of validated objects.

/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into an array of validated Drink objects conforming to DrinksSchema.
 */
export const DrinksParser = StructuredOutputParser.fromZodSchema(
  DrinksSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into an array of validated Topping objects conforming to ToppingsSchema.
 */
export const ToppingsParser = StructuredOutputParser.fromZodSchema(
  ToppingsSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into an array of validated Size objects conforming to SizesSchema.
 */
export const SizesParser = StructuredOutputParser.fromZodSchema(
  SizesSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into an array of validated Milk objects conforming to MilksSchema.
 */
export const MilksParser = StructuredOutputParser.fromZodSchema(
  MilksSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into an array of validated Syrup objects conforming to SyrupsSchema.
 */
export const SyrupsParser = StructuredOutputParser.fromZodSchema(
  SyrupsSchema as any,
);
/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into an array of validated Sweetener objects conforming to SweetenersSchema.
 */
export const SweetenersParser = StructuredOutputParser.fromZodSchema(
  SweetenersSchema as any,
);


If you went through the file, you may have spotted something like StructuredOutputParser.fromZodSchema(). This utility function is very important: since LLMs don't understand TypeScript interfaces or Zod schemas, it allows us to transform our data schemas into human- and LLM-readable strings, which an LLM can easily understand. In brief, it takes schemas and converts them into their string representation so that they can be part of an AI prompt. Like, "hey, generate a list of objects that follow this specific interface".
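
Here is a quick sketch of both sides of that bridge, using the DrinkParser defined above (the sample drink values are made up):

import { DrinkParser } from './lib/util/schemas/drinks';

// 1. Turn the Zod schema into prompt text the LLM can follow.
const instructions = DrinkParser.getFormatInstructions();
// `instructions` is a plain string describing the expected JSON shape,
// ready to be embedded in a prompt.

// 2. Parse the model's raw text output back into a validated object.
const parsed = await DrinkParser.parse(
  JSON.stringify({
    name: 'Latte',
    description: 'Espresso with steamed milk',
    supportMilk: true,
    supportSweeteners: true,
    supportSyrup: true,
    supportTopping: true,
    supportSize: true,
  }),
);
console.log(parsed.name); // "Latte", validated and ready to use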

lib/util/schemas/order.ts

import z from 'zod';

// Imports the 'StructuredOutputParser' from 'langchain/output_parsers'.
// This is critical for connecting our strict TypeScript data definitions with
// the text-based outputs of AI models.
// Remember: AI models operate on text, not directly on TypeScript types or Zod schemas.
import { StructuredOutputParser } from 'langchain/output_parsers';

/**
 * @description Zod schema defining the structure for a single customer order.
 * This schema dictates the precise format and types expected for an order object.
 */
export const OrderSchema = z.object({
  /**
   * @description The name of the drink ordered (e.g., "Latte", "Cappuccino").
   * This is a required string.
   */
  drink: z.string(),
  /**
   * @description The size of the drink ordered (e.g., "Small", "Medium", "Large").
   * This is a required string.
   */
  size: z.string(),
  /**
   * @description The type of milk specified for the drink (e.g., "Whole Milk", "Almond Milk").
   * This is a required string.
   */
  milk: z.string(),
  /**
   * @description The type of syrup added to the drink (e.g., "Vanilla", "Caramel").
   * This is a required string.
   */
  syrup: z.string(),
  /**
   * @description The sweetener(s) chosen for the drink (e.g., "Sugar", "Splenda", "None").
   * This is a required string.
   */
  sweeteners: z.string(),
  /**
   * @description The topping(s) selected for the drink (e.g., "Whipped Cream", "Cinnamon", "None").
   * This is a required string.
   */
  toppings: z.string(),
  /**
   * @description The quantity of this specific drink order.
   * It's a required number, with a minimum value of 1 and a maximum of 10.
   */
  quantity: z.number().min(1).max(10),
});

/**
 * @description TypeScript type inferred from OrderSchema.
 * This type provides strong type-checking for order objects throughout the application.
 */
export type OrderType = z.infer<typeof OrderSchema>;

/**
 * @description StructuredOutputParser for parsing unstructured text from an AI model
 * into a single, validated Order object.
 *
 * **CRITICAL FOR AI INTEGRATION:**
 *
 * The `StructuredOutputParser` here is vital because an AI model, such as a Large Language Model (LLM),
 * cannot directly understand or generate TypeScript types or Zod schemas. LLMs process and output plain text.
 *
 * This parser enables two key functionalities for working with AI:
 *
 * 1.  **Prompt Instruction Generation:** The `OrderParser` can generate a textual prompt
 * (often a JSON schema string) that you can send to the AI model. This prompt tells the AI
 * *exactly* what format its output should take—for instance, "Your response must be a JSON object
 * with keys 'drink', 'size', 'milk', 'syrup', 'sweeteners', 'toppings' (all strings),
 * and 'quantity' (a number between 1 and 10)." This helps guide the AI to produce structured data.
 *
 * 2.  **Output Validation and Transformation:** After the AI model generates its response (which,
 * if prompted correctly, should be a JSON string), the `OrderParser` takes this raw text.
 * It then attempts to parse this text into a JavaScript object and validates that
 * object against the `OrderSchema`. If the parsing and validation are successful,
 * it transforms the raw text into a type-safe `Order` object that can be used
 * directly in your TypeScript application. If the AI's output is malformed or
 * doesn't meet the schema's requirements (e.g., quantity is 0 or 11), the parser
 * will throw an error, preventing invalid data from corrupting your application.
 *
 * In essence, `OrderParser` acts as a crucial bridge, translating between the AI's
 * text-based world and our application's type-safe, structured data requirements.
 */
export const OrderParser = StructuredOutputParser.fromZodSchema(
  OrderSchema as any, // 'as any' is a TypeScript type assertion, commonly used with external libraries for flexibility.
);


This file describes how an order should look. As I mentioned in the previous explanation,

export const OrderParser = StructuredOutputParser.fromZodSchema

will generate a text version of how an order should look. It's from those instructions that the LLM knows how to format an order.
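
For illustration, here is a hypothetical order object that satisfies this schema, the kind of value the parser should recover from the model's JSON output:

import { OrderType } from './lib/util/schemas/order';

// A made-up order that passes OrderSchema validation.
const sampleOrder: OrderType = {
  drink: 'Latte',
  size: 'Medium',
  milk: 'Oat Milk',
  syrup: 'Vanilla',
  sweeteners: 'None',
  toppings: 'Whipped Cream',
  quantity: 1,
};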

Data summarization

Data summarization, in the context of LLM agents, consists of breaking complex data structures into digestible strings that can be easily understood by LLMs; even the most powerful LLMs in the world are text-in, text-out machines. The more understandable the prompt, the more accurate the LLM will be.

The data summarization for our project can be found in the file

lib/util/summaries/index.ts

// Imports the 'Drink' type from the schema definition file.
// This ensures type safety when working with drink objects.
import { Drink } from '../schemas/drinks';

// Imports various constant arrays containing menu data (sweeteners, milks, syrups, sizes, toppings).
// These constants provide the detailed information used to construct the summaries.
import {
  SWEETENERS,
  MILKS,
  SYRUPS,
  SIZES,
  TOPPINGS,
} from '../utils/constants/menu_data';

/**
 * @description Generates a concise textual summary for a given drink item.
 * This function takes a `Drink` object and constructs a descriptive string
 * detailing its name, description, and supported customization options (milk, sweeteners, syrup, topping, size).
 *
 * This summary is particularly useful for AI models or user-facing descriptions
 * where a natural language explanation of a drink's features is required,
 * converting structured boolean flags into readable sentences.
 *
 * @param drink The Drink object for which to create the summary.
 * @returns A string containing the comprehensive summary of the drink's features.
 */
export const createDrinkItemSummary = (drink: Drink): string => {
  // Constructs the base name part of the summary.
  const drinkName = 'A drink named ' + drink.name;
  // Constructs the description part of the summary.
  const drinkDescription = 'It is described as ' + drink.description;

  // Conditionally adds text based on the 'supportMilk' boolean flag.
  const milkSupport = drink.supportMilk
    ? 'It can be made with milk.'
    : 'It cannot be made with milk.';
  // Conditionally adds text based on the 'supportSweeteners' boolean flag.
  const sweetenerSupport = drink.supportSweeteners
    ? 'It can be made with sweeteners.'
    : 'It cannot be made with sweeteners.';
  // Conditionally adds text based on the 'supportSyrup' boolean flag.
  const syrupSupport = drink.supportSyrup
    ? 'It can be made with syrup.'
    : 'It cannot be made with syrup.';
  // Conditionally adds text based on the 'supportTopping' boolean flag.
  const toppingSupport = drink.supportTopping
    ? 'It can be made with topping.'
    : 'It cannot be made with topping.';
  // Conditionally adds text based on the 'supportSize' boolean flag.
  const sizeSupport = drink.supportSize
    ? 'It can be made in different sizes.'
    : 'It cannot be made in different sizes.';

  // Concatenates all parts into a single summary string.
  return `${drinkName} ${drinkDescription} ${milkSupport} ${sweetenerSupport} ${syrupSupport} ${toppingSupport} ${sizeSupport}`;
};

/**
 * @description Generates a textual summary of all available sweetener options.
 * It iterates through the `SWEETENERS` array and lists each sweetener's
 * name and description in a human-readable, bullet-point format.
 *
 * This function is useful for providing an AI model or user with a clear,
 * comprehensive overview of customization choices.
 *
 * @returns A string containing the summary of available sweeteners.
 */
export const createSweetenersSummary = (): string => {
  return `Available sweeteners are:
${SWEETENERS.map((sweetener) => `- ${sweetener.name}: ${sweetener.description}`).join('\n')}
   `;
};

/**
 * @description Generates a textual summary of all available milk options.
 * It iterates through the `MILKS` array and lists each milk type's
 * name and description in a human-readable, bullet-point format.
 *
 * This function is useful for providing an AI model or user with a clear,
 * comprehensive overview of customization choices.
 *
 * @returns A string containing the summary of available milk types.
 */
export const createAvailableMilksSummary = (): string => {
  return `Available milks are:
${MILKS.map((milkOption) => `- ${milkOption.name}: ${milkOption.description}`).join('\n')}
   `;
};

/**
 * @description Generates a textual summary of all available syrup options.
 * It iterates through the `SYRUPS` array and lists each syrup's
 * name and description in a human-readable, bullet-point format.
 *
 * This function is useful for providing an AI model or user with a clear,
 * comprehensive overview of customization choices.
 *
 * @returns A string containing the summary of available syrups.
 */
export const createSyrupsSummary = (): string => {
  return `Available syrups are:
${SYRUPS.map((syrupOption) => `- ${syrupOption.name}: ${syrupOption.description}`).join('\n')}
   `;
};

/**
 * @description Generates a textual summary of all available drink sizes.
 * It iterates through the `SIZES` array and lists each size's
 * name and description in a human-readable, bullet-point format.
 *
 * This function is useful for providing an AI model or user with a clear,
 * comprehensive overview of customization choices.
 *
 * @returns A string containing the summary of available drink sizes.
 */
export const createSizesSummary = (): string => {
  return `Available sizes are:
${SIZES.map((sizeOption) => `- ${sizeOption.name}: ${sizeOption.description}`).join('\n')}
   `;
};

/**
 * @description Generates a textual summary of all available topping options.
 * It iterates through the `TOPPINGS` array and lists each topping's
 * name and description in a human-readable, bullet-point format.
 *
 * This function is useful for providing an AI model or user with a clear,
 * comprehensive overview of customization choices.
 *
 * @returns A string containing the summary of available toppings.
 */
export const availableToppingsSummary = (): string => {
  return `Available toppings are:
${TOPPINGS.map((toppingOption) => `- ${toppingOption.name}: ${toppingOption.description}`).join('\n')}
   `;
};

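To see what the LLM actually receives, here is the string createDrinkItemSummary produces for a hypothetical menu entry (not one of the real DRINKS constants):

import { createDrinkItemSummary } from './lib/util/summaries';

// A made-up menu entry for illustration.
const latte = {
  name: 'Latte',
  description: 'espresso with steamed milk.',
  supportMilk: true,
  supportSweeteners: true,
  supportSyrup: true,
  supportTopping: false,
  supportSize: true,
};

console.log(createDrinkItemSummary(latte));
// "A drink named Latte It is described as espresso with steamed milk.
//  It can be made with milk. It can be made with sweeteners. It can be made
//  with syrup. It cannot be made with topping. It can be made in different sizes."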

Now that we have the core foundations of our application, let's establish a database connection so that we can persist orders.

src/app.module.ts

import { Module } from '@nestjs/common';
import { AppController } from './app.controller';
import { AppService } from './app.service';
import { ChatsModule } from './chats/chats.module';
import { MongooseModule } from '@nestjs/mongoose';

@Module({
  imports: [MongooseModule.forRoot(process.env.MONGO_URI || ''), ChatsModule],
  controllers: [AppController],
  providers: [AppService],
})
export class AppModule {}

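Both the module above and the chat service below read their configuration from environment variables. A hypothetical .env for local development (Nest does not load this file automatically; wire it up with @nestjs/config or your process manager):

# Made-up values; use your own connection string and API key.
MONGO_URI=mongodb://localhost:27017/drinks_db
GOOGLE_API_KEY=your-google-api-key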

src/chats/schemas/order.schema.ts: a MongoDB schema for the order

Please note that we are using MongoDB for storing orders and conversations.
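
The full file lives in the linked repository; here is a minimal sketch of what it could look like, assuming its fields mirror the Zod OrderSchema from earlier:

import { Prop, Schema, SchemaFactory } from '@nestjs/mongoose';
import { HydratedDocument } from 'mongoose';

@Schema({ timestamps: true })
export class Order {
  @Prop({ required: true }) drink: string;
  @Prop({ required: true }) size: string;
  @Prop({ required: true }) milk: string;
  @Prop({ required: true }) syrup: string;
  @Prop({ required: true }) sweeteners: string;
  @Prop({ required: true }) toppings: string;
  @Prop({ required: true, min: 1, max: 10 }) quantity: number;
}

export type OrderDocument = HydratedDocument<Order>;
export const OrderSchema = SchemaFactory.createForClass(Order);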

Time to write the agent logic 🤩

src/chats/chats.service.ts

// Core NestJS and MongoDB Imports
import { Injectable } from '@nestjs/common'; // Decorator to mark a class as a provider that can be injected.
import { MongoClient } from 'mongodb'; // MongoDB client for database connection.
import { InjectModel } from '@nestjs/mongoose'; // Decorator to inject Mongoose models into services.
import { Model } from 'mongoose'; // Mongoose Model type for interacting with MongoDB.

// Langchain and LangGraph Imports
import { tool } from '@langchain/core/tools'; // Utility for defining tools that an AI agent can use.
import {
  ChatPromptTemplate,
  MessagesPlaceholder,
} from '@langchain/core/prompts'; // Used to construct prompt templates for LLMs, including placeholders for dynamic content.
import { ChatGoogleGenerativeAI } from '@langchain/google-genai'; // Langchain integration for Google's Generative AI models (e.g., Gemini).
import { StateGraph } from '@langchain/langgraph'; // Core LangGraph class for defining stateful AI agent workflows.
import { ToolNode } from '@langchain/langgraph/prebuilt'; // A pre-built LangGraph node specifically for executing tools.
import { Annotation } from '@langchain/langgraph'; // Used for defining the state schema in LangGraph with reducer logic.
import { AIMessage, BaseMessage, HumanMessage } from '@langchain/core/messages'; // Types for various message roles in a chat conversation.
import { START, END } from '@langchain/langgraph'; // Special symbols for defining the start and end nodes of a LangGraph.
import { MongoDBSaver } from '@langchain/langgraph-checkpoint-mongodb'; // Checkpointer for saving/loading graph state to MongoDB.

// Application-Specific Schema and Data Imports
import { Order } from 'src/data/schema/order.schema'; // Mongoose schema/model for the Order document in MongoDB.
import {
  OrderSchema as OrderSchemaData, // Zod schema defining the structure of an order for validation.
  OrderParser, // Langchain StructuredOutputParser for the OrderSchemaData. IMPORTANT for AI.
  OrderType, // TypeScript type inferred from OrderSchemaData.
} from 'src/util/schemas/orders/Order.schema';
import { DrinkParser } from 'src/util/schemas/drinks/Drink.schema'; // Langchain StructuredOutputParser for the Drink schema.
import { DRINKS } from 'src/util/constants/drinks_data'; // Constant array containing all available drink definitions.

// Data Summarization Utilities
import {
  availableToppingsSummary, // Function to summarize available toppings.
  createDrinkItemSummary, // Function to summarize a single drink item.
  createSweetenersSummary, // Function to summarize available sweeteners.
  createAvailableMilksSummary, // Function to summarize available milks.
  createSyrupsSummary, // Function to summarize available syrups.
  createSizesSummary, // Function to summarize available sizes.
} from 'src/util/summaries/drink';

import z from 'zod'; // Zod library for schema definition and validation.

// Configuration from Environment Variables
const GOOGLE_API_KEY = process.env.GOOGLE_API_KEY || ''; // Google API key for Generative AI.
const client: MongoClient = new MongoClient(process.env.MONGO_URI || ''); // MongoDB client instance for database connection.
const database_name = 'drinks_db'; // Name of the MongoDB database to use.

/**
 * @description A NestJS service responsible for orchestrating the AI chat agent.
 * This service uses LangGraph to manage conversational state and logic, integrates
 * Google's Generative AI for natural language understanding, and interacts with
 * MongoDB for persisting order data.
 */
@Injectable()
export class ChatService {
  /**
   * @description Constructor for the ChatService.
   * Injects the Mongoose Order model, allowing the service to interact with the 'orders' collection.
   * @param orderModel Mongoose Model for the Order document.
   */
  constructor(@InjectModel(Order.name) private orderModel: Model<Order>) {}

  /**
   * @description Main function to interact with the AI chat agent.
   * This asynchronous function processes a user query within a specific conversation thread,
   * leverages a LangGraph agent to generate a response, potentially creates an order,
   * and saves/loads conversation state using a MongoDB checkpointer.
   *
   * @param thread_id A unique identifier for the conversation thread, used for state management.
   * @param query The user's input message.
   * @returns A JSON object containing the AI's response, current order state, suggestions, and progress.
   */
  chatWithAgent = async (thread_id: string, query: string) => {
    // Ensure connection to MongoDB is established.
    await client.connect();

    // LangGraph State Definition
    // Defines the structure of the conversational state that the graph will manage.
    // 'messages' is an array of BaseMessage, and the reducer ensures new messages
    // are appended to the existing list, maintaining conversation history.
    const graphState = Annotation.Root({
      messages: Annotation<BaseMessage[]>({ reducer: (x, y) => [...x, ...y] }),
    });

    // Define the 'create_order' tool for the AI agent.
    // This tool allows the AI to programmatically interact with the application's backend
    // to create a new order in the database.
    const orderTool = tool(
      // The actual asynchronous function that gets executed when the tool is called by the AI.
      async ({ order }: { order: OrderType }) => {
        try {
          console.log({ order }); // Log the order object for debugging.
          await this.orderModel.create(order); // Persist the order to the MongoDB database.
          return 'Order created successfully'; // Return a success message.
        } catch (error) {
          console.log(error); // Log any errors during order creation.
          return 'Failed to create the order'; // Return a failure message.
        }
      },
      {
        // Schema definition for the tool's input.
        // This is crucial for guiding the LLM on what arguments to provide when calling this tool.
        // It uses Zod to define an object expecting an 'order' property that conforms to OrderSchemaData.
        schema: z.object({
          order: OrderSchemaData.describe(
            'This is the order that will be passed to the create_order tool',
          ), // Adds a description for the 'order' parameter, further aiding LLM understanding.
        }),
        name: 'create_order', // The name the AI will use to refer to this tool.
        description: 'Creates a new order in the database', // A description for the AI to understand the tool's purpose.
      },
    );

    // Array of all tools available to the AI agent.
    const tools = [orderTool];

    // --- AI Call Node Definition ---
    // This function defines the logic for the 'agent' node in the LangGraph.
    // It's responsible for constructing the prompt, invoking the LLM, and returning its response.
    const callModel = async (states: typeof graphState.State) => {
      // Constructs the chat prompt for the LLM.
      // It includes a detailed system message, current time, and a placeholder for previous messages.
      const prompt = ChatPromptTemplate.fromMessages([
        {
          role: 'system',
          content: `
            You are a helpful assistant that helps people buy drinks from Starbucks.
            You take the user's request and find missing details based on what a full order looks like.
            A full order looks like this: ${OrderParser.getFormatInstructions()}.

            *IMPORTANT
            You have access to a create_order tool. This tool is used to create orders in the database, and you should call it when
            you want to create an order.

            You should confirm the order once the tool call has been successful, and if it fails, inform the user.

            Each drink has its own set of properties like size, milk, syrup, sweetener, topping.
            Here is what a drink schema looks like: ${DrinkParser.getFormatInstructions()}
            Make sure to ask for any missing details in the order before creating it.
            If the user asks for a modification that cannot be done on the chosen drink, just tell them it's not possible.
            If the user asks for something that is not related to the order, just politely tell them you can only help with drink orders.

            Here is the list of all available drinks and what they can accept as modifications: ${DRINKS.map((drink) => `-${createDrinkItemSummary(drink)}`).join('\n')}

            Here is the list of all available sweeteners: ${createSweetenersSummary()},
            Here is the list of all available toppings: ${availableToppingsSummary()}
            Here is the list of all available milks: ${createAvailableMilksSummary()}
            Here is the list of all available syrups: ${createSyrupsSummary()}
            Here is the list of all available sizes: ${createSizesSummary()}

            Here is what the order schema looks like: ${OrderParser.getFormatInstructions()}.

            If the query is not clear, you should tell the user that the query is not clear.

            **VERY IMPORTANT
            Once the order is ready you should ask the user to confirm, and if they do you should call the create_order tool right away
            and only come back to the user once the order has succeeded or failed.

            In the response you should include this part; it's used by the frontend to track the progress of the current order and to track chats. It's an object expressed in JSON format:

            "message": "The AI response (example: do you want it with some sugar?)",
            "current_order": "The current order that is being constructed",
            "suggestions": "The list of options the user can choose from based on your message",
            "progress": "This field shows whether the order has been placed. Once the user confirms the order you call the create_order tool directly, then you mark the progress as 'completed'"

            **IMPORTANT
            Be friendly and use emojis when you want to add some humor.

            **IMPORTANT
            For fields that haven't been filled yet, you place null.
            Never miss this part in any message you send.
          `,
        },
        new MessagesPlaceholder('messages'), // Placeholder for injecting the current conversation history.
      ]);

      // Formats the prompt with dynamic data (current time and messages from graph state).
      const formattedPrompt = await prompt.formatMessages({
        time: new Date().toISOString(),
        messages: states.messages,
      });

      // Initializes the Google Generative AI chat model and binds the available tools to it.
      // Binding tools allows the LLM to 'decide' when and how to call the defined tools.
      const chat = new ChatGoogleGenerativeAI({
        model: 'gemini-2.0-flash', // Specifies the AI model to use.
        temperature: 0, // Sets a low temperature for more deterministic and factual responses.
        apiKey: GOOGLE_API_KEY,
      }).bindTools(tools); // Informs the LLM about the functions it can call (our `orderTool`).

      // Invokes the LLM with the formatted prompt.
      const result = await chat.invoke(formattedPrompt);
      // Returns the LLM's response as a message, updating the graph state.
      return { messages: [result] };
    };

    // --- Conditional Edge Logic ---
    // This function determines the next step in the LangGraph based on the AI's response.
    // If the AI has decided to call a tool, the graph transitions to the 'tools' node;
    // otherwise, the conversation ends.
    const shouldContinue = (state: typeof graphState.State) => {
      const messages = state.messages;
      const lastMessage = messages[messages.length - 1] as AIMessage; // Cast the last message to AIMessage to access tool_calls.
      // If the last AI message contains tool calls, go to the 'tools' node; otherwise, end.
      return lastMessage.tool_calls?.length ? 'tools' : END;
    };

    // --- Tool Node Definition ---
    // A pre-built LangGraph node that executes any tool calls specified by the LLM.
    const toolsNode = new ToolNode<typeof graphState.State>(tools);

    // --- LangGraph Construction ---
    // Builds the conversational workflow graph.
    const graph = new StateGraph(graphState)
      .addNode('agent', callModel) // Adds the AI interaction as the 'agent' node.
      .addNode('tools', toolsNode) // Adds the tool execution as the 'tools' node.
      .addEdge(START, 'agent') // The conversation always starts by going to the 'agent' (LLM call).
      .addConditionalEdges('agent', shouldContinue) // After the 'agent' responds, decide whether to go to 'tools' or END.
      .addEdge('tools', 'agent'); // After a tool is executed, return to the 'agent' for further processing/response.

    // Initialize MongoDB checkpointer for state persistence.
    // This allows the conversation state to be saved and loaded across multiple turns,
    // enabling long-running, stateful conversations.
    const checkpointer = new MongoDBSaver({ client, dbName: database_name });

    // Compile the graph into a runnable application.
    const app = graph.compile({ checkpointer });

    // Invoke the LangGraph application with the initial user query.
    // `thread_id` is passed as a configurable option to the checkpointer
    // to identify and load/save the correct conversation state.
    const finalState = await app.invoke(
      { messages: [new HumanMessage(query)] }, // Initial message from the user.
      { recursionLimit: 15, configurable: { thread_id } }, // Set recursion limit and pass thread_id for state.
    );

    /**
     * @description Helper function to extract JSON objects from a string response.
     * The LLM is instructed to embed its structured response within a JSON code block.
     * This function parses that block.
     * @param response The raw string response from the AI.
     * @returns The parsed JSON object.
     * @throws If no valid JSON block is found or parsing fails.
     */
    function extractJson(response: any) {
      console.log(response); // Log the raw response for debugging.
      if (typeof response !== 'string') throw response;
      // Regex to find a JSON block enclosed in a ```json ... ``` fence.
      const match = response.match(/```json\s*([\s\S]*?)\s*```/i);
      if (match && match[1]) {
        return JSON.parse(match[1].trim()); // Parse and return the JSON content.
      }
      throw response; // If no JSON block is found, throw the original response.
    }

    // Retrieve the last message from the final graph state.
    const lastMessage = finalState.messages.at(-1) as AIMessage;
    // Extract and return the JSON part of the last AI message.
    return extractJson(lastMessage.content);
  };
}
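
Before unpacking the code, here is a hypothetical example of the object extractJson hands back, following the response contract we set in the system prompt (all values are made up):

const sampleResponse = {
  message: 'Great choice! ☕ What size would you like your Latte?',
  current_order: {
    drink: 'Latte',
    size: null,
    milk: null,
    syrup: null,
    sweeteners: null,
    toppings: null,
    quantity: null,
  },
  suggestions: ['Small', 'Medium', 'Large'],
  progress: null, // becomes 'completed' once create_order succeeds
};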

Well, that was quite a lot, so let me unpack it. In the example above, we implement an AI agent capable of guiding the user through the ordering process and deciding when to process an order using the provided tools. To achieve this, we started by initializing our state store, aka Annotation.Root. The annotation root can be seen as the single source of truth for the state store. Once the workflow has been compiled, what happens internally is that the framework calls the database, loads all previous conversations based on the thread ID, and passes them to the state store so that each node of our graph can access the conversation history.

With the state store created, we proceeded to create tools for our AI agent with the built-in tool() utility function from @langchain/core. With this superpower, we defined how our tool should be used, what it is used for, and how data should be passed to it. You should be very cautious with this step, since sending invalid data that does not follow what you specified in the schema validation will simply cause your application to crash.

After creating our toolset and initializing our state, we went ahead and created the function for calling the LLM (large language model). I don't see anything particularly complex there that requires a deep explanation, so I will skip this part 😅. Feel free to leave your questions in the comment section.

With all of that in place, we arrived at the final and most important part of all: the graph definition.

Graph definition (hence the name LangGraph)

const graph = new StateGraph(graphState)

The graph is where all the magic happens. A graph is basically a state machine of the available tools and nodes in our workflow, describing how those nodes interact with each other. Each framework has its own conventions and tools. With LangGraph, you start by defining which nodes are available in the workflow using the .addNode('agent', callModel) method

new StateGraph(graphState)
      .addNode('agent', callModel)
  1. Then we define the workflow pipeline, describing which node to call first:
   .addEdge(START, 'agent')

In our example, we decided to call the agent node (the LLM) first.

  2. Then we created a conditional edge function that reads the last message and checks whether there is a tool-call request; if it doesn't find one, it signals the workflow that there is nothing more to invoke and that it should stop.

    /**
     * A conditional edge function that determines the next step in the graph.
     * If the last AI message contains tool calls, the graph moves to the
     * 'tools' node; otherwise, the conversation ends.
     * @param state The current state of the graph.
     * @returns The name of the next node ('tools') or END.
     */
    const shouldContinue = (state: typeof graphState.State) => {
      const messages = state.messages;
      const lastMessage = messages[messages.length - 1] as AIMessage;
      return lastMessage.tool_calls?.length ? 'tools' : END;
    };


   .addConditionalEdges('agent', shouldContinue)
  3. And at last we have .addEdge('tools', 'agent'). This edge tells the workflow how to order calls: here we say, after invoking the tools node, return immediately to the LLM with the results you got, for further planning and decision-making based on the tool response.

Here is the full example from our project

 const graph = new StateGraph(graphState)
      .addNode('agent', callModel)
      .addNode('tools', toolsNode)
      .addEdge(START, 'agent')
      .addConditionalEdges('agent', shouldContinue)
      .addEdge('tools', 'agent');

Checkpoints and conversation history

In our example, we use MongoDBSaver, a checkpointer provided by the LangChain team for storing conversations in a MongoDB database. It is very easy to use and straightforward. All conversations are grouped by thread ID, and all the data retrieval is managed by LangGraph. So sweet, right 😙

  // Initializes a MongoDB checkpointer to save and load graph state.
    const checkpointer = new MongoDBSaver({ client, dbName: database_name });

    // Compiles the graph with the checkpointer.
    const app = graph.compile({ checkpointer });

Et voilà: you compile your workflow with the checkpointer, invoke it with a thread_id, and that's all you have to do.

    // Compiles the graph with the checkpointer.
    const app = graph.compile({ checkpointer });

    /**
     * Invokes the compiled graph with the user's initial query.
     * The `configurable` object is used by the checkpointer to identify the thread.
     */
    const finalState = await app.invoke(
      { messages: [new HumanMessage(query)] },
      { recursionLimit: 15, configurable: { thread_id } },
    );

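One piece this walkthrough does not show is the HTTP entry point. To actually talk to the agent you need a controller in front of the service; here is a minimal sketch (the route and request body shape are my own choices, not from the repository):

import { Body, Controller, Post } from '@nestjs/common';
import { ChatService } from './chats.service';

@Controller('chats')
export class ChatsController {
  constructor(private readonly chatService: ChatService) {}

  @Post()
  async chat(@Body() body: { thread_id: string; query: string }) {
    // Each thread_id maps to one conversation; the MongoDB checkpointer
    // loads that thread's history before the agent runs.
    return this.chatService.chatWithAgent(body.thread_id, body.query);
  }
}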

Well, if you are confused, don't worry, I was too 😂😅; with more practice, everything will start to make sense. Below I have linked the full source code and the materials I used to come up with this project.

source code: https://github.com/DjibrilM/langgraph-starbucks-agent

Resources

LangGraph documentation: https://langchain-ai.github.io/langgraphjs/tutorials/quickstart/

ReAct: Synergizing Reasoning and Acting in Language Models: https://arxiv.org/abs/2210.03629
