
Kevin Naidoo

Posted on • Originally published at kevincoder.co.za

LLM Super Powers with Langchain Agents

Langchain is one of my favorite tools at the moment because of how it simplifies complex machine-learning tasks.

In some cases, you may need more than just a well-written prompt; you may want to trigger different data sources or actions based on what the user asks.

For example, on an e-commerce site, if the user asks to view a list of "shoes", you would probably run a keyword/semantic search and return a list of matching shoes.

If they ask about delivery information, you may want to look up that data from an SQL DB or API.

Langchain agents are a powerful mechanism at your disposal that will enable you to build complex custom LLM chatbots. In this article, we will go over what an agent is and how to build one.

What is a Langchain Agent?

When you use an LLM like OpenAI's "gpt-3.5-turbo", you will typically send the model a prompt consisting of one or more messages, and the LLM will respond accordingly.

So essentially it's text in and text out. This is fine for a chatbot performing one particular task, like answering support questions, but what if you need to change the data source, do a web search, or record something in the DB?

Agent illustration

This is where Agents come in handy; an Agent is basically a "task executor". It will allow the LLM to execute functions and other code in your application based on reasoning and user input.

Think of it like a switch statement: depending on which pathway matches, the switch statement executes that particular block of code.

Agents are not exclusive to Langchain. Each LLM has a different way of handling agents; Langchain simply provides a consistent API to work with regardless of which backend LLM you are using.
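To make the switch-statement analogy concrete, here is a minimal, framework-free sketch of the kind of routing an agent performs. All names here are hypothetical; in a real agent, the LLM does this reasoning itself rather than keyword checks:

```python
def route(user_message: str) -> str:
    """Toy router: pick a 'tool' based on the user's intent.

    A real agent lets the LLM decide this; we fake it with
    keyword checks purely to illustrate the control flow.
    """
    text = user_message.lower()
    if "delivery" in text:
        return "search_delivery_information"
    elif "shoes" in text:
        return "product_search"
    else:
        return "generic_answer"

print(route("When is my delivery arriving?"))  # search_delivery_information
print(route("Show me running shoes"))          # product_search
```

The agent's job is exactly this branching, except the branch condition is the model's reasoning over your tool descriptions instead of hard-coded string matching.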

Tools

Since agents are task executors, we need some kind of "callback" for the agent to execute, such as a function or class.

Tools take in either the raw prompt or a list of arguments and return some sort of output. Usually, you would return a string, but it's also possible to return more complex data like a LangChain document.

There are multiple ways of declaring a tool. We will cover the decorator approach since it's the most common and easiest solution to understand.

Here is an example:

import requests

from langchain.tools import tool

@tool
def search_delivery_information(prompt: str) -> str:
    """When the user requests delivery information"""

    # We now return some text from an external API.
    # The LLM will analyze this text and
    # return the appropriate answer to the user.
    return requests.get("/somewhere/delivery.json").text

Three essential components make up a tool:

1) @tool - This decorator will take care of handling the input/output of your function in a way that the LLM can understand.
2) """ - The docstring; think of this as a system prompt to the LLM, where you tell the LLM when to execute this function and provide any other useful context data.
3) The returned data. The LLM will ingest your function's response as context data and scope its response to that context.
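The decorator relies on plain Python introspection: the function's name and docstring are the raw ingredients that get packaged into the tool schema sent to the model. A quick framework-free illustration of where that metadata lives:

```python
def search_delivery_information(prompt: str) -> str:
    """When the user requests delivery information"""
    return "Delivery takes 3-5 business days."

# These attributes are what @tool reads and exposes to the LLM:
print(search_delivery_information.__name__)
# search_delivery_information
print(search_delivery_information.__doc__)
# When the user requests delivery information
```

This is why a clear, descriptive docstring matters so much: it is effectively the prompt that decides whether the model calls your tool at all.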

To clarify what I mean by "scope": without any context data, if the user asks "Which is the best shoe brand?", the LLM will give some generic response based on its training data, similar to asking ChatGPT directly, so it may respond with "Nike" or "Reebok".

However, if the tool returns "Adidas is our best brand.", then the LLM's response will regard "Adidas" as the best brand and not "Nike" or "Reebok".
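In other words, the tool's output is injected into the conversation before the model answers. Here is a simplified sketch of the message list the model effectively sees after a tool runs (the structure and content are illustrative, not Langchain's exact internal format):

```python
def build_messages(question: str, tool_output: str) -> list:
    """Assemble the messages the LLM sees once a tool has run."""
    return [
        {"role": "system", "content": "You are an e-commerce assistant."},
        {"role": "user", "content": question},
        # The tool's result becomes extra context the model must respect:
        {"role": "tool", "content": tool_output},
    ]

messages = build_messages(
    "Which is the best shoe brand?",
    "Adidas is our best brand.",
)
```

Because the tool message sits alongside the user's question, the model grounds its answer in that context instead of falling back to its training data.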

Putting it all together

Okay great! Now you know what an agent is and how to create your custom callback functions to help the LLM better answer the user's question.

Let's now build an Agent and link it to our custom tool:

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser

# Create a standard Chat LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo-0125", temperature=0)

# Create a list of all tools you want to enable
tools = [search_delivery_information]

# Connect our tools to the LLM
llm_with_tools = llm.bind_tools(tools) 

# Build a chat prompt.
# Notice we have placeholders for the user's input
# - and a second placeholder for the Agent's context data.
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an e-commerce assistant.",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_ctx"),
    ]
)

# Next we build the actual agent.
agent = (
    {
        "input": lambda x: x["input"],
        "agent_ctx": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)

The above block might seem confusing at first glance. The first key, "input", is the user's input; the prompt template created earlier will replace "{input}" with the user's actual question or message.

The second key, "agent_ctx", handles the tool output: since our tool callbacks are just Python functions, there needs to be a translation step that converts their output into something the model can understand and the agent can transmit via the REST API.

You will also notice we chain one more object at the end, "OpenAIToolsAgentOutputParser", which parses the LLM's raw response into something the agent can act on: either a tool call to execute or a final answer.

Finally, we can instantiate an agent executor and prompt the LLM:

from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=False)

question = "When will my order arrive?"  # any user question
result = agent_executor.invoke({"input": question})
print(result)
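Under the hood, the executor runs a simple loop: call the agent, and if it asks for a tool, run that tool and feed the result back as an intermediate step; otherwise return the final answer. Here is a stripped-down, framework-free sketch of that loop, with all names hypothetical:

```python
def run_agent(agent_step, tools: dict, user_input: str) -> str:
    """Minimal agent loop. agent_step returns either
    ("tool", name, args) or ("finish", answer)."""
    intermediate_steps = []
    while True:
        action = agent_step(user_input, intermediate_steps)
        if action[0] == "finish":
            return action[1]
        _, tool_name, tool_args = action
        # Run the requested tool and record the observation,
        # which gets fed back to the agent on the next pass.
        observation = tools[tool_name](tool_args)
        intermediate_steps.append((tool_name, observation))

# A fake single-tool "agent" purely for illustration:
def fake_agent(user_input, steps):
    if not steps:
        return ("tool", "search_delivery_information", user_input)
    return ("finish", f"Based on our records: {steps[-1][1]}")

tools = {"search_delivery_information": lambda _: "Delivery takes 3-5 days."}
print(run_agent(fake_agent, tools, "When will my order arrive?"))
# Based on our records: Delivery takes 3-5 days.
```

The real AgentExecutor adds error handling, iteration limits, and streaming on top, but the core control flow is this loop of reason, act, observe, repeat.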

Top comments (2)

ProgramCrafter

I feel this post would be better with a section about security, since if you pass the user's input to the LLM, you should validate its requests to your servers

Kevin Naidoo

Thanks for the feedback, yeah 100%. This was just a primer on the concepts and not meant to be copied and pasted; all web best practices should still apply here.

I will, however, look at adding a section to explain common security best practices.