Why a new framework?
Earlier this year, at my new workplace, we started getting more serious about building LLM-powered apps. I've previously built and trained task-specific language models, but following the introduction of GPT-3 and ChatGPT, I found myself increasingly relying on the OpenAI APIs. At the same time, langchain was starting to get very popular, so I was super excited to try it out for one of the PoCs we were building.
As I usually do when trying a new framework, I immediately headed to the official documentation page. My first impression was that the resources were somewhat insufficient, and I had trouble running some examples since some classes were deprecated. But such issues were more or less expected since the project was relatively new.
I was still committed to learning the framework, so I headed to GitHub. However, after investing some time in understanding a good portion of the classes and abstractions, I had a rather anticlimactic realization. I remember sitting next to my colleague and saying something along the lines of, "Wow! It's just string manipulations and API calls. Why does it need to be so complicated?"
After my initial encounter, my feeling was that langchain made it easy to start developing LLM apps, but if you wanted to build something more customized, there were a few aspects that needed improvement:
There were too many classes and abstractions.
The core agent and chain classes had default hidden prompts and were too opinionated while being hard to customize.
Debugging the classes and understanding their inner workings was quite challenging.
As is often the case with these conversations, we finished with the classic "I bet I can do it better in a weekend" half-joking conclusion.
Well, here we are a few months later.
Meet LLMFlows
LLMFlows is a half-jokingly-started weekend project that evolved into a serious attempt to create a better framework for building LLM (Large Language Model) applications.
At its core, LLMFlows is based on three fundamental principles:
Simple
Create a well-documented framework with minimal abstractions without compromising on capabilities. It's just string manipulations and API calls - why does it have to be so complicated?
Explicit
Define an explicit API enabling users to write clean and readable code while easily creating flows of LLMs interacting with each other. Classes should not have hidden prompts or opinionated logic. The developer should have complete control and be in the driver's seat.
Transparent
Help users have full transparency over their LLM-powered apps by providing complete information for each prompt, LLM call, and all other components, making monitoring, maintaining, and debugging easy.
In this introductory post, we will review some of the main abstractions of LLMFlows and cover some basic patterns for building LLM-powered applications. Finally, we will see how we can use the Flow and FlowStep classes to easily create explicit and transparent LLM apps.
Installation
If you want to code along as you read this post, you can install LLMFlows using pip:
pip install llmflows
To check out the source code and examples, visit our GitHub repository. The official documentation and user guides are available at https://llmflows.readthedocs.io.
LLMs
LLMs are one of the main abstractions in LLMFlows. LLM classes are wrappers around LLM APIs such as OpenAI's APIs. They provide methods for configuring and calling these APIs, retrying failed calls, and formatting the responses.
OpenAI's GPT-3 is one of the most commonly used LLMs and is available through their completion API. LLMFlows' OpenAI class is a wrapper around this API. It can be configured in the following way:
from llmflows.llms import OpenAI
openai_llm = OpenAI(api_key="<your-openai-api-key>")
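Besides the API key, the constructor can also take model settings. Here is a minimal sketch; the parameter names are assumptions inferred from the model configuration shown later in this post, so check the official docs for the exact signature:

# A configuration sketch - the parameter names below (model, temperature,
# max_tokens) are assumptions inferred from the call metadata shown later
# in this post, not a confirmed signature.
openai_llm = OpenAI(
    api_key="<your-openai-api-key>",
    model="text-davinci-003",
    temperature=0.7,
    max_tokens=500,
)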
All LLM classes have .generate() and .generate_async() methods for generating text. The only thing we need to provide to start generating text is a prompt string:
prompt_result, call_data, model_config = openai_llm.generate(
    prompt="Generate a cool title for an 80s rock song"
)
The .generate() method returns the generated text completion, API call information, and the configuration used to make the call:
print(prompt_result)
"Living On The Edge of Time"
Chat LLMs
Chat LLMs gained popularity after ChatGPT was released and the chat completions API from OpenAI became publicly available. LLMFlows provides an OpenAIChat class that is an interface for this API.
Regular LLMs like GPT-3 require just an input prompt to make a completion. On the other hand, chat LLMs require a conversation history. The conversation history is represented as a list of messages between a user and an assistant. This conversation history is sent to the model, and a new message is generated based on it.
OpenAI's chat completion API supports three message types in its conversation history:
- system (a system message specifying the behavior of the LLM)
- user (a message from the user)
- assistant (the LLM's response to the user message)
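Under the hood, OpenAI's chat completion API expects this history as a list of role-tagged messages, along the lines of:

# The raw conversation history format used by OpenAI's chat completion API:
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hey"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
]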
LLMFlows provides a MessageHistory class to manage the required conversation history for chat LLMs.
You can create the OpenAIChat and MessageHistory objects in the following way:
from llmflows.llms import OpenAIChat, MessageHistory
chat_llm = OpenAIChat(api_key="<your-openai-api-key>")
message_history = MessageHistory()
After we define the OpenAIChat and MessageHistory objects, we can use them to build a simple chatbot assistant with a few lines of code:
while True:
    user_message = input("You:")
    message_history.add_user_message(user_message)
    llm_response, call_data, model_config = chat_llm.generate(
        message_history
    )
    message_history.add_ai_message(llm_response)
    print(f"LLM: {llm_response}")
You: Hey
LLM: Hello! How can I assist you today?
You: ...
In the snippet above, the user input is added as a user message to the message_history object via the add_user_message() method. The message_history object is then passed to the generate() method of the chat_llm instance, returning a string response, the associated API call information, and the model configuration. The resulting llm_response is then added to message_history with the add_ai_message() method, building up the dialogue history as the loop continues.
Prompt Templates
LLMFlows' PromptTemplate class is another core abstraction. It allows users to create strings with variables that can be dynamically populated to produce text prompts.
We can create prompt templates by passing in a string. The variables within the string are defined with curly brackets:
from llmflows.prompts import PromptTemplate

title_template = PromptTemplate(
    "Write a title for a {style} song about {topic}."
)
Once a prompt template object is created, an actual prompt can be generated by providing the required variables. Let's imagine we want to use the template above to create a song title for a hip-hop song about friendship:
title_prompt = title_template.get_prompt(
    style="hip-hop",
    topic="friendship"
)
print(title_prompt)
"Write a title for a hip-hop song about friendship"
The PromptTemplate class replaces the defined variables with the provided values, and the resulting prompt can then be used with an LLM:
llm = OpenAI(api_key="<your-openai-api-key>")
song_title, _, _ = llm.generate(title_prompt)
print(song_title)
"True to the Crew"
Combining LLMs
So far, we've covered the OpenAI, OpenAIChat, MessageHistory, and PromptTemplate classes, and we saw how we can quickly build simple LLM applications that generate outputs based on dynamically created prompts.
Another common pattern when building LLM applications is using the output of one LLM as an input to another LLM. Imagine we want to generate a title for a song, then create lyrics based on the title, and finally paraphrase the lyrics in a particular style.
Let's create the prompts for the three steps:
from llmflows.prompts import PromptTemplate

title_template = PromptTemplate(
    "What is a good title of a song about {topic}"
)

lyrics_template = PromptTemplate(
    "Write the lyrics for a song called {song_title}"
)

heavy_metal_template = PromptTemplate(
    "paraphrase the following lyrics in a heavy metal style: {lyrics}"
)
Now we can use these prompt templates to generate the title based on an initial input, and each LLM output can serve as input for the variables in the following prompt template.
from llmflows.llms import OpenAI

llm = OpenAI(api_key="<your-openai-api-key>")

title_prompt = title_template.get_prompt(topic="friendship")
song_title, _, _ = llm.generate(title_prompt)

lyrics_prompt = lyrics_template.get_prompt(song_title=song_title)
song_lyrics, _, _ = llm.generate(lyrics_prompt)

heavy_metal_prompt = heavy_metal_template.get_prompt(lyrics=song_lyrics)
heavy_metal_lyrics, _, _ = llm.generate(heavy_metal_prompt)
Let's see what we managed to generate. For the first LLM call, we provided the topic manually, and we got the following title:
print("Song title:", song_title)
Song title: "Friendship Forever"
The song title was then passed as an argument for the {song_title} variable in the next template, and the resulting prompt was used to generate our song lyrics:
print("Song Lyrics:\n", song_lyrics)
Song Lyrics:
Verse 1:
It's been a long road, but we made it here
We've been through tough times, but we stayed strong through the years
We've been through the highs and the lows, but we never gave up
Friendship forever, through the good and the bad
Chorus:
Friendship forever, it will always last
Together we'll stand, no matter what the past
No mountain too high, no river too wide
Friendship forever, side by side
Verse 2:
We've been through the laughter and the tears
We've shared the joys and the fears
But no matter the challenge, we'll never give in
Friendship forever, it's a bond that will never break
Chorus:
Friendship forever, it will always last
Together we'll stand, no matter what the past
No mountain too high, no river too wide
Friendship forever, side by side
Bridge:
We'll be here for each other, through thick and thin
Our friendship will always remain strong within
No matter the distance, our bond will remain
Friendship forever, never fade away
Chorus:
Friendship forever, it will always last
Together we'll stand, no matter what the past
No mountain too high, no river too wide
Friendship forever, side by side
Finally, the generated song lyrics were passed as an argument to the {lyrics} variable of the last prompt template, which was used for the final LLM call that produced the heavy metal version of the lyrics:
print("Heavy Metal Lyrics:\n", heavy_metal_lyrics)
Heavy Metal Lyrics:
Verse 1:
The journey was hard, but we made it here
Through the hardships we endured, never wavering in our hearts
We've seen the highs and the lows, but never surrendering
Friendship forever, no matter the odds
Chorus:
Friendship forever, it will never die
Together we'll fight, no matter what we defy
No force too strong, no abyss too deep
Friendship forever, bound in steel we'll keep
LLM Flows
So far, we reviewed the LLM, MessageHistory, and PromptTemplate abstractions and introduced two common patterns for building LLM-powered apps. The first pattern was using prompt templates to create dynamic prompts, and the second was using the output of an LLM as input to another LLM.
In this section, we will introduce two new main abstractions of LLMFlows - FlowStep and Flow.
Flows and FlowSteps are the bread and butter of LLMFlows. They are simple but powerful abstractions that serve as the foundation for constructing Directed Acyclic Graphs (DAGs), where each FlowStep represents a node that calls an LLM. While these abstractions are designed to be simple and intuitive, they offer robust capabilities for managing dependencies, sequencing execution, and handling prompt variables.
Let's try to reproduce the previous example using Flows and FlowSteps. As a start, let's define the same prompt templates:
from llmflows.prompts import PromptTemplate

title_template = PromptTemplate(
    "What is a good title of a song about {topic}"
)

lyrics_template = PromptTemplate(
    "Write the lyrics for a song called {song_title}"
)

heavy_metal_template = PromptTemplate(
    "paraphrase the following lyrics in a heavy metal style: {lyrics}"
)
Once we have the prompt templates, we can start defining the flow steps. To create a flow step, we have to provide the required parameters for the FlowStep class:
- flow step name (must be unique)
- the LLM to be used within the flow step
- the prompt template to be used when calling the LLM
- output key (must be unique) - a variable that stores the result of the flow step and can be used as a prompt variable for other flow steps
from llmflows.flows import Flow, FlowStep
from llmflows.llms import OpenAI

title_flowstep = FlowStep(
    name="Title Flowstep",
    llm=OpenAI(),
    prompt_template=title_template,
    output_key="song_title",
)

lyrics_flowstep = FlowStep(
    name="Lyrics Flowstep",
    llm=OpenAI(),
    prompt_template=lyrics_template,
    output_key="lyrics",
)

heavy_metal_flowstep = FlowStep(
    name="Heavy Metal Flowstep",
    llm=OpenAI(),
    prompt_template=heavy_metal_template,
    output_key="heavy_metal_lyrics",
)
Once we have the FlowStep definitions, we can connect the flow steps:
title_flowstep.connect(lyrics_flowstep)
lyrics_flowstep.connect(heavy_metal_flowstep)
After creating the flow step graph, we can create the flow by providing the first flow step. Finally, we can start it by using its start() method and passing any required initial inputs. In this case, we need to pass the topic variable used in the first flow step.
songwriting_flow = Flow(title_flowstep)
result = songwriting_flow.start(topic="love", verbose=True)
That's it!
Although this might seem like a lot of extra abstractions to achieve the same functionality as in the previous example, if we start inspecting the results, we will see some advantages of using Flows and FlowSteps.
After running all flow steps, the flow will return detailed results for each individual step:
print(result)
{
  "Title Flowstep": {...},
  "Lyrics Flowstep": {...},
  "Heavy Metal Flowstep": {...}
}
Each entry holds the complete record for that flow step. Expanding the "Title Flowstep" entry, for example, shows everything about that call:
{
  "start_time": "2023-07-03T15:23:47.490368",
  "prompt_inputs": {
    "topic": "love"
  },
  "generated": "Love Is All Around Us",
  "call_data": {
    "raw_outputs": {
      "<OpenAIObject text_completion id=cmpl-7YMFPac1MKUje0jIyk4adkYssk4rQ at 0x107946f90> JSON": {
        "choices": [
          {
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null,
            "text": "Love Is All Around Us"
          }
        ],
        "created": 1688423027,
        "id": "cmpl-7YMFPac1MKUje0jIyk4adkYssk4rQ",
        "model": "text-davinci-003",
        "object": "text_completion",
        "usage": {
          "completion_tokens": 9,
          "prompt_tokens": 10,
          "total_tokens": 19
        }
      }
    },
    "retries": 0,
    "prompt_template": "What is a good title of a song about {topic}",
    "prompt": "What is a good title of a song about love"
  },
  "config": {
    "model_name": "text-davinci-003",
    "temperature": 0.7,
    "max_tokens": 500
  },
  "end_time": "2023-07-03T15:23:48.845386",
  "execution_time": 1.355005416,
  "result": {
    "song_title": "Love Is All Around Us"
  }
}
There is a lot to unpack here, but after finishing the flow, we have complete visibility of what happened at each flow step. By having this information, we can answer questions such as:
- When was a particular flowstep run?
- How much time did it take?
- What were the input variables?
- What was the prompt template?
- What did the prompt look like?
- What was the exact configuration of the model?
- How many times did we retry the request?
- What was the raw data the API returned?
- How many tokens were used?
- What was the final result?
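For instance, a few lines of Python can answer several of these questions directly from the result dictionary (the key names below are taken from the output shown above):

# Inspect each flow step using the keys from the result printed above.
for step_name, step_info in result.items():
    print(step_name)
    print("  inputs:", step_info["prompt_inputs"])
    print("  generated:", step_info["generated"])
    print("  execution time:", step_info["execution_time"])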
This ties into our "Simple, Explicit, and Transparent LLM apps" philosophy. The flow steps are defined explicitly, without any hidden prompts or logic, and the information they return gives developers complete visibility and the ability to log, debug, and maintain LLM apps easily.
Growing Complexity
These capabilities become increasingly helpful as applications grow in complexity.
For example, let's imagine that, for some reason, you want to create an app that generates a movie title, a movie song title based on the movie title, and a summary of the movie's two main characters, and finally creates song lyrics based on the movie title, the song title, and the two characters.
Here is how the flow looks: the movie title flow step feeds into the song title, characters, and lyrics flow steps, while the song title and characters flow steps also feed into the final lyrics flow step.
Let’s see how we can build this. We can start by defining the prompts that we need:
from llmflows.prompts import PromptTemplate

title_template = PromptTemplate(
    "What is a good title of a movie about {topic}?"
)

song_template = PromptTemplate(
    "What is a good song title of a soundtrack for a movie called {movie_title}?"
)

characters_template = PromptTemplate(
    "What are two main characters for a movie called {movie_title}?"
)

lyrics_template = PromptTemplate(
    "Write lyrics of a movie song called {song_title}. The main characters are {main_characters}"
)
Afterwards, we can create the four flowsteps:
from llmflows.flows import Flow, FlowStep
from llmflows.llms import OpenAI

openai_llm = OpenAI(api_key="<your-api-key>")

# Create flow steps
movie_title_flowstep = FlowStep(
    name="Movie Title Flowstep",
    llm=openai_llm,
    prompt_template=title_template,
    output_key="movie_title",
)

song_title_flowstep = FlowStep(
    name="Song Title Flowstep",
    llm=openai_llm,
    prompt_template=song_template,
    output_key="song_title",
)

characters_flowstep = FlowStep(
    name="Characters Flowstep",
    llm=openai_llm,
    prompt_template=characters_template,
    output_key="main_characters",
)

song_lyrics_flowstep = FlowStep(
    name="Lyrics Flowstep",
    llm=openai_llm,
    prompt_template=lyrics_template,
    output_key="song_lyrics",
)
Once we have defined the flow steps, we can connect them to create the DAG described above:
movie_title_flowstep.connect(
    song_title_flowstep,
    characters_flowstep,
    song_lyrics_flowstep
)
song_title_flowstep.connect(song_lyrics_flowstep)
characters_flowstep.connect(song_lyrics_flowstep)
Finally, we can create the Flow object and start the flow:
soundtrack_flow = Flow(movie_title_flowstep)
results = soundtrack_flow.start(topic="friendship", verbose=True)
And voilà! We created a complex flow as easily as the basic example before it. LLMFlows knows the dependencies and will run the flow steps in the correct order, ensuring all the inputs are available before running a given step.
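As before, results contains the detailed record for every step. Assuming the same structure as the result we inspected earlier, the final lyrics can be read like this:

# Each step's outputs live under its "result" key, keyed by the
# output_key we defined - assuming the structure from the earlier example.
final_lyrics = results["Lyrics Flowstep"]["result"]["song_lyrics"]
print(final_lyrics)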
In fact, you might have already noticed that there is an emerging pattern:
- Define what the flow should look like
- Create the prompt templates
- Create the flow steps
- Connect the flow steps to define the DAG
- Start the flow
By following this pattern, you can create any flow with any level of complexity as long as you can represent it as a DAG.
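To make the pattern concrete, here is the whole recipe condensed into a minimal two-step flow. The prompts and step names below are made up for illustration:

from llmflows.flows import Flow, FlowStep
from llmflows.llms import OpenAI
from llmflows.prompts import PromptTemplate

# 1. The flow: an idea step feeding a pitch step
# 2. Create the prompt templates
idea_template = PromptTemplate("Suggest an idea for a story about {topic}")
pitch_template = PromptTemplate("Write a one-line pitch for this idea: {idea}")

# 3. Create the flow steps
idea_flowstep = FlowStep(
    name="Idea Flowstep",
    llm=OpenAI(api_key="<your-api-key>"),
    prompt_template=idea_template,
    output_key="idea",
)
pitch_flowstep = FlowStep(
    name="Pitch Flowstep",
    llm=OpenAI(api_key="<your-api-key>"),
    prompt_template=pitch_template,
    output_key="pitch",
)

# 4. Connect the flow steps to define the DAG
idea_flowstep.connect(pitch_flowstep)

# 5. Start the flow
story_flow = Flow(idea_flowstep)
results = story_flow.start(topic="time travel", verbose=True)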
Conclusion
LLMFlows aims to simplify and speed up the process of building LLM apps while making the apps more explicit and transparent.
In this introductory post, we went through some of the main abstractions of LLMFlows - LLMs, PromptTemplates, FlowSteps, and Flows - and we saw how to use them to easily create LLM apps with complex prompt dependencies.
In our upcoming posts, we will cover topics like making parallel async calls to speed up applications, using vector databases to talk to our documents, building web apps, and even creating our own agents that can use different tools.
How can you help?
If you like the project, please consider starring the repository and sharing it with friends or on social media.
If you've tried LLMFlows and have some issues, feedback, or ideas, feel free to open an issue or reach out!
If you find LLMFlows exciting and are considering contributing, please check out the contributing section on GitHub.
Meanwhile, if you want to learn more before our next post, feel free to check out the user guide. Thank you for taking the time to read this post!