TLDR;
Intro
Artificial intelligence and natural language processing have made significant strides in recent years. Large language models (LLMs) like ChatGPT and GPT-4 have demonstrated remarkable capabilities in generating human-like text. However, controlling the output of these models can be challenging, especially when ensuring the generated text meets specific requirements or follows a desired format.
Keymaker, a Python library, provides a powerful, flexible, and extensible way to control the output of large language models. Keymaker makes it easier than ever to ensure your model's output is exactly what you need. It offers a simple and straightforward way to create and apply constraints on generated tokens.
Example TLDR;
The example below demonstrates several powerful features of Keymaker:
- Dynamic Prompts and Responses: Keymaker allows for dynamic generation of prompts and responses using a simple formatting and completion system.
- Model Flexibility: You can use different models for different parts of the prompt. chatgpt and LlamaCpp are used at will.
- Constraints: Constraints like OptionsConstraint, StopsConstraint, and RegexConstraint (and others not shown) give you the power to control the output of your model precisely.
- Mapping Function: The mapping function allows for transformation of the generated output before it is returned.
- Multiple Completions: You can generate multiple fully controlled completions for a single prompt, as demonstrated in the countdown example.
- Plain Python: Everything is plain Python. Prompts are str and control flow is plain, testable Python. Control flow and other logic is not tucked away in a string or DSL.
Now, let's get into it.
Keymaker in Practice
First, import the necessary modules and set up the model instances:
from keymaker.models import chatgpt, LlamaCpp
from keymaker import Prompt, CompletionConfig
from keymaker.constraints import RegexConstraint, OptionsConstraint, StopsConstraint
chat_model = chatgpt()
llama_model = LlamaCpp(model_path="path/to/llama/model/file")
Create the prompt with format parameters. The placeholders in the prompt are for various completions that will be generated using different models and constraints.
async def print_stream(completion):
    print(completion)

prompt = Prompt(
    """Time: {time}
User: {user_msg}
Assistant: Hello, {}{punctuation}
User: Can you write me a poem about a superhero named pandaman being a friend to {}?
Assistant:{poem}
User: What is 10+5?
Assistant: The answer is 10+5={math}
The final answer is {fin}!
User: Countdown from 5 to 0.
Assistant: 5, 4, {countdown}
""",
    chat_model,
    stream=print_stream,
)
Generate completions from different models by using the format method on the Prompt object. Different models and constraints are used for each completion:
filled_in = await prompt.format(
    CompletionConfig(constraint=OptionsConstraint({"Sam", "Nick"})),
    lambda p: p.completions[0],
    punctuation="!",
    user_msg="Hi, my name is Nick.",
    time="2023-07-23 19:33:01",
    poem=CompletionConfig(
        llama_model,
        max_tokens=250,
        constraint=StopsConstraint("User|Assistant", include=False),
    ),
    math=CompletionConfig(
        llama_model,
        constraint=RegexConstraint("[0-9]+", terminate_on_match=False),
        map_fn=int,
    ),
    fin=lambda p: CompletionConfig(
        llama_model,
        constraint=RegexConstraint(rf"{p.completions.math}|16"),
    ),
    countdown=lambda p: (
        CompletionConfig(
            llama_model,
            constraint=RegexConstraint("[0-9]"),
            map_fn=lambda s: f"{s}, ",
        )
        for _ in range(5)
    ),
)
Print the resulting completed prompt to see the generated completions:
print(filled_in)
This example demonstrates how to generate completions from different models like chatgpt and LlamaCpp using the format method in Keymaker. The prompt contains placeholders for various completions, and they are generated using different models and constraints.
With Keymaker installed, controlling the output of large language models becomes easier than ever. Check out the documentation for more information and examples on how to use Keymaker effectively.
In Depth Explanation
Let's break down the example to understand each portion of the code.
0. The Prompt
Take another look at the Prompt:
async def print_stream(completion):
    print(completion)

prompt = Prompt(
    """Time: {time}
User: {user_msg}
Assistant: Hello, {}{punctuation}
User: Can you write me a poem about a superhero named pandaman being a friend to {}?
Assistant:{poem}
User: What is 10+5?
Assistant: The answer is 10+5={math}
The final answer is {fin}!
User: Countdown from 5 to 0.
Assistant: 5, 4, {countdown}
""",
    chat_model,
    stream=print_stream,
)
Prompts are just strings, and Keymaker intends that to be the case with completions as well. The format method is another example of that: any {...} is a spot to insert text, just as with an ordinary format string.
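To make the "ordinary format string" comparison concrete, here is a small sketch that uses only Python's standard library (not Keymaker) to list the placeholders in a shortened copy of the prompt. It is meant only to show that positional {} and named {name} slots follow the usual format-string rules; how Keymaker parses the prompt internally is not shown in this post.

```python
from string import Formatter

# A shortened copy of the prompt template from the example.
template = (
    "Time: {time}\n"
    "User: {user_msg}\n"
    "Assistant: Hello, {}{punctuation}\n"
)

# Formatter.parse yields (literal_text, field_name, format_spec, conversion).
# field_name is "" for a bare {} and None for trailing literal text.
fields = [name for _, name, _, _ in Formatter().parse(template) if name is not None]
print(fields)  # ['time', 'user_msg', '', 'punctuation']
```

The empty string in the result is the positional {} slot, which is what the first positional argument to format fills in.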
At the end of the prompt, we have an opportunity to introduce default parameters for all completions. Here, we introduce a default model, chat_model, and a stream function that prints completions as they occur.
1. CompletionConfig with OptionsConstraint
We start by generating a completion using an OptionsConstraint. This constraint allows us to restrict the generated text to one of the provided options. In this case, the options are "Sam" and "Nick".
CompletionConfig(constraint=OptionsConstraint({"Sam", "Nick"}))
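For intuition about what restricting output to a set of options can mean at the token level, here is a rough conceptual sketch in plain Python. It is not Keymaker's implementation: the function name allowed_next_tokens and the toy vocabulary are hypothetical, and a real constraint would operate on a model's actual tokenizer.

```python
def allowed_next_tokens(generated: str, options: set[str], vocab: list[str]) -> list[str]:
    """Return the tokens that keep `generated + token` a prefix of some option."""
    return [
        tok
        for tok in vocab
        if tok and any(opt.startswith(generated + tok) for opt in options)
    ]

options = {"Sam", "Nick"}
vocab = ["S", "N", "a", "m", "ick", "Bob", "Sa"]  # toy vocabulary, purely illustrative

print(allowed_next_tokens("", options, vocab))   # ['S', 'N', 'Sa']
print(allowed_next_tokens("S", options, vocab))  # ['a']
```

At each step, only tokens that can still lead to one of the options survive, so the finished text has to be "Sam" or "Nick".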
2. Completion Using a Lambda Function
Next, we use a lambda function to reference the previous completion. This allows us to dynamically include the value generated by the previous completion in our prompt.
lambda p: p.completions[0]
3. Static and Dynamic Constraints
We use both static and dynamic constraints to control the generated text. Completion through the format method is flexible in that we can provide:
- Stringable: Something castable to a str
- Callable returning Stringable or CompletionConfig: Dynamic single component prompt
- Callable returning iterable of Stringable or CompletionConfig: Dynamic multi-component prompt
The Callable options allow the prompt to be customized dynamically based on the context. The CompletionConfig return allows configuring the completions directly in the prompt formatter.
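The three argument shapes listed above can be pictured with a small dispatcher. This is a simplified sketch, not Keymaker's code: resolve is a hypothetical helper, the prompt is represented by a plain str rather than a Prompt object, and the CompletionConfig case (which would trigger a model call) is left out.

```python
from collections.abc import Iterable

def resolve(part, prompt_so_far: str) -> list[str]:
    """Normalize a format argument into a list of string components."""
    if callable(part):
        # Dynamic case: the callable sees the prompt built so far.
        part = part(prompt_so_far)
    if isinstance(part, str) or not isinstance(part, Iterable):
        # Single component: anything castable to str.
        return [str(part)]
    # Multi-component: an iterable, consumed one item at a time.
    return [str(item) for item in part]

print(resolve("!", ""))                                    # ['!']
print(resolve(lambda p: p.upper(), "hi"))                  # ['HI']
print(resolve(lambda p: (str(i) for i in range(3)), ""))   # ['0', '1', '2']
```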
The StopsConstraint means generation ends once "User" or "Assistant" is generated (or, of course, a stop token is reached). The RegexConstraint ensures the generated text matches a regular expression pattern.
poem=CompletionConfig(
    llama_model,
    max_tokens=250,
    constraint=StopsConstraint("User|Assistant", include=False),
),
math=CompletionConfig(
    llama_model,
    constraint=RegexConstraint("[0-9]+", terminate_on_match=False),
    map_fn=int,
),
fin=lambda p: CompletionConfig(
    llama_model,
    constraint=RegexConstraint(rf"{p.completions.math}|16"),
),
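As a rough picture of what a stop constraint with include=False does to the final text, the sketch below trims a string at the first stop match using the standard re module. It assumes the stop argument behaves like a regular expression, which the User|Assistant alternation suggests but this post does not confirm; truncate_at_stop is a hypothetical helper, and a real constraint would halt generation rather than trim afterwards.

```python
import re

def truncate_at_stop(text: str, stop_pattern: str, include: bool = False) -> str:
    """Cut `text` at the first match of `stop_pattern`, keeping the match only if `include`."""
    match = re.search(stop_pattern, text)
    if match is None:
        return text  # no stop sequence appeared
    return text[: match.end()] if include else text[: match.start()]

poem = "Pandaman, friend of Nick,\nbamboo in hand.\nUser: thanks"
print(repr(truncate_at_stop(poem, "User|Assistant")))
# 'Pandaman, friend of Nick,\nbamboo in hand.\n'
```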
4. Mapping Function
We use a mapping function to transform the generated completion; here it simply casts the generated string to an int. When we later access the completion with filled_in.completions.math.value, we will find it is an int.
map_fn=int,
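The interplay between the digits-only pattern, the int mapping, and the pattern built later for fin can be traced by hand with the standard library. This is only an illustration with a hard-coded stand-in for the model's output; it does not call Keymaker, and the exact string that p.completions.math formats to is an assumption here.

```python
import re

raw = "15"  # stand-in for what a "[0-9]+" constrained completion might produce

# The constraint guarantees the raw text is all digits, so int() cannot fail.
assert re.fullmatch(r"[0-9]+", raw)
value = int(raw)  # the effect of map_fn=int

# Mirrors the pattern built in the fin lambda: the earlier answer, or 16.
fin_pattern = rf"{value}|16"
print(type(value).__name__, fin_pattern)  # int 15|16
```

Because the regex constraint already rules out non-numeric text, the cast is safe without any extra error handling.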
5. Generating Multiple Completions
Finally, we generate multiple completions for the "countdown" prompt by using a generator and creating a separate CompletionConfig object for each completion.
# p is the prompt passed to your function by Keymaker
countdown=lambda p: (
    CompletionConfig(
        llama_model,
        constraint=RegexConstraint("[0-9]"),
        map_fn=lambda s: f"{s}, ",
    )
    for _ in range(5)
),
At any point we could return a Stringable or stop completion based on the value of p: Prompt.
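The reason a generator fits here is that each component can be produced only after the previous one has been appended to the prompt. The sketch below imitates that flow in plain Python with a fake model; fake_complete and countdown_parts are hypothetical stand-ins, and nothing here calls Keymaker.

```python
def fake_complete(prompt_text: str) -> str:
    """Stand-in for a constrained model call: returns the next digit down."""
    last_digit = [ch for ch in prompt_text if ch.isdigit()][-1]
    return str(int(last_digit) - 1)

def countdown_parts(n: int):
    """Lazily yield one step per completion, like the generator in the example."""
    for _ in range(n):
        # Each step runs against the text accumulated so far, then applies
        # the same ", " suffix that map_fn adds in the example.
        yield lambda text: fake_complete(text) + ", "

text = "Assistant: 5, 4, "
for part in countdown_parts(4):
    text += part(text)
print(text)  # Assistant: 5, 4, 3, 2, 1, 0, 
```

The sketch uses four steps because 5 and 4 are already in the prompt and the fake model only counts down; with a real model, the loop could also inspect the text and break early.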
By using Keymaker's powerful and flexible features, we can control the output of our large language models with ease and precision.