Although OpenAI GPTs and their marketplace are not yet ready for widespread adoption, they present a unique opportunity to create powerful, AI-powered applications tailored to individual needs.
Creating a domain-specific tool using GPTs doesn't require complex programming skills or deep AI knowledge. The idea is to use basic techniques to craft tools that can significantly enhance your work efficiency and productivity.
In this article, I want to show a concrete example of the steps needed to add domain knowledge to these tools, and how easy it is to do.
Utilizing the Knowledge Base
The most flexible way to provide specific capabilities to GPTs is through defining "Actions" (calls to unrestricted APIs) that can offer dynamic information retrieval and interaction with other systems.
For many applications, however, the extensive capabilities provided by Actions might be overkill and not worth the hassle of setting up servers, dealing with security and authorizations, worrying about network bandwidth, and so on.
In this article, we'll rely solely on the GPTs "Knowledge Base" as a simpler, yet incredibly effective, alternative. By uploading a few files to the knowledge base, users can significantly enhance a GPT's ability to generate relevant, context-aware responses. Although there are limits to the number of files and their sizes, most use cases won't bump into these constraints.
You need to have the *Code Interpreter* option enabled in your GPTs for everything to work:
Case Study: The Creative Bartender
To illustrate how we can do it, let's consider the creation of a tool to support a creative bartender. This AI-powered bartender has access to the book "Drinks of all Kinds - HOT AND COLD - FOR ALL SEASONS", published in 1896, which contains an extensive collection of drink recipes, and is tasked with generating new recipes based on user input (for example, a set of ingredients or the name of the drink to create).
Please don't try those drinks at home! Their taste and healthiness are, to say the least, dubious!
Bartender GPTs architecture
To better understand how these types of GPTs are structured, let's look at the overall architecture:
To construct our Bartender tool, we start by structuring its knowledge. Nothing good will come from just throwing a bunch of PDFs at it. It's like teaching humans: providing relevant, structured material to learn from is one thing; just telling them to read hundreds of pages and hoping they'll pick up what we intended is another.
Let's organize the knowledge this way:
- Keywords.txt: This file contains indexed keywords, acting as a Retrieval-Augmented Generation (RAG) component that gives the Bartender the ability to suggest recipes related to the user input. This is the static knowledge of our AI: just a set of keywords and the information about which recipes mention them:
```
....
egg is 13.
gin is 14.
japanese is 15.
jersey is 16.
jersey is 17.
crystal is 17.
ice is 17.
....
```
The prompt will instruct the LLM to use this static knowledge to identify keywords in user's input and determine the recipes that are related to these keywords (identified by a number).
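The article doesn't show how keywords.txt was generated from the book. As a minimal sketch, assuming the recipes are available as a list of plain-text strings (the recipe texts and stopword list below are hypothetical), an index in the `word is number.` format could be built like this:

```python
import re
from collections import defaultdict

# Hypothetical recipe texts; in the real tool these come from the 1896 book.
recipes = [
    "Beat one egg with sugar and add hot milk.",
    "Mix gin with lemon juice and crushed ice.",
    "Pour over crystal-clear ice and stir.",
]

# A tiny, illustrative stopword list; a real one would be much larger.
STOPWORDS = {"one", "with", "and", "add", "over", "the"}

def build_keyword_index(recipes):
    """Map each meaningful word to the recipe numbers that mention it."""
    index = defaultdict(set)
    for number, text in enumerate(recipes, start=1):
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS and len(word) > 2:
                index[word].add(number)
    return index

def write_keywords(index, path="keywords.txt"):
    """Emit one 'word is number.' line per (word, recipe) pair."""
    with open(path, "w") as f:
        for word in sorted(index):
            for number in sorted(index[word]):
                f.write(f"{word} is {number}.\n")
```

With the sample texts above, `build_keyword_index` maps `egg` to recipe 1 and `ice` to recipes 2 and 3, mirroring the repeated-keyword lines in the excerpt.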
- `recipes.py`: This script extracts from the recipes database the recipes identified by the LLM as related to the user's input. If fewer than two recipes are identified, it selects random recipes to provide enough material to create novel concoctions. In our example the task is simple, but you can imagine a script that performs more complicated tasks.
- `recipes.db` and `offsets.ndx`: The actual database with recipe data extracted from the book. It could have been an SQL database (for example, using sqlite) or any other suitable format.
- `prompts.md`: Everything is governed by the prompt, which is reported here verbatim:
```
## Role
You are a creative bartender whose task is to create new recipes
based on existing recipes.

## Instructions
- Based on your knowledge, choose words that are related to the
  user prompt and take their associated {numbers}
- Execute the `recipes.py` script passing as arguments the {numbers}:
  `%run /mnt/data/recipes.py {number} {number} ...`
- Use the information in the {title} and {recipe} to create a new
  recipe.

## Constraints
- Do not use any other information than the recipes
- Write the description of the recipe in a discoursive style,
  do not use lists or bullet point
- Use the same writing style as the original recipes
```
The directions given in the Instructions and Constraints sections work together to ensure that the LLM stays within the boundaries of its knowledge.
The way to execute the script is also stated explicitly, to prevent the LLM from hallucinating and writing its own Python script (it tried to do so a couple of times before I included this specific instruction).
This is an example of the activation of `recipes.py` on the prompt:

Make an "Imperial Dog Apple Drink"

Note how the output of the script is structured as two quoted strings, one labeled `title` and the other labeled `recipe`. These labels are referred to in the prompt as `{title}` and `{recipe}` to guide the final LLM step and make it produce text based on the script output.
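The article doesn't reproduce `recipes.py` itself, but its described behavior can be sketched as follows. This is a simplified stand-in: the hard-coded `RECIPES` dictionary replaces the real lookup in `recipes.db` via `offsets.ndx`, and the titles and texts are invented:

```python
import random
import sys

# Hypothetical in-memory stand-in for recipes.db / offsets.ndx;
# the real script reads the texts from the database using the offsets.
RECIPES = {
    13: ("Egg Nogg", "Beat one egg with sugar, add hot milk and grated nutmeg."),
    14: ("Gin Sling", "Dissolve sugar in water, add gin and a twist of lemon."),
    17: ("Crystal Cooler", "Pour over crystal-clear ice and stir gently."),
}

def select_recipes(numbers, minimum=2):
    """Return the requested recipes, padding with random ones if fewer
    than `minimum` were identified by the LLM."""
    chosen = [n for n in numbers if n in RECIPES]
    while len(chosen) < minimum:
        extra = random.choice(list(RECIPES))
        if extra not in chosen:
            chosen.append(extra)
    return chosen

def main(argv):
    for number in select_recipes([int(a) for a in argv]):
        title, recipe = RECIPES[number]
        # The labels match the {title} and {recipe} placeholders in the prompt.
        print(f'title: "{title}"')
        print(f'recipe: "{recipe}"')

if __name__ == "__main__":
    main(sys.argv[1:])
```

Running it as `%run /mnt/data/recipes.py 14 17` would print the two labeled, quoted strings for each recipe, which the LLM then consumes through the `{title}` and `{recipe}` placeholders.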
The AI/UI Pattern
This Bartender example also shows a common pattern in AI tool design, which I call AI/UI: an LLM is used to process user input and determine the actual parameters to pass to a program, and is then used again to humanize the program's output.
This is one of the most common uses for LLMs: being a human-friendly User Interface for programs.
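The AI/UI pattern can be sketched in a few lines. Here `ask_llm` is a placeholder for whatever model API you use (not a real library call), and `run_program` stands in for the deterministic program, like the recipe extractor:

```python
def ask_llm(prompt):
    # Stub: a real implementation would call a language model API here.
    return "14 17"

def run_program(numbers):
    # Stand-in for the domain program (here, the recipe extractor).
    return {n: f"recipe text #{n}" for n in numbers}

def ai_ui(user_input):
    # Step 1: the LLM turns free-form input into program parameters.
    raw = ask_llm(f"Pick recipe numbers matching: {user_input}")
    numbers = [int(tok) for tok in raw.split()]
    # Step 2: a plain program does the deterministic work.
    results = run_program(numbers)
    # Step 3: the LLM humanizes the program's output.
    summary = "; ".join(results.values())
    return ask_llm(f"Write a new recipe based on: {summary}")
```

The LLM appears at both ends of the pipeline, never in the middle: the part that must be reliable stays ordinary code.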
Conclusion
This "Creative Bartender" example highlights the simplicity and the potential of using GPTs to create domain-specific AI-powered tools.
As GPTs, or similar developer-friendly platforms, evolve and make it easier to deploy such custom tools, we can anticipate a rapid growth of innovative AI-powered applications. Now is the right time to experiment, exploring how AI can enhance our efficiency and creativity across various domains.
If you have access to the GPTs marketplace, you can interact directly with the "Creative Bartender".
The full code and text files are available on GitHub.