After explaining in my previous article how to create a ChatBot with LibreChat and VertexAI, and concluding my series on How open is Generative AI?, I feel compelled to share this concise tutorial on setting up a Chatbot using only open-source components, including the model.
Hugging Face is an ideal starting point when considering open source models.
Introducing Hugging Face
Hugging Face is an open-source AI startup that focuses on developing and providing state-of-the-art natural language processing (NLP) models and APIs for various applications. Its primary goal is to make NLP more accessible and user-friendly for developers, researchers, and businesses by offering pre-trained models, libraries, and easy-to-use interfaces.
The organization shares several open-source models and libraries, in addition to offering cloud-based API services through its Model Hub. This allows users to deploy and use pre-trained models without worrying about infrastructure or deployment issues.
It functions as a collaborative platform where the AI community can share and reuse models, datasets, and code, resembling the "GitHub for AI."
Let's start by creating a free account and obtaining an associated access token to use Hugging Face APIs.
Selecting the model
Let's select a model for our ChatBot from the Model Hub... Well, there are almost 400,000 models to choose from! This is a testament to the dynamism of the AI community.
I chose Open-Assistant SFT-4 12B because it's a well-known fine-tuned model based on a foundational model by EleutherAI, licensed under Apache 2.0, and I appreciate the crowdsourced nature of the upstream Open Assistant project.
Accessing the Model with a simple API Call
Before integrating the Chatbot UI, we should ensure we can access the model with a simple Node.js API call. Hugging Face conveniently provides the code after selecting the Inference API in the Deploy menu.
We choose the JavaScript API, enable the Show API Token option, and copy the provided code.
async function query(data) {
  const response = await fetch(
    "https://api-inference.huggingface.co/models/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5",
    {
      headers: {
        Authorization: "Bearer <your-token-here>",
        "Content-Type": "application/json",
      },
      method: "POST",
      body: JSON.stringify(data),
    }
  );
  const result = await response.json();
  return result;
}

const input = { inputs: "What is Kriya Yoga?" };
query(input).then((response) => {
  console.log(JSON.stringify(response));
});
I added the "Content-Type": "application/json" header so the API correctly parses the JSON request body. Here's the result when executed with Node.js:
[{"generated_text":"What is Kriya Yoga?\n\nKriya Yoga is a spiritual practice that involves a set of techniques designed to help individuals"}]
The API and the model responded to my query by completing the text I provided. Note that the response is truncated and may require multiple calls to complete, but it suffices for our test. Now, let's focus on the UI.
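Before moving on, a quick aside on the truncation: one simple workaround is to feed each generated_text back as the next input, so the model keeps completing its own output. The sketch below only illustrates the idea; the round count is an arbitrary choice, the token is a placeholder, and error handling is minimal.

```javascript
const MODEL_URL =
  "https://api-inference.huggingface.co/models/OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5";

async function query(data) {
  const response = await fetch(MODEL_URL, {
    headers: {
      Authorization: "Bearer <your-token-here>", // placeholder token
      "Content-Type": "application/json",
    },
    method: "POST",
    body: JSON.stringify(data),
  });
  return response.json();
}

// Repeatedly feed the model its own output so it continues the text.
async function generateLong(prompt, rounds = 3) {
  let text = prompt;
  for (let i = 0; i < rounds; i++) {
    const result = await query({ inputs: text });
    // Stop if the API returns an error object (e.g. the model is still loading).
    if (!Array.isArray(result) || !result[0] || !result[0].generated_text) break;
    text = result[0].generated_text;
  }
  return text;
}

// Usage (requires a valid token):
// generateLong("What is Kriya Yoga?").then(console.log);
```

In a real application you would also want to stop once the model emits an end-of-text marker instead of looping a fixed number of rounds.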
Chat UI
Hugging Face also offers the HuggingChat application, allowing anyone to interact with some of the community's models.
The source code is available under the Apache 2.0 license on GitHub. It's a Svelte application that also uses a MongoDB database to store chat history.
Let's install and configure it:
git clone https://github.com/huggingface/chat-ui.git
cd chat-ui
We then start a MongoDB database in a Docker container:
docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
Next, we configure our API key and the MongoDB URL in a .env file:
vi .env
MONGODB_URL=mongodb://localhost:27017
HF_ACCESS_TOKEN=<your-token-here>
After installing Node.js dependencies, we start the application:
npm install
npm run dev
The Chatbot is now up and running, accessible at http://localhost:5173/.
We notice the default model in Chat UI is OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 (that might be the real reason why I selected it earlier...). The conversation history is stored, and responses are not truncated. A dark theme is available as well.
Further customization is possible by modifying the .env.local file to change the model and other settings. For instance:
MONGODB_URL=mongodb://localhost:27017
HF_ACCESS_TOKEN=<your-token-here>
PUBLIC_ANNOUNCEMENT_BANNERS=
PUBLIC_APP_NAME=My ChatBot
PUBLIC_APP_COLOR=emerald
MODELS=`[
  {
    "name": "HuggingFaceH4/zephyr-7b-beta",
    "datasetName": "HuggingFaceH4/ultrachat",
    "description": "A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets",
    "websiteUrl": "https://huggingface.co/HuggingFaceH4/zephyr-7b-beta",
    "userMessageToken": "<|prompter|>",
    "assistantMessageToken": "<|assistant|>",
    "messageEndToken": "</s>",
    "preprompt": "Below are a series of dialogues between various people and an AI assistant. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
    "promptExamples": [
      {
        "title": "Write an email from bullet list",
        "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
      },
      {
        "title": "Code a snake game",
        "prompt": "Code a basic snake game in python, give explanations for each step."
      },
      {
        "title": "Assist in a task",
        "prompt": "How do I make a delicious lemon cheesecake?"
      }
    ],
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 1024
    }
  }
]`
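To make sense of the userMessageToken, assistantMessageToken, and messageEndToken settings: Chat UI uses them to flatten the conversation into a single prompt string for the model. The function below is only an illustrative sketch of that assembly, not the actual chat-ui implementation:

```javascript
// Illustrative sketch (not chat-ui's actual code) of how a chat history is
// flattened into a single prompt using the configured special tokens.
const model = {
  userMessageToken: "<|prompter|>",
  assistantMessageToken: "<|assistant|>",
  messageEndToken: "</s>",
  preprompt: "Below are a series of dialogues...\n-----\n",
};

function buildPrompt(messages, model) {
  // messages: [{ from: "user" | "assistant", content: "..." }, ...]
  const history = messages
    .map(
      (m) =>
        (m.from === "user" ? model.userMessageToken : model.assistantMessageToken) +
        m.content +
        model.messageEndToken
    )
    .join("");
  // Finish with the assistant token so the model generates the next reply.
  return model.preprompt + history + model.assistantMessageToken;
}

console.log(buildPrompt([{ from: "user", content: "Hi!" }], model));
// → "Below are a series of dialogues...\n-----\n<|prompter|>Hi!</s><|assistant|>"
```

This is also why each model entry needs its own tokens: fine-tuned models like Open Assistant or Zephyr only behave well when prompted with the exact special tokens they were trained on.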
With these changes, we have now personalized our ChatBot's model and behavior. This time I chose the brand-new zephyr-7b-beta model released by Hugging Face.
Let's stop and restart our Web application.
We now use the Zephyr model, and the previous chat history is still available. Furthermore, we get an explanation of the DPO method used to train the Zephyr-7B-β model.
For production, I would recommend deploying the models on dedicated, paid Inference Endpoints, hosted on infrastructure fully managed by Hugging Face.
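For reference, chat-ui can also target such a dedicated endpoint directly from the MODELS configuration via an endpoints entry. The fragment below is an assumption based on the chat-ui README at the time of writing; field names may differ between versions, and the URL is a placeholder:

```
MODELS=`[
  {
    "name": "HuggingFaceH4/zephyr-7b-beta",
    "endpoints": [
      {
        "url": "https://<your-endpoint>.endpoints.huggingface.cloud",
        "authorization": "Bearer <your-token-here>"
      }
    ]
  }
]`
```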
Final Thoughts
Building a fully open source ChatBot with Hugging Face components is not only feasible but also a testament to the vibrant open source community and accessible AI technology. By leveraging models from the Model Hub and using the Chat UI, anyone can create a sophisticated and customizable ChatBot.
The future is indeed open, and it's fascinating to see how these tools democratize access to advanced NLP capabilities. The open source DNA of sharing and collaboration is what drives innovation forward, and Hugging Face is at the forefront of this movement.
Feel free to experiment with different models and configurations to suit your specific needs. The possibilities are vast, and the only limit is your imagination.
Stay tuned, and keep exploring the exciting world of Generative AI!
Top comments (1)
Developing a ChatBot with Hugging Face is amazing! Thanks for the examples; they were clear and concise, and I'd love to add a few insights.
The Transformers library is a treasure trove for anyone delving into tasks like text classification, question-answering, summarization, translation, and beyond. Hugging Face doesn’t just supply the tools; it offers the means to innovate and push the boundaries of what’s possible in NLP.
There are three key features Hugging Face offers that simplify the process of working with ML data: Datasets, Models, and Spaces.
Pre-trained models can be used to perform many different tasks, such as the ones mentioned above.
So for anyone learning how to use Hugging Face, I recommend this article by my partner Nicolas Azevedo, which provides some good examples: scalablepath.com/machine-learning/... It also includes a nice example of Hugging Face used in the e-commerce industry.