DEV Community

Faiq Ahsan


Build your own AI ChatBot on your machine

By now everyone knows and loves ChatGPT, and generative AI has taken the world by storm. But did you know that you can now build and run your own custom AI chatbot on your machine?


YES! Let's take a look at the ingredients for this recipe.

Python

If you are someone looking to dig deep into AI/ML, you need to learn Python, the go-to programming language in this space. If you already know it, you are all set; otherwise, I would suggest going through a Python crash course or whatever suits you best. Also make sure that you have python3 installed on your system.

Ollama

Ollama is an awesome open source package that provides a handy and easy way to run large language models locally. We'll use it to download and run the 8B version of Llama3.

Gradio

Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it.

Okay, now let's start!

Step 1: Installing Ollama

Download and install the Ollama package on your machine. Once installed, run the command below to pull the Llama3 8B version.

ollama pull llama3

By default it downloads the 8B version. If you want to run another version, such as 70B, simply append the tag after the name, e.g. llama3:70b. Check out the complete list here.

Note: You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.

Step 2: Creating a custom model from Llama3

Open up a code editor, create a file named Modelfile, and paste the content below into it.

FROM llama3

## Set the Temperature

PARAMETER temperature 1

PARAMETER top_p 0.5

PARAMETER top_k 10

PARAMETER mirostat_tau 4.0

## Set the system prompt

SYSTEM """
You are a personal AI assistant named as Ultron created by Tony Stark. Answer and help around all the questions being asked.
"""


Parameters

Parameters dictate how your model responds.

temperature: The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)

top_p: Works together with top_k. A higher value (e.g. 0.95) leads to more diverse text, while a lower value (e.g. 0.5) generates more focused and conservative text. (Default: 0.9)

top_k: Reduces the probability of generating nonsense. A higher value (e.g. 100) gives more diverse answers, while a lower value (e.g. 10) is more conservative. (Default: 40)

mirostat_tau: Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)

Check out all the available parameters and their purpose here.
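To build intuition for what temperature does, here is a small illustrative sketch (plain Python, not Ollama code) of how dividing raw token scores by a temperature reshapes a sampling distribution:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature.

    A low temperature sharpens the distribution (the top token dominates,
    so answers are more predictable); a high temperature flattens it
    (more randomness, so answers feel more creative).
    """
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 0.5))  # sharp: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flat: tokens are more even
```

The same intuition applies to top_k and top_p, which instead cut off the low-probability tail before sampling.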

System prompt

Here you can play around and give any name and personality to your chatbot.

Now let's create the custom model from the Modelfile by running the commands below. Provide a name of your choice, e.g. ultron.

ollama create ultron -f ./Modelfile
ollama run ultron

You should see ultron running and ready to accept an input prompt. Ollama also exposes a REST API for running and managing models, so once your model is running it is also available at the endpoint below:

http://localhost:11434/api/generate

We will be using this api to integrate with our Gradio chatbot UI.
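If you want a quick sanity check of the API before wiring up the UI, here is a minimal sketch using only the standard library (it assumes your custom model is named ultron, as created above):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_body(model, prompt):
    """Build the JSON body the /api/generate endpoint expects."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(model, prompt):
    """POST a prompt to the local Ollama server and return the answer text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_body(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

# With the Ollama server running, try:
# print(ask("ultron", "Who created you?"))
```

Setting "stream" to False returns a single JSON object instead of a stream of partial responses, which keeps the client code simple.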

Step 3: Creating the chatbot UI

Initialize a python virtual environment by running the below commands.

python3 -m venv env
source env/bin/activate

Now install the required packages

pip install requests gradio

Now create a Python file app.py and paste the code below into it.

import requests
import json
import gradio as gr

model_api = "http://localhost:11434/api/generate"

headers = {"Content-Type": "application/json"}

history = []


def generate_response(prompt):
    # Keep a running history so the model sees earlier prompts as context
    history.append(prompt)
    final_prompt = "\n".join(history)
    data = {
        "model": "ultron",
        "prompt": final_prompt,
        "stream": False,
    }
    response = requests.post(model_api, headers=headers, data=json.dumps(data))
    if response.status_code == 200:  # request succeeded
        return response.json()["response"]
    else:
        print("error:", response.text)
        return f"Error: {response.text}"

interface = gr.Interface(
    title="Ultron: Your personal assistant",
    fn=generate_response,
    inputs=gr.Textbox(lines=4, placeholder="How can I help you today?"),
    outputs="text",
)
interface.launch(share=True)

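One limitation of the simple history in app.py is that it stores only your prompts, not the model's replies. A small illustrative sketch (not tied to any library) of a fuller transcript format:

```python
def format_transcript(turns):
    """Join (speaker, text) pairs into one prompt string so the model
    also sees its own earlier answers as context."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)

turns = [
    ("User", "What's your name?"),
    ("Assistant", "I am Ultron."),
    ("User", "Who created you?"),
]
print(format_transcript(turns))
```

Feeding the assistant's earlier answers back in helps keep multi-turn conversations coherent.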

Now let's launch the app: run your Python file with python3 app.py and your chatbot will be live at the endpoint below (or similar). Please note that the response time may vary depending on your system's computing power.

http://127.0.0.1:7860/


There you have it! Your own chatbot running locally on your machine; it will keep working even if you turn off the internet. Please share in the comments what other cool apps you are building with AI models.
