Generative chatbots are revolutionizing human-computer interaction by allowing machines not only to respond to predefined commands but also to generate contextually relevant and coherent responses. This technological advancement has found applications in a variety of sectors, from online customer service to virtual assistance on educational and business platforms. The ability of chatbots to understand natural language and generate near-human responses is transforming the way we interact with technology, offering scalable and efficient solutions for automated communication.
Chatbot implementation in Flask
Initial configuration:
In this snippet, the necessary libraries are imported and the DistilGPT-2 model (a lighter, distilled variant of GPT-2) and its tokenizer are loaded.
from flask import Flask, request, jsonify
from transformers import GPT2LMHeadModel, GPT2Tokenizer
print("Starting application...")
app = Flask(__name__)
# Load the model and the tokenizer
print("Loading model and tokenizer...")
tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
model = GPT2LMHeadModel.from_pretrained("distilgpt2")
print("Model and tokenizer loaded.")
- Flask and other libraries are imported to configure the web server.
- GPT2LMHeadModel and GPT2Tokenizer are imported from transformers to load the GPT-2 model and its tokenizer.
- A Flask instance (app) is initialized to create the web application.
Function to generate responses:
This snippet defines the generate_response function that uses the GPT-2 model to generate responses based on a given prompt.
def generate_response(prompt):
    print(f"Generating response for prompt: {prompt}")
    inputs = tokenizer.encode(prompt, return_tensors="pt")
    outputs = model.generate(
        inputs,
        max_length=50,
        num_beams=5,
        early_stopping=True,
        no_repeat_ngram_size=2,
        do_sample=True,  # required: temperature/top_k/top_p are ignored without sampling
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; avoids a warning
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    print(f"Generated response: {response}")
    return response
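The temperature, top_k, and top_p arguments shape the token distribution at each generation step. As a rough, self-contained illustration of that filtering (toy logits and a five-token vocabulary, all made up — no model involved):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def filter_top_k_top_p(probs, top_k=50, top_p=0.95):
    """Keep the top_k most likely tokens, then trim to the smallest set
    whose cumulative probability reaches top_p (nucleus filtering)."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append(idx)
        cumulative += p
        if cumulative >= top_p:
            break
    return kept

# Toy vocabulary of 5 tokens with made-up logits
probs = softmax([2.0, 1.0, 0.5, 0.1, -1.0], temperature=0.7)
print(filter_top_k_top_p(probs, top_k=3, top_p=0.95))  # → [0, 1, 2]
```

Sampling then draws the next token from the surviving candidates, which is why higher temperature and larger top_p produce more varied (and riskier) text.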
- generate_response takes a prompt as input and encodes it using the tokenizer.
- model.generate generates a response based on the prompt using parameters such as max_length to limit the length, num_beams to improve quality, temperature to control creativity, among others.
- The result is decoded using the tokenizer to obtain the final response.
Flask API for the chatbot:
In this snippet, we configure a POST /chatbot route that accepts JSON requests with a prompt, generates a response using the generate_response function, and returns the generated response as JSON.
@app.route('/chatbot', methods=['POST'])
def chatbot():
    try:
        print("Received request.")
        body = request.get_json()
        prompt = body.get('prompt', '')
        response = generate_response(prompt)
        return jsonify({'response': response})
    except Exception as e:
        print(f"Error: {str(e)}")
        return jsonify({'message': 'Internal server error', 'error': str(e)}), 500

if __name__ == '__main__':
    print("Starting Flask server...")
    app.run(debug=True)
- @app.route('/chatbot', methods=['POST']) defines a /chatbot route that accepts POST methods to receive JSON requests.
- chatbot() is the function that handles POST requests, gets the JSON request body prompt, generates a response using generate_response, and returns the generated response as JSON.
- In case of errors, they are captured and handled by returning an error message with status code 500.
Testing with CURL
- First question:
curl -X POST http://127.0.0.1:5000/chatbot -H "Content-Type: application/json" -d "{\"prompt\": \"Can you tell me about the weather today?\"}"
- Second question:
curl -X POST http://127.0.0.1:5000/chatbot -H "Content-Type: application/json" -d "{\"prompt\": \"hello?\"}"
- Third question:
curl -X POST http://127.0.0.1:5000/chatbot -H "Content-Type: application/json" -d "{\"prompt\": \"how are you?\"}"
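The same requests can be issued from Python using only the standard library — a small client sketch that assumes the local Flask server above is running on the default port:

```python
import json
import urllib.request

def build_request(prompt, url="http://127.0.0.1:5000/chatbot"):
    """Build a POST request carrying the prompt as a JSON body."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask_chatbot(prompt):
    """Send the prompt and return the 'response' field of the JSON reply."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]

# Example (with the server running):
#   ask_chatbot("Can you tell me about the weather today?")
```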
Next Steps:
We should always try to deploy projects of this type in the cloud, whether on AWS or Azure. In this case, here is an illustrative diagram of what the deployment would look like using AWS Lambda, AWS API Gateway, and AWS S3.
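As a hedged sketch of the Lambda side of that deployment — the event shape assumes API Gateway's proxy integration, and generate_response here is only a stub standing in for the model call (in practice the weights would be loaded from S3 or bundled with the function):

```python
import json

def generate_response(prompt):
    """Placeholder for the real model call used in the Flask version."""
    return f"Echo: {prompt}"

def lambda_handler(event, context):
    """Handle a POST forwarded by API Gateway, mirroring the /chatbot route."""
    try:
        body = json.loads(event.get("body") or "{}")
        prompt = body.get("prompt", "")
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"response": generate_response(prompt)}),
        }
    except Exception as e:
        return {
            "statusCode": 500,
            "body": json.dumps({"message": "Internal server error", "error": str(e)}),
        }
```

The try/except mirrors the Flask route: malformed JSON in the request body surfaces as a 500 with the error message rather than an unhandled crash.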
Conclusion:
The combination of GPT-2 and Flask allows the development of chatbots for educational purposes, capable of interacting naturally and effectively with users. These technological advances are rapidly transforming how we think about human-computer interaction, offering scalable and efficient solutions for a variety of applications, from customer service to AI-assisted education, especially as we move to more advanced models such as GPT-4 or GPT-4o. As we continue to explore and refine these tools, it is clear that generative chatbots represent a significant step toward smarter, more adaptive systems in the digital future.