DEV Community

Jethro Larson
Streaming ChatGPT API responses with Python and JavaScript


It took me a while to figure out how to get a Python Flask server and a web client to support streaming OpenAI completions, so I figured I'd share.

from flask import Flask, stream_template, request, Response
import openai
from dotenv import load_dotenv
import os

# put these values in an .env file parallel to this file
load_dotenv()
openai.organization = os.environ.get("OPENAI_ORG")
openai.api_key = os.environ.get('OPENAI_API_KEY')

def send_messages(messages):
    return openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # any chat model works; adjust as needed
        messages=messages,
        stream=True,  # yield the completion incrementally instead of all at once
    )

app = Flask(__name__)

@app.route('/chat', methods=['GET', 'POST'])
def chat():
    if request.method == 'POST':
        messages = request.json['messages']
        def event_stream():
            for line in send_messages(messages=messages):
                text = line.choices[0].delta.get('content', '')
                if len(text): 
                    yield text

        return Response(event_stream(), mimetype='text/event-stream')
    else:
        return stream_template('chat.html')

if __name__ == '__main__':
    app.run()
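A note on what `event_stream` is iterating over: with `stream=True`, the API yields chunks whose `delta` may or may not carry a `content` fragment. Here's a minimal sketch with hand-built stand-in deltas (hypothetical data, no API call) showing the same extraction the server loop does:

```python
# Stand-in deltas mimicking a streamed chat completion
# (hypothetical data -- a real stream yields chunk objects, not bare dicts)
deltas = [
    {"role": "assistant"},   # the first chunk typically has no content
    {"content": "Hello"},
    {"content": ", world"},
    {},                      # the final chunk is empty
]

def extract_text(deltas):
    # Same pattern as event_stream: keep only non-empty content fragments
    for delta in deltas:
        text = delta.get("content", "")
        if len(text):
            yield text

print("".join(extract_text(deltas)))  # -> Hello, world
```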


<!DOCTYPE html>
<html>
  <body>
    <form id="chat-form">
      <label for="message">Message:</label>
      <input type="text" id="message" name="message">
      <button type="submit">Send</button>
    </form>
    <div id="chat-log"></div>
    <script src="{{ url_for('static', filename='chat.js') }}"></script>
  </body>
</html>

You can't use EventSource for this if you want to use the POST method, so this uses the fetch API instead.


const form = document.querySelector("#chat-form");
const chatlog = document.querySelector("#chat-log");

form.addEventListener("submit", async (event) => {
  // Keep the form submission from reloading the page
  event.preventDefault();

  // Get the user's message from the form
  const message = form.elements.message.value;

  // Send a request to the Flask server with the user's message
  const response = await fetch("/chat", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ messages: [{ role: "user", content: message }] }),
  });

  // Create a new TextDecoder to decode the streamed response text
  const decoder = new TextDecoder();

  // Get a reader for the response body stream
  const reader = response.body.getReader();
  let chunks = "";

  // Read the response stream as chunks and append them to the chat log
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks += decoder.decode(value);
    chatlog.innerHTML = chunks;
  }
});

Obviously this is not an optimal chat user experience, but it'll get you started.
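Before wiring up the browser, you can sanity-check a streaming route with Flask's built-in test client. This is a sketch rather than the article's server: `fake_completion` is a hypothetical stand-in generator, so no OpenAI key is needed:

```python
from flask import Flask, Response

app = Flask(__name__)

# Stand-in for the OpenAI stream (hypothetical data)
def fake_completion():
    for piece in ["streamed ", "one ", "chunk ", "at a time"]:
        yield piece

@app.route("/chat", methods=["POST"])
def chat():
    # Same shape as the real route: a generator wrapped in a Response
    return Response(fake_completion(), mimetype="text/event-stream")

client = app.test_client()
resp = client.post("/chat", json={"messages": []})
print(resp.get_data(as_text=True))  # the concatenated chunks
```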

Top comments (2)

mercm8 • Edited

I tried this and it worked great running on localhost, but when I tried deploying it to my makeshift webserver (rpi / nginx) it stopped streaming and waited for the response stream to finish before the message appeared. Any idea why?

edit: I needed to add 'X-Accel-Buffering' = 'no' to response headers, changing the code to
response = Response(event_stream(), mimetype='text/event-stream')
response.headers['X-Accel-Buffering'] = 'no'
return response

Brayden Moore

Exactly what I was looking for. Also dig the way you write code. E.g. one app route with an if/else rather than one for POSTs and another just to display the template. Nice.