Building AI-Powered Apps with SvelteKit: Managing HTTP Streams from Ollama Server

Svelte 5 has taken modern web development by storm with its elegant, declarative approach to building user interfaces. Paired with Ollama, a lightweight server for running large language models locally, SvelteKit makes it straightforward to consume HTTP streams in dynamic web apps. This article demonstrates how to use Svelte's reactivity alongside Ollama's API to build an interactive application that renders AI responses in real time.

In this article, we'll walk through the example code and dissect its essential parts.

Before diving into the technical integration of SvelteKit and Ollama, let’s cover the initial setup process. If you’re new to either technology, don’t worry—this step-by-step guide will help you get started quickly.

Installing SvelteKit

SvelteKit is a modern framework for building fast and interactive web applications. It offers powerful tools, including reactive stores, server-side rendering, and API routes, making it an excellent choice for integrating with AI-powered APIs like Ollama.

Make sure you have Bun or Node.js installed on your machine.

Open a terminal and create a new SvelteKit project:

bunx sv create sveltekit-ollama

Then, follow the prompts to configure your project:

  • Which template would you like? SvelteKit minimal
  • Add type checking with TypeScript? Yes, using TypeScript syntax
  • What would you like to add to your project? none
  • Which package manager do you want to install dependencies with? bun

Then navigate into the project directory:

cd sveltekit-ollama

Start the development server:

bun dev

You should see your SvelteKit app running at http://localhost:5173 (or a similar port). This confirms that your SvelteKit environment is ready.

Installing Ollama

Ollama is a tool for running large language models (LLMs) locally and exposing them through an HTTP API. To use it on your machine, you need to install Ollama and download a model such as Llama.

Installation Steps:

  1. Visit the Ollama website and download the appropriate installer for your operating system (macOS, Linux, or Windows).

  2. Once downloaded, follow the installation instructions specific to your platform.

  3. Verify that Ollama is installed by running the following command in your terminal: ollama --version. You should see the version number printed, confirming that Ollama is installed correctly.


Downloading the Llama model

The Llama model is one of the popular open-source large language models supported by Ollama. To use it, you’ll need to download and configure it.

Open your terminal and run:

ollama pull llama3.2

Replace llama3.2 with the model you want to use (if different). You can explore other available models on the Ollama website.

Confirm that the model is downloaded successfully by listing the installed models:

ollama list

You should see llama3.2 or the model you downloaded in the list.

Starting the Ollama API server

Ollama provides a local API server for interacting with the model. If you installed the desktop app, the server may already be running in the background; otherwise, start it with:

ollama serve

This will run the server on http://localhost:11434 by default. You can send prompts to the server using an HTTP client, such as fetch() in your SvelteKit app.
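
To confirm the endpoint works before wiring it into SvelteKit, you can call it directly from a small script. The following is a minimal sanity check in TypeScript (the file name quick-check.ts is just an example); it assumes the server is running on the default port and that the llama3.2 model has already been pulled. Setting stream to false asks Ollama to return a single JSON object instead of a stream.

// quick-check.ts: a minimal sanity check for the local Ollama API.
// Assumes the server runs on http://localhost:11434 and llama3.2 is pulled.
const response = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
        model: "llama3.2",
        prompt: "Say hello in one short sentence.",
        stream: false, // return one JSON object instead of a stream
    }),
});

const data = await response.json();
console.log(data.response); // the generated text

Run it with bun run quick-check.ts: if a short greeting is printed, both the server and the model are ready.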

Next steps: connecting SvelteKit and Ollama

Now that both SvelteKit and Ollama are set up, you’re ready to integrate them. The next section will cover:

  • Managing the application state with Svelte 5's $state rune.
  • Sending HTTP POST requests to the Ollama API server.
  • Streaming and processing the AI-generated responses.

Continue following the article for details on the full implementation.

Understanding the code

Create the src/routes/+page.svelte file with this code:

<script lang="ts">
import { page } from "$app/stores";

let text = $state("");
let status = $state("");
let statusInvalid = $state(false);
let question = $state(
    $page.url.searchParams.get("question") ??
        "Tell me something (a quote) positive and inspirational in traditional inspirational quote with code. Give me only the quote and the author (if exists)",
);

async function resetData() {
    question = "";
    text = "";
    status = "";
    statusInvalid = false;
}

async function translateData(language: string) {
    const myquestion = `Help me to translate this text into ${language} language: \n${question}`;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

async function reviewData() {
    const myquestion = `Help me to review this text in a better english form, provide me only the reviewed text: \n${question}`;
    text = "";
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}
async function readData() {
    text = "";
    const myquestion = question;
    status = "";
    statusInvalid = false;
    askQuestion(myquestion);
}

async function askQuestion(myquestion: string) {
    try {
        if (question === "") {
            throw new Error("Question is empty");
        }
        const url = "http://localhost:11434/api/generate";
        const response = await fetch(url, {
            method: "POST",
            body: JSON.stringify({
                model: "llama3.2",
                prompt: myquestion,
            }),
        });
        if (!response.ok) {
            throw new Error(
                `HTTP error! Status: ${response.status} ${response.statusText}`,
            );
        }
        if (!response.body) {
            throw new Error("Readable stream not found in the response.");
        }
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        while (true) {
            const { done, value } = await reader.read();
            if (done) {
                status = "";
                statusInvalid = false;
                return;
            }
            // Ollama streams newline-delimited JSON: one chunk can carry
            // more than one JSON object, so decode and parse line by line.
            const mystring = decoder.decode(value, { stream: true });
            for (const line of mystring.split("\n")) {
                if (line.trim() === "") continue;
                const myresponse = JSON.parse(line);
                console.log(myresponse.response);
                text = text + myresponse.response;
            }
        }
    } catch (error: unknown) {
        if (error instanceof Error) {
            status = error.message;
            statusInvalid = true;
            console.error("An error occurred:", error.message);
        } else {
            console.error("An unknown error occurred:", error);
        }
    }
}
</script>

<main class="container">
    <form>
        <div class="grid">
          <div>
              <textarea
                  bind:value={question}
                  name="question"
                  placeholder="Write your question to Robertito AI"
                  aria-label="Professional short bio"
                  aria-invalid="{ statusInvalid }"
                  aria-describedby="invalid-helper"
              ></textarea>
              <small id="invalid-helper">{ status }</small>

        </div>
          <div>
        <textarea bind:value={text}></textarea>
              </div>
        </div>

        <div role="group">
            <button class="lg" onclick={() => resetData()}> Reset </button>
            <button class="lg" onclick={() => reviewData()}>
                Review the text
            </button>
            <button class="lg" onclick={() => translateData("Italian")}>
                Translate to 🇮🇹
            </button>
            <button class="lg" onclick={() => translateData("English (British)")}>
                Translate to 🇬🇧
            </button>
            <button class="danger lg" onclick={() => readData()}>
                Ask me!
            </button>
        </div>


    </form>
</main>

<style>
    textarea {
        width: 100%;
        height: 40vh;
    }
</style>


The provided SvelteKit code demonstrates a web app that interacts with an Ollama server to send prompts and receive AI-generated text responses. Here's a breakdown of the key parts.

Reactive state variables with $state

Svelte 5's $state() rune declares reactive state variables, simplifying the management of mutable data that affects your UI.

let text = $state("");
let status = $state("");
let statusInvalid = $state(false);
let question = $state(
    $page.url.searchParams.get("question") ??
        "Tell me something (a quote) positive and inspirational in traditional inspirational quote with code. Give me only the quote and the author (if exists)",
);

Each variable is initialized with a default value, and any changes automatically update the UI where the variable is used. This eliminates the need for explicit event emitters or update calls.
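
If you are new to Svelte 5 runes, here is a minimal, standalone sketch (not part of the app above) that shows the same idea in isolation: reassigning a $state variable is enough to re-render any markup that reads it.

<script lang="ts">
    // Minimal $state sketch: reassigning `count` re-renders the markup below.
    let count = $state(0);
</script>

<button onclick={() => (count = count + 1)}>
    Clicked {count} times
</button>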

Key Actions:

  • resetData(): Resets all state variables.
  • translateData(language): Translates question into the specified language.
  • reviewData(): Asks for a language review of the question.
  • readData(): Sends the question to the server for AI processing.

Handling HTTP streaming from Ollama

The askQuestion() function manages the interaction with Ollama's API, utilizing a streaming HTTP response. Here's how it works:

async function askQuestion(myquestion: string) {
  try {
    if (question === "") {
      throw new Error("Question is empty");
    }
    const url = "http://localhost:11434/api/generate";
    const response = await fetch(url, {
      method: "POST",
      body: JSON.stringify({
        model: "llama3.2",
        prompt: myquestion,
      }),
    });
    if (!response.ok) {
      throw new Error(
        `HTTP error! Status: ${response.status} ${response.statusText}`,
      );
    }
    if (!response.body) {
      throw new Error("Readable stream not found in the response.");
    }
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { done, value } = await reader.read();
      if (done) {
        status = "";
        statusInvalid = false;
        return;
      }
      // Ollama streams newline-delimited JSON: one chunk can carry
      // more than one JSON object, so decode and parse line by line.
      const mystring = decoder.decode(value, { stream: true });
      for (const line of mystring.split("\n")) {
        if (line.trim() === "") continue;
        const myresponse = JSON.parse(line);
        console.log(myresponse.response);
        text = text + myresponse.response;
      }
    }
  } catch (error: unknown) {
    if (error instanceof Error) {
      status = error.message;
      statusInvalid = true;
      console.error("An error occurred:", error.message);
    } else {
      console.error("An unknown error occurred:", error);
    }
  }
}

How It Works:

  1. Input validation: ensures the question isn't empty.
  2. API call: sends a POST request with the user's myquestion prompt.
  3. Response streaming: reads chunks of data (value) from the server as they arrive, decodes them with TextDecoder, and parses each newline-delimited JSON line in the chunk.
  4. Error Handling: captures potential errors in the HTTP call, parsing, or streaming process.

This incremental rendering provides a smooth experience where users see the response as it’s being generated.
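
One caveat worth knowing: Ollama streams newline-delimited JSON, and in rare cases a single JSON line can be split across two network chunks, which would make JSON.parse fail. If you want to be extra defensive, you can keep a small buffer between reads. The helper below is only a sketch (the name readNdjsonStream is mine, not part of the original code):

// A defensive variant: buffer partial lines so a JSON object split across
// two chunks is parsed only once the full line has arrived.
async function readNdjsonStream(
    body: ReadableStream<Uint8Array>,
    onPiece: (piece: string) => void,
) {
    const reader = body.getReader();
    const decoder = new TextDecoder();
    let buffer = "";
    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split("\n");
        buffer = lines.pop() ?? ""; // keep the (possibly partial) last line
        for (const line of lines) {
            if (line.trim() === "") continue;
            onPiece(JSON.parse(line).response);
        }
    }
}

Inside askQuestion() you would then call it with something like await readNdjsonStream(response.body, (piece) => (text = text + piece));.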

Interactive UI: reactive state in action

The UI is tightly coupled with the reactive state variables, ensuring any changes are immediately reflected. For instance:

<div class="grid">
    <div>
        <textarea
            bind:value={question}
            name="question"
            placeholder="Write your question to Robertito AI"
            aria-label="Professional short bio"
            aria-invalid="{ statusInvalid }"
            aria-describedby="invalid-helper"
        ></textarea>
        <small id="invalid-helper">{ status }</small>
    </div>
    <div>
        <textarea bind:value={text}></textarea>
    </div>
</div>

Here, the question and text states directly bind to the <textarea> elements, providing seamless two-way data binding. When the API response updates text, the changes instantly appear in the UI.

The buttons trigger state updates or server interactions:

<div role="group">
    <button class="lg" onclick={() => resetData()}> Reset </button>
    <button class="lg" onclick={() => reviewData()}>
        Review the text
    </button>
    <button class="lg" onclick={() => translateData("Italian")}>
        Translate to 🇮🇹
    </button>
    <button class="lg" onclick={() => translateData("English (British)")}>
        Translate to 🇬🇧
    </button>
    <button class="danger lg" onclick={() => readData()}>
        Ask me!
    </button>
</div>

Each button invokes a specific function, directly manipulating state or sending API requests, resulting in instant feedback to the user.

Enhancing error handling

Errors are displayed dynamically to the user using the status and statusInvalid states:

<small id="invalid-helper">{ status }</small>

This ensures users are informed of any issues, such as an empty prompt or server errors, without disrupting their workflow.

Conclusion

By combining Svelte 5's $state for state management with careful handling of streamed HTTP responses, this integration demonstrates the power of reactive programming for modern applications. The real-time feedback and incremental rendering ensure a delightful user experience, making it ideal for AI-driven applications powered by Ollama.

With further customization and optimization, this setup can form the foundation for a wide range of interactive, AI-powered web applications.
