Julien Prugne
Stop paying the OpenAI API for your chat app!

I love tinkering with AI, but I'm no AI farmer (those peeps are hot right now).

Piping everything through the OpenAI API kind of creeps me out,
because 'merica is known for spying and using populations' private data for imperialist skulduggery.
And foremost, 'cause I don't wanna pay! Oh yeah!

Solution:

  • Install Ollama
  • Choose a lightweight model
  • Call the local API
  • Make some suppositions on how to deploy

Install Ollama

You follow the link and do the install: THE Link (https://ollama.com/download).

Choose a lightweight model

I chose deepseek-r1:7b:

ollama pull deepseek-r1:7b

(ollama pull just downloads the weights; if you start chatting with ollama run instead, exit the prompt with /bye or Ctrl+D.)

My old laptop doesn't quite catch fire while running completions; feel free to use beefier models if your hardware can take it.

Pull as many as your disk space allows, if you want.
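
To see what's piling up on disk, the standard Ollama CLI has you covered:

ollama list

And ollama rm <model> frees the space back up if you over-pulled.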

Call the API

  • The Ollama daemon must be running: ollama serve
  • No need to be inside a model's prompt.
  • Detailed documentation: https://github.com/ollama/ollama/blob/main/docs/api.md
  • You can set any model you pulled in the model param.

Easy-to-use generate endpoint:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
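With "stream": false you get a single JSON object back; the generated text lives in its response field.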

OpenAI-style chat:

curl http://localhost:11434/api/chat -d '{
  "model": "deepseek-r1:7b",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'
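Heads-up: unlike the generate call above, this one doesn't set "stream": false, so Ollama streams the answer back as newline-delimited JSON chunks. Add "stream": false if you'd rather get one JSON object with the whole message.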

And now, be a good softwarer and go implement it in your projects using the HTTP libs you usually use.
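
For example, in plain JavaScript with the built-in fetch, a minimal sketch might look like this (same model and question as above; Node 18+ running as an ES module, for top-level await):

// Minimal sketch: hit the local Ollama chat endpoint with fetch.
// Assumes the daemon is running on its default port, 11434.
const res = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "deepseek-r1:7b", // any model you pulled
    messages: [{ role: "user", content: "why is the sky blue?" }],
    stream: false, // one JSON reply instead of NDJSON chunks
  }),
});

const data = await res.json();
console.log(data.message.content); // the assistant's answer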

What? You're too lazy and just want to use the OpenAI client?
Well, of course you can! Documentation: https://github.com/ollama/ollama/blob/main/docs/openai.md

import OpenAI from "openai";

// Ollama serves an OpenAI-compatible API under /v1.
const client = new OpenAI({
  baseURL: "http://localhost:11434/v1",
  apiKey: "ollama", // required by the client, ignored by Ollama
});
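Then chat as if you were talking to OpenAI (a quick sketch, reusing the question from above):

const response = await client.chat.completions.create({
  model: "deepseek-r1:7b", // any model you pulled
  messages: [{ role: "user", content: "Why is the sky blue?" }],
});

console.log(response.choices[0].message.content);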

Go play with that now!

#AI #OpenAI #Ollama #BreakFree #deepseek
