Aniket Dhakane

Running a local LLM (Ollama) in your Node.js project

We all love AI, and in recent years the boom in artificial intelligence has changed the world and is taking it into a new era. For almost any problem there is an AI use case, whether it is asking Gemini for a cooking recipe, ChatGPT for help with assignments, Claude for programming, or v0 for frontend design. Developers and students depend so heavily on AI these days that a new AI startup seems to emerge almost every day.

AI Startup meme

This leads aspiring developers like me to ask: how can I make something like this? The answer is in the picture above: API calls to these models. But they are not cheap, and an unemployed student like me has no means to pay for a subscription. This led to the idea of running the AI locally and then serving it on a port for API calls. This article gives you a step-by-step guide to setting up Ollama and accessing the LLMs from your Node.js code.

Installing Ollama

This step is for Windows users. If you are on another operating system, follow this guide.

  • Head over to Ollama, and download their installer.

Ollama download page

  • Once the download finishes, run the setup and install the application.

Ollama installer

  • This installs the client on your machine. Now you can head over to the library section of Ollama's official website to pick the model you want to use.

Ollama Models to choose from

  • Here, I'll be using codellama:7b on my machine.
  • Open CMD or PowerShell and run the command ollama run <model-name>. This downloads the model to your machine if it does not already exist there, and then runs it.

Serving the LLM on a port

  • Now you have Ollama on your system along with the required LLM, so the next step is to serve it on a port of your machine so that your Node app can access it.
  • Before proceeding, close Ollama if it is running in the background, then check whether Ollama's default port is free by running ollama serve. If this throws an error, the port is occupied. If it starts without an error, the server is already up and you can skip the port cleanup below.
  • You'll need to free that port before proceeding. The default port for Ollama is 11434.
  • Use the following command to check which process is using that port: netstat -ano | findstr :11434
  • Note down the PID from the result and use this command to free the port: taskkill /PID <PID> /F
  • Once done, open a new CMD terminal and run ollama serve again.
  • Now you'll see something like this, which means your LLMs are accessible through API calls.

Ollama server Stat
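
Before writing any application code, it can help to confirm from Node.js that the server is really reachable. Here is a minimal sketch, assuming Node.js 18 or newer (for the built-in fetch) and Ollama's default address http://localhost:11434. The helper names modelNames and listLocalModels are made up for this example. It calls Ollama's GET /api/tags endpoint, which lists the models available locally.

```typescript
// Assumption: Ollama is listening on its default address.
const OLLAMA_URL = 'http://localhost:11434';

// Only the part of the GET /api/tags response that this sketch reads.
interface TagsResponse {
  models: { name: string }[];
}

// Hypothetical helper: pull the model names out of a /api/tags response body.
function modelNames(tags: TagsResponse): string[] {
  return tags.models.map((model) => model.name);
}

// Hypothetical helper: ask the server which models it has.
// Rejects if the server is not running or answers with an error status.
async function listLocalModels(baseUrl: string = OLLAMA_URL): Promise<string[]> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) {
    throw new Error(`Ollama responded with HTTP ${res.status}`);
  }
  return modelNames((await res.json()) as TagsResponse);
}
```

Calling listLocalModels().then(console.log) while ollama serve is running should print a list that includes the model pulled earlier, such as codellama:7b. If the call rejects with a connection error, the server is most likely not running.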

Using the ollama npm package for request and response handling

  • Start your Node.js project by running the following commands:
npm init -y
npm i typescript ollama
npx tsc --init
  • This creates a project for you to start working in. First, open the tsconfig.json file, then uncomment these options and set their values:
"rootDir": "./src",
"outDir": "./dist",
  • Create a src folder and, inside it, create the index.ts file.
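import ollama from 'ollama';

// Send one chat message to the local Ollama server and print the reply.
async function main() {
    const response = await ollama.chat({
        // Must match a model you have already downloaded with `ollama run`.
        model: 'codellama:7b',
        messages: [
            {
                role: 'user',
                content: 'What color is the sky?'
            }
        ],
    });
    // The reply text is in response.message.content.
    console.log(response.message.content);
}

// Log failures (for example, the server not running) instead of leaving the promise unhandled.
main().catch((error) => {
    console.error('Request to Ollama failed:', error);
});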

  • Now, before running the code, edit the scripts in package.json:
"scripts": {
    "dev": "tsc -b && node dist/index.js"
  },
  • This script compiles the TypeScript code to JavaScript and then runs the result.
  • Run the application with the command npm run dev in the terminal.

VSCode SS

  • There you are: you can now access your local LLM from Node.js.
  • You can read more about the node package ollama here.
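
The chat call above sends a single message, and the server does not remember earlier requests, so a follow-up question only makes sense if you send the earlier messages again. Below is a rough sketch of that idea using plain HTTP instead of the npm package. It assumes Node.js 18 or newer (for the built-in fetch), Ollama's default address, and the codellama:7b model from this article. The helper names addMessage and ask are made up for this example.

```typescript
// Assumptions: Ollama's default address and the model pulled earlier in this article.
const CHAT_URL = 'http://localhost:11434/api/chat';
const MODEL = 'codellama:7b';

type Role = 'system' | 'user' | 'assistant';

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical helper: return a new history with one more message appended.
// The input array is left untouched.
function addMessage(history: ChatMessage[], role: Role, content: string): ChatMessage[] {
  return [...history, { role, content }];
}

// Hypothetical helper: send the whole history plus a new question,
// and return the history extended with the question and the model's reply.
async function ask(history: ChatMessage[], question: string): Promise<ChatMessage[]> {
  const messages = addMessage(history, 'user', question);
  const res = await fetch(CHAT_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // stream: false asks for one JSON object instead of a stream of chunks.
    body: JSON.stringify({ model: MODEL, messages, stream: false }),
  });
  if (!res.ok) {
    throw new Error(`Ollama responded with HTTP ${res.status}`);
  }
  const data = (await res.json()) as { message: ChatMessage };
  return addMessage(messages, data.message.role, data.message.content);
}
```

To use it, start with an empty history and feed each result into the next call: const first = await ask([], 'What color is the sky?'); followed by const second = await ask(first, 'Why is that?');. Because the full history is resent every time, long conversations will eventually exceed the model's context window, so older messages may need to be trimmed or summarized.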

Thank you for reading. I hope this article helps you, and if it did, feel free to connect on my socials!

Linkedin | Github

Top comments (15)

Sohail SJ | TheZenLabs

hey, it would be great to see chat input-output in terminal for better interaction with LLM as well as how to manage new context in token window.

Just a suggestion, great read, saving for future ref. happy coding!

Aniket Dhakane

Actually a great idea, I will look into how I can code that! Thanks!

Sohail SJ | TheZenLabs

Yes. Connect if you need any help setting that up. happy to give a helping hand!

Aniket Dhakane

Sent a connection req on LIn😸

Sohail SJ | TheZenLabs

Connected 🤝

Muhammad Ahsan

Is it for typescript only?
What can a JS developer do?

Aniket Dhakane

As Claudio said, you'll have to change the import statements.
Here's one stack overflow question similar to your situation - stackoverflow.com/questions/783506...

Sohail SJ | TheZenLabs

learn TS 😉

Claudio Medeiros

You need to change the "dev" command, removing the "tsc -b && " and then changing the way the lib is imported.

Dusan Petkovic

Very interesting, seems easy. How resource-intensive is it? What kind of Linux machine could host this?

Aniket Dhakane

The LLM I am using is codellama with 7 billion parameters, and my ancient GTX 1650, which has only 3.2GB of VRAM allocated to it, can handle it. As for the Linux distro, any that can handle GPU tasks would do, I guess.

Nishant Adhav

Thanks for information

Aniket Dhakane

Glad I could help!
