
Ollama: How to Easily Run LLMs Locally on Your Computer

Richa Parekh on June 26, 2025

I just found an interesting open-source tool called Ollama. It's a command-line application that lets you run Large Language Models (LLMs) on your ...

Solve Computer Science

Try the qwen2.5-coder model family. Yes, 4GB is insufficient to run anything useful. I'm trying qwen2.5-coder:14b-instruct-q2_K (so low quantization and a higher parameter count) and it's not bad at all. The speed and quality are decent, all things considered. You'll need about 20GB of RAM, however. Be aware that I got Chinese-only replies when running the 1.5B models of that family.
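For anyone following along, pulling and chatting with that exact tag looks roughly like this (the tag is the one named above; swap in a smaller one if ~20GB of RAM isn't available):

```bash
# Download the quantized 14B tag mentioned above (expect roughly a 20GB RAM footprint)
ollama pull qwen2.5-coder:14b-instruct-q2_K

# Start an interactive chat session with it
ollama run qwen2.5-coder:14b-instruct-q2_K
```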

Richa Parekh

Thanks for the tip! I'll definitely check out qwen2.5-coder.

Solve Computer Science

ollama.com/library/qwen2.5-coder/tags

You'll have to experiment with the smallest models at different quantization levels and avoid swapping to disk during inference.
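For example, a quick way to experiment is to pull one tag at a time and watch the memory footprint; if the loaded size is bigger than your free RAM, the machine will start swapping (the &lt;tag&gt; placeholder below stands for one of the tags listed on that page):

```bash
# See which models are already downloaded and how large they are
ollama list

# Pull a specific quantized tag from the page linked above (replace <tag> with a listed tag)
ollama pull qwen2.5-coder:<tag>

# While a model is loaded, check its memory use and the CPU/GPU split
ollama ps
```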

Richa Parekh

Thanks for the link! I’ll explore the smallest models and test their performance. Thank you for the suggestion!

Dotallio

Totally get the RAM struggle with local LLMs; I had a similar bottleneck running anything larger than a 3B model too.

Have you found any tricks to make chat-style workflows smoother in the CLI, or do you just keep it basic?

Richa Parekh

Since I'm still learning the concepts and getting an understanding of how everything works, I'm sticking to the basics for now.

Alexander Ertli

Hey,

Welcome to the genAI tech space.
There is nothing wrong with using smaller models; I resort to them all the time.

If you are interested, you could try a much smaller model like smollm2:135m or qwen:0.5b; they should be much more responsive on your hardware.

Also, Ollama typically tries to run models on the GPU, at least partially, if you have a compatible one.
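For example (ollama run downloads the model automatically the first time, and Ollama will offload layers to a compatible GPU on its own):

```bash
# A tiny model that should stay responsive even on 4GB of RAM
ollama run smollm2:135m

# A slightly larger alternative
ollama run qwen:0.5b
```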

I hope this helps.

Richa Parekh

Yes, I will check out the smaller models. Thanks for the useful advice.

Arindam Majumder

Ollama is great. You can also use Docker Model Runner for this.
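A rough sketch of the Docker Model Runner flow, assuming Docker Desktop with Model Runner enabled (ai/smollm2 is just an example model name from Docker Hub's ai namespace; exact commands can vary by Docker version):

```bash
# Download a model through Docker Model Runner
docker model pull ai/smollm2

# Chat with it interactively
docker model run ai/smollm2
```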

Richa Parekh

Yeah, Ollama is a valuable tool. Thanks for sharing.

Praveen Rajamani

Thanks for being clear about the hardware limits. Many people try to run local LLMs, thinking it will just work, then get frustrated when it is slow or crashes. Posts like this help save a lot of time and confusion.

Richa Parekh

Appreciate that! I'm glad the post was helpful.

Nathan Tarbert

This is extremely impressive; love how you documented the process and called out the RAM struggle directly. Makes me wanna try it on my old laptop now.

Richa Parekh

Thank you for the appreciation. Ollama is definitely worth a try.