How to Run DeepSeek R1 Locally (Using Ollama + ChatboxAI)

If you want to run DeepSeek R1 locally on your system, there's no need to worry. This guide walks you through it step by step, in a simple and easy-to-follow way, using Ollama and ChatboxAI.


πŸ–₯️ System Requirements (Based on GPU/RAM)

Each model size has different hardware requirements, so first check which one your system can support:

| Model | GPU Required | VRAM (GPU Memory) | RAM (System Memory) | Disk Storage |
| --- | --- | --- | --- | --- |
| DeepSeek R1 1.5B | No GPU / integrated GPU | 4GB+ | 8GB+ | 10GB+ |
| DeepSeek R1 7B | GTX 1650 / RTX 3050 | 6GB+ | 16GB+ | 30GB+ |
| DeepSeek R1 14B | RTX 3060 / RTX 4060 | 12GB+ | 32GB+ | 60GB+ |
| DeepSeek R1 32B | RTX 4090 / A100 | 24GB+ | 64GB+ | 100GB+ |
  • πŸ‘‰ If your system has a GTX 1650 or lower, you can only run DeepSeek R1 1.5B, or at most 7B.
  • πŸ‘‰ For 7B, at least 16GB of RAM is required.
  • πŸ‘‰ If your GPU is weaker than a GTX 1650 (or you only have an integrated GPU), stick to 1.5B to avoid crashes. Not sure how much VRAM you have? See the quick check below.
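
If you're not sure what GPU or how much VRAM your machine has, here's a quick check, assuming an NVIDIA GPU with the standard driver installed (nvidia-smi ships with the driver):

```
# Print the GPU model and total VRAM (NVIDIA GPUs only)
nvidia-smi --query-gpu=name,memory.total --format=csv
```

If this command doesn't exist, you're likely on an integrated GPU, so plan on the 1.5B model.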

βš™οΈ Step-by-Step Installation Guide

1️⃣ Install Ollama (the Local LLM Runner)

Ollama is a lightweight tool that runs LLMs (Large Language Models) locally. Install it first:

πŸ”— Ollama Installation Link

πŸ‘‰ For Windows Users:

  • Download the installer and run it (just click Next → Next).
  • Open CMD and verify the install:

```
ollama --version
```

If this prints a version number, the installation is complete.
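
The installer also starts the Ollama server in the background on port 11434. Assuming default settings, you can confirm it's up:

```
# Should print "Ollama is running" if the local server is up
curl http://localhost:11434
```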

πŸ‘‰ For Mac Users:

  • Download the macOS app from the Ollama site, or install it with Homebrew:

```
brew install ollama
```

πŸ‘‰ For Linux Users:

  • Open Terminal and run (the install script below is Linux-only):

```
curl -fsSL https://ollama.com/install.sh | sh
```

2️⃣ Download the DeepSeek R1 Model

Use the following command to pull the model (on Ollama, the R1 distills are published under the `deepseek-r1` name):

```
ollama pull deepseek-r1:7b
```

πŸ‘‰ If you want to run 1.5B instead of 7B, use:

```
ollama pull deepseek-r1:1.5b
```

⚠ This download may take some time depending on your internet speed. Once downloaded, you can run it using Ollama.
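
To confirm the model is actually on disk, list your local models:

```
# Lists downloaded models with their tags and sizes
ollama list
```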

3️⃣ Install ChatboxAI (Optional GUI for a Better Experience)

If you want a Graphical User Interface (GUI), ChatboxAI is a great tool for interacting with local AI models.

πŸ”— ChatboxAI Installation Link

Installation Steps:

  • Download the ChatboxAI desktop installer for your OS and run it (no Python or command line needed).
  • Open ChatboxAI's settings, choose Ollama as the model provider, and set the API host to http://localhost:11434 (Ollama's default).
  • Select the DeepSeek R1 model you pulled, and you're ready to chat.

Alternative: if you prefer a browser-based UI instead of ChatboxAI, text-generation-webui works too:

  • Ensure Python 3.10+ is installed.
  • Open Command Prompt (CMD) and run:

```
git clone https://github.com/oobabooga/text-generation-webui.git
cd text-generation-webui
pip install -r requirements.txt
```

  • Start the server:

```
python server.py
```

  • Open your browser, go to localhost:7860, and select your model.
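
ChatboxAI, for its part, just talks to Ollama's local HTTP API under the hood. If you're curious, or want to script against it, here's a minimal sketch of a direct request from a Unix-style shell, assuming the default port and the 1.5B model:

```
# One-off, non-streaming generation request against Ollama's REST API
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Write a haiku about local LLMs.",
  "stream": false
}'
```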

πŸš€ Running DeepSeek R1 (Final Step)

Once everything is installed, it’s time to run the model:

πŸ‘‰ Open CMD and run:

```
ollama run deepseek-r1:7b
```

πŸ‘‰ If 7B won't run on your hardware, try 1.5B:

```
ollama run deepseek-r1:1.5b
```

πŸ‘‰ If you are using ChatboxAI, just open the app (or your browser, for text-generation-webui) and interact with the model through the GUI.

Now you can use DeepSeek R1 for coding, AI chat, and optimizing your workflow! 😎πŸ”₯
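
Since `ollama run` also accepts a prompt as a command-line argument, you can fold the model into scripts. A tiny sketch, assuming the 1.5B model is already pulled:

```
# One-shot prompt: prints the answer to stdout and exits
ollama run deepseek-r1:1.5b "Explain recursion in one sentence."
```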

πŸ› οΈ Common Problems & Solutions

❌ 1️⃣ Model crashes due to low VRAM?
βœ” Try 1.5B instead of 7B.
βœ” Increase Windows Pagefile (Virtual Memory settings).

❌ 2️⃣ Model response is too slow?
βœ” Use SSD instead of HDD.
βœ” Close background applications.
βœ” Optimize RAM usage.
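
A common cause of slowness is the model not fitting in VRAM and spilling into system RAM. In recent Ollama versions you can check the GPU/CPU split of a loaded model (100% GPU is the ideal case):

```
# Shows loaded models and how they're split between GPU and CPU
ollama ps
```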

❌ 3️⃣ β€˜Command not found’ error in CMD?
βœ” Check that Ollama is installed correctly and on your PATH (see the check below).
βœ” If you're using text-generation-webui, ensure Python and its dependencies are installed.
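
If Ollama installed fine but CMD still can't find it, the executable probably isn't on your PATH. On Windows, this prints its location if the system can see it:

```
where ollama
```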

🀩 Conclusion

If you followed this guide correctly, you can now run DeepSeek R1 locally without relying on third-party APIs. This is a privacy-friendly and cost-effective solution, perfect for developers and freelancers.

If you face any issues, drop a comment, and you’ll get help! πŸš€πŸ”₯
