DEV Community

Cover image for Deploy Deepseek-R1: Guide to run multiple variants on AWS
Agam Jain
Agam Jain

Posted on

1

Deploy Deepseek-R1: Guide to run multiple variants on AWS

Hi Everyone

Deepseek-R1 is everywhere. So, we have done the heavy lifting for you to run each variant on the cheapest and highest-availability GPUs. All these configurations have been tested with vLLM for high throughput and auto-scale with the Tensorfuse serverless runtime.

Below is the table that summarizes the configurations you can run.

Supported GPU types for each variant of Deepseek R1<br>

Take it for an experimental spin

You can find the Dockerfile and all configurations in the GitHub repo below. Simply open up a GPU VM on your cloud provider, clone the repo, and run the Dockerfile.

Github Repo: https://github.com/tensorfuse/tensorfuse-examples/tree/main/deepseek_r1

Deploy a production-ready service on AWS using Tensorfuse

If you are looking to use Deepseek-R1 models in your production application, follow our detailed guide to deploy it on your AWS account using Tensorfuse.

The guide covers all the steps necessary to deploy open-source models in production:

  1. Deployed with the vLLM inference engine for high throughput
  2. Support for autoscaling based on traffic
  3. Prevent unauthorized access with token-based authentication
  4. Configure a TLS endpoint with a custom domain

AWS Q Developer image

Your AI Code Assistant

Implement features, document your code, or refactor your projects.
Built to handle large projects, Amazon Q Developer works alongside you from idea to production code.

Get started free in your IDE

Top comments (1)

Collapse
 
samagra_sharma_0b6d85c152 profile image
Samagra Sharma

Wow looks good !

Qodo Takeover

Introducing Qodo Gen 1.0: Transform Your Workflow with Agentic AI

Rather than just generating snippets, our agents understand your entire project context, can make decisions, use tools, and carry out tasks autonomously.

Read full post