Novita AI

Posted on • Originally published at blogs.novita.ai

Deepseek-R1 Now Available on Novita AI: A Strong Competitor to OpenAI o1

DeepSeek, a Chinese AI lab, has released an open-source version of DeepSeek-R1, a reasoning model that competes closely with OpenAI’s o1 on multiple benchmarks. Now available on Novita AI, this model is designed to handle complex tasks such as logical inference, mathematics, and programming, making it a versatile tool for developers and businesses.

What sets DeepSeek-R1 apart is its reasoning-first architecture, which allows it to self-check its outputs for accuracy. While this process may take longer than with traditional language models, the results are more reliable, especially for use cases in physics, mathematics, and other scientific domains.

This article explores DeepSeek-R1’s unique capabilities, its performance across benchmarks, and how it can be integrated into your workflows through Novita AI’s APIs.

What is DeepSeek-R1?

DeepSeek-R1 is an open-source reasoning model developed to tackle tasks requiring logical reasoning, advanced mathematics, and programming. It builds on the earlier DeepSeek-R1-Zero by combining reinforcement learning with supervised fine-tuning to improve output clarity and coherence.

Reasoning models like R1 are engineered to fact-check their outputs, reducing errors that often occur in traditional language models. This makes them vital for domains requiring a high level of accuracy and transparency, such as research, education, and decision-making.

What makes DeepSeek-R1 particularly competitive and attractive is its open-source nature. The model is available under an MIT license, enabling unrestricted commercial use. Unlike proprietary models, this approach allows developers and researchers to fully explore its architecture, modify it to suit specific needs, and deploy it across various workflows.

How Was DeepSeek-R1 Developed?

This section examines how DeepSeek-R1 was created, starting with its predecessor, DeepSeek-R1-Zero.

DeepSeek-R1-Zero

DeepSeek-R1 began with R1-Zero, a model trained entirely through reinforcement learning. This approach allowed the model to develop strong reasoning capabilities, but it came with several challenges:

  • The outputs were often difficult to read.

  • The model sometimes mixed languages within its responses, making it less user-friendly.

These limitations made R1-Zero impractical for real-world applications despite its logical soundness.

Challenges of Pure Reinforcement Learning

The reliance on pure reinforcement learning led to outputs that were logically valid but poorly structured. Without the guidance of supervised data, the model struggled to effectively communicate its reasoning. This lack of clarity was a barrier for users who required precision and transparency in results.

Improvements with DeepSeek-R1

To overcome these challenges, DeepSeek adopted a hybrid approach when developing R1. By combining reinforcement learning with supervised fine-tuning, the team incorporated curated datasets to enhance the model’s readability and coherence. This change addressed critical issues:

  • Language mixing was significantly reduced.

  • Fragmented reasoning was improved, resulting in clearer outputs.

These advancements made DeepSeek-R1 a more practical and reliable tool for real-world applications.

DeepSeek-R1 Performance Benchmark

DeepSeek-R1 excels in math, achieving top scores of 97.3% on MATH-500 and 79.8% on AIME 2024, outperforming competitors. In coding, it stands out with 49.2% on SWE-bench Verified and 65.9% on Live Code Bench, showcasing its well-rounded expertise across both domains.

DeepSeek-R1 benchmark results

All models are evaluated with a maximum generation length of 32,768 tokens. Benchmark results are reported as pass@1, computed with temperature 0.6, top-p 0.95, and 64 sampled responses per query; a short illustration of this metric follows.
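
As a rough illustration of that metric, the sketch below averages per-sample correctness over the k sampled responses for each query, which is one common way to report pass@1 when multiple responses are drawn. The is_correct checker and the data layout are placeholders for this sketch, not part of DeepSeek’s or Novita AI’s tooling.

from typing import Callable, Sequence

def pass_at_1(samples_per_query: Sequence[Sequence[str]],
              references: Sequence[str],
              is_correct: Callable[[str, str], bool]) -> float:
    """Average per-sample accuracy over all queries.

    samples_per_query[i] holds the k sampled responses for query i
    (k = 64 in the setup described above), and references[i] is the
    ground-truth answer consumed by the is_correct checker.
    """
    per_query = []
    for samples, ref in zip(samples_per_query, references):
        correct = sum(is_correct(s, ref) for s in samples)
        per_query.append(correct / len(samples))
    return sum(per_query) / len(per_query)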

Try DeepSeek-R1 Demo Now

DeepSeek-R1-Distill Models

Distillation, or knowledge distillation, is a machine learning method that transfers knowledge from a larger model to a smaller one. The aim is to develop a more efficient model that can achieve similar performance to the larger model.

DeepSeek has also released distilled versions of R1, offering smaller models that retain much of the original model’s capabilities while being more computationally efficient. These models are fine-tuned using data generated by DeepSeek-R1 and are available in sizes from 1.5 billion to 70 billion parameters.
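
As a rough sketch of how such teacher-generated training data can be assembled (not DeepSeek’s actual pipeline), the snippet below samples responses from deepseek/deepseek-r1 through the OpenAI-compatible Novita AI endpoint shown later in this article and writes prompt/response pairs to a JSONL file that a smaller model could then be fine-tuned on. The output file name and the prompt list are purely illustrative.

import json
from openai import OpenAI

# Same OpenAI-compatible Novita AI client setup as in the API example below.
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="<YOUR Novita AI API Key>",
)

prompts = [
    "Prove that the sum of two even integers is even.",
    "Write a Python function that reverses a linked list.",
]  # illustrative prompts; a real distillation set is far larger

with open("r1_distill_data.jsonl", "w", encoding="utf-8") as f:
    for prompt in prompts:
        res = client.chat.completions.create(
            model="deepseek/deepseek-r1",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=2048,
        )
        # Save the teacher's full response as a supervised fine-tuning target.
        record = {"prompt": prompt, "response": res.choices[0].message.content}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")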

Benchmark results for the DeepSeek-R1-Distill models

Source: DeepSeek’s release paper

Access the DeepSeek-R1 API via Novita AI

Novita AI’s platform simplifies the deployment of DeepSeek-R1 by providing simple APIs and affordable GPU cloud infrastructure. Developers can integrate the model seamlessly into their applications without worrying about hardware setup or scalability.

To get started with DeepSeek-R1 on Novita AI, follow these steps:

Step 1: Go to Novita AI and log in with your Google account, GitHub account, or email address.

Step 2: Try the DeepSeek-R1 Demo.

Step 3: Monitor the model in the LLM Metrics Console on Novita AI.

Step 4: Get your API Key:

  • Navigate to “Key Management” in the settings.

  • A default key is created upon your first login.

  • To generate additional keys, click on “+ Add New Key.”

Step 5: Set up your development environment and configure options such as content, role, name, and prompt (see the sketch below).
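
For example, one way to set this up is to keep the key out of source code and read it from an environment variable. The NOVITA_API_KEY variable name here is an assumption for this sketch, not a Novita AI convention; the full request options (role, content, sampling parameters, and so on) appear in the integration examples below.

import os
from openai import OpenAI

# Export the key once in your shell, e.g.
#   export NOVITA_API_KEY="<YOUR Novita AI API Key>"
# then read it here instead of hard-coding it.
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key=os.environ["NOVITA_API_KEY"],
)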

API Integration

Novita AI provides client libraries for Curl, Python, and JavaScript, making it easy to integrate DeepSeek-R1 into your projects:

For Python users:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="<YOUR Novita AI API Key>",
)

model = "deepseek/deepseek-r1"
stream = True # or False
max_tokens = 2048
system_content = """Be a helpful assistant"""
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = { "type": "text" }

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        }
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    extra_body={
      "top_k": top_k,
      "repetition_penalty": repetition_penalty,
      "min_p": min_p
    }
  )

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)

For JavaScript users:

import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://api.novita.ai/v3/openai",
  apiKey: "<YOUR Novita AI API Key>",
});
const stream = true; // or false

async function run() {
  const completion = await openai.chat.completions.create({
    messages: [
      {
        role: "system",
        content: "Be a helpful assistant",
      },
      {
        role: "user",
        content: "Hi there!",
      },
    ],
    model: "deepseek/deepseek-r1",
    stream,
    response_format: { type: "text" },
    max_tokens: 2048,
    temperature: 1,
    top_p: 1,
    min_p: 0,
    top_k: 50,
    presence_penalty: 0,
    frequency_penalty: 0,
    repetition_penalty: 1
  });

  if (stream) {
    for await (const chunk of completion) {
      if (chunk.choices[0].finish_reason) {
        console.log(chunk.choices[0].finish_reason);
      } else {
        console.log(chunk.choices[0].delta.content);
      }
    }
  } else {
    console.log(JSON.stringify(completion));
  }
}

run();

For Curl users:

curl "https://api.novita.ai/v3/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR Novita AI API Key>" \
  -d @- << 'EOF'
{
    "model": "deepseek/deepseek-r1",
    "messages": [
        {
            "role": "system",
            "content": "Be a helpful assistant"
        },
        {
            "role": "user",
            "content": "Hi there!"
        }
    ],
    "response_format": { "type": "text" },
    "max_tokens": 2048,
    "temperature": 1,
    "top_p": 1,
    "min_p": 0,
    "top_k": 50,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "repetition_penalty": 1
}
EOF

Pro Tips from DeepSeek

To achieve optimal performance when using DeepSeek-R1 models, it’s recommended to use the following configurations (a combined Python request sketch follows the list):

  1. Temperature Settings: Set the temperature within the range of 0.5-0.7 (0.6 is recommended) to avoid incoherent outputs or endless repetitions.

  2. Prompt Design: Avoid adding a system prompt; include all instructions directly in the user prompt.

  3. Mathematical Problems: For tasks involving math, add directives like: “Please reason step by step, and put your final answer within \boxed{}.”

  4. Evaluation: Conduct multiple tests and average results to ensure reliable benchmarking.
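
Putting those tips together, a minimal request might look like the sketch below: it reuses the client configured in the Python integration example above, omits the system message, folds all instructions into the user prompt, and sets the recommended temperature of 0.6. The specific math question is only an example.

# Reuses the `client` configured in the Python integration example above.
prompt = (
    "Solve for x: 2x + 6 = 20. "
    "Please reason step by step, and put your final answer within \\boxed{}."
)

res = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    # No system message: all instructions go into the user prompt.
    messages=[{"role": "user", "content": prompt}],
    temperature=0.6,   # recommended range is 0.5-0.7
    top_p=0.95,        # matches the benchmark setup described earlier
    max_tokens=2048,
)
print(res.choices[0].message.content)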

DeepSeek-R1 vs. OpenAI o1: Benchmark Performance

Benchmark comparison of DeepSeek-R1 and OpenAI o1

DeepSeek-R1 competes directly with OpenAI o1 across multiple benchmarks, often matching or surpassing its performance.

Mathematics Benchmarks

On advanced math tasks, DeepSeek-R1 demonstrates strong performance. It scores 79.8% on AIME 2024, slightly ahead of OpenAI o1-1217 at 79.2%. On MATH-500, which involves solving diverse high-school-level problems, DeepSeek-R1 leads with 97.3%, outperforming OpenAI o1-1217’s 96.4%.

Coding Benchmarks

In programming benchmarks, DeepSeek-R1 reaches the 96.3rd percentile on Codeforces, just behind OpenAI o1-1217 at 96.6. However, on SWE-bench Verified, which evaluates reasoning in software engineering tasks, DeepSeek-R1 scores 49.2%, slightly ahead of OpenAI o1-1217’s 48.9%.

General Knowledge Benchmarks

While OpenAI o1 leads in general knowledge tasks, DeepSeek-R1 remains competitive. It scores 71.5% on GPQA Diamond compared to OpenAI o1-1217’s 75.7%. On MMLU, a multitask benchmark, DeepSeek-R1 achieves 90.8%, just behind OpenAI o1-1217’s 91.8%.

These results underline DeepSeek-R1’s strengths in reasoning-intensive domains, particularly in mathematics and programming.

Unleash the Power of DeepSeek-R1 Today

DeepSeek-R1 is a significant advancement in reasoning-focused AI, offering transparency, reliability, and flexibility. Its strong performance in mathematics and software engineering tasks makes it a powerful tool for developers seeking solutions in research, education, and technical workflows.

With its availability on Novita AI, deploying DeepSeek-R1 is straightforward and cost-effective. Whether you’re tackling complex mathematical problems or automating programming tasks, DeepSeek-R1 offers a robust and accessible solution for diverse AI applications.

Originally from Novita AI

About Novita AI

Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models through a simple API, along with affordable and reliable GPU cloud infrastructure for building and scaling.
