Gao Dalie (Ilyass)

Posted on Feb 17

Why DeepSeek-R1 Is So Much Better Than o3-mini & Qwen2.5-Max: Here Are the Results

#programming #ai #datascience #llm

As the title says, I tried DeepSeek-R1, o3-mini, and Qwen2.5-Max, three models that are getting a lot of attention. A lot is being said about them.

My impression is that the newer models entered the market at a time when research and development were progressing and methods were being established by companies like OpenAI, DeepSeek, and Qwen, so their developers could avoid much of the huge cost of trial and error and complete the project at low cost.

OpenAI is eager to defend its market position: on Friday it released o3-mini, a direct response to Chinese startup DeepSeek’s R1 model.

OpenAI finally could not sit still and launched a new reasoning model, o3-mini. Not only is a reasoning model open to free users for the first time, but the cost is also about 15 times lower than that of the previous o1 series.

Unlike GPT-4o and the rest of the GPT family, the “o” family of models focuses on reasoning tasks. These models are less creative but have chain-of-thought reasoning built in, which makes them better at solving complex problems, backtracking from faulty analysis, and producing better-structured code.

Alibaba, meanwhile, released a new version of its Qwen 2.5 artificial intelligence model: Qwen2.5-Max, an ultra-large MoE model pre-trained on over 20 trillion tokens and refined with a carefully designed post-training method.

Compared to the previous generation model, the Qwen 2.5 model has significantly improved computing power, processing speed, and versatility, and is expected to be particularly useful for business and research purposes.

However, it is rumoured that Qwen2.5-Max and o3-mini outperform DeepSeek-R1. In this article, we will explore the real value and future potential of o3-mini and Qwen2.5-Max through their features and mechanisms, and compare them with competing models.

By the time you finish reading, you will be excited and hopeful about the future of AI, and you will want to pick up some “information” that will help you improve business efficiency and innovation. Please stay with us until the end.


What is o3-Mini?

o3-mini is OpenAI’s first small reasoning model to support the features developers have been asking for. It inherits the low cost and low latency of o1-mini and supports function calling, streaming, and structured outputs.

Developers can choose the reasoning effort to suit their needs, trading depth of thinking against response speed. It does not support vision tasks, however; visual reasoning still requires o1.

What are the special features of o3-Mini?

Outstanding reasoning ability: Compared with its predecessor, o1-mini, it produces more accurate and clearer answers, with the stronger reasoning and logical thinking that complex problems demand.

Fast response: It responds 24% faster than o1-mini, in 7.7 seconds on average, so it is comfortable to use even where near-real-time responses are needed.

STEM-Focused: Excels in science, math, coding, and a variety of engineering specialities.

Developer-oriented features: It comes with long-awaited features such as function calling, structured outputs, and developer messages, which make it more usable as an engineering tool.

Flexible reasoning options: Three reasoning-effort levels are available: low, medium, and high. You can pick the one that best fits your situation.

Highly cost-effective: It keeps the low cost and low latency of o1-mini while adding more advanced features, which makes it accessible to a wide range of users.
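
As a rough illustration of the effort setting, the sketch below builds a Chat Completions request body for o3-mini. It assumes the OpenAI API's `reasoning_effort` parameter and `developer` message role; the helper name `buildO3MiniRequest` and the prompts are made up for this example, and nothing is actually sent over the network.

```javascript
// Hypothetical helper: builds (but does not send) a request body for o3-mini.
// Assumption: the Chat Completions `reasoning_effort` parameter accepts
// "low", "medium", or "high".
const EFFORT_LEVELS = ["low", "medium", "high"];

function buildO3MiniRequest(prompt, effort = "medium") {
  if (!EFFORT_LEVELS.includes(effort)) {
    throw new Error(`Unknown reasoning effort: ${effort}`);
  }
  return {
    model: "o3-mini",
    reasoning_effort: effort,
    messages: [
      // Developer messages take the place that system messages have for GPT models.
      { role: "developer", content: "Answer concisely." },
      { role: "user", content: prompt },
    ],
  };
}

const exampleRequest = buildO3MiniRequest("Factorize 1757051.", "high");
console.log(JSON.stringify(exampleRequest, null, 2));
```

The resulting object could then be POSTed to the chat completions endpoint with an API key. Higher effort generally means slower, more thorough answers, so "low" is probably the better default for latency-sensitive calls.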

What about the o1 pro and o3-mini?

By the way, how does it compare with o1 pro? This depends on how you use them, so a strict comparison is not possible.

But o1 pro is slow to begin with. Whatever code you are working on, even if you give detailed instructions for each function, it usually takes about 3 minutes.

Coding means generating a lot of code, so o1 pro is not very practical for it.

That is why o3-mini is by far the easiest to use.

I planned to fall back to o1 pro whenever o3-mini-high produced an error, but so far I have not hit any errors, so I cannot compare them.

What is Qwen2.5-Max?

Qwen 2.5 Max is a particularly high-performance version of the series and has attracted attention for demonstrating benchmark scores that surpass DeepSeek V3. Its main features are its inference speed and ability to handle a variety of tasks, as well as enhanced integration with external services and plugins.

Qwen 2.5 Max integrates not only language understanding but also image and video generation functions. This gives it the flexibility to handle multimodal input as well as text-based dialogue.

What are the special features of Qwen2.5-Max?

So, what exactly is the Qwen2.5-Max? Below are its main features.

Mixture-of-Experts (MoE) architecture: Unlike typical dense transformer language models, this architecture combines multiple “expert” sub-models and routes each input to a few of them, which makes it easier to specialize for specific tasks and improves processing efficiency and accuracy.

Large-scale pre-training: Pre-training is performed on more than 20 trillion tokens, giving the model the ability to understand and generate language across a wide range of domains. It can handle many fields because it learns from all kinds of text data, including news articles, academic papers, and social media posts.

Advanced fine-tuning (SFT and RLHF): By combining SFT (supervised fine-tuning) and RLHF (reinforcement learning from human feedback), the model generates the responses users expect more accurately and naturally. This lets it take the user’s “preferences” and “intentions” into account rather than simply reproducing training data.

High scalability: The model is accessible via API through Alibaba Cloud’s Model Studio service. It also supports an OpenAI-compatible API, so it fits well with existing development environments and tools, and you can expect quick implementation and expansion.
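
To make the Mixture-of-Experts idea more concrete, here is a toy sketch of top-k expert routing. It is not Qwen2.5-Max's actual architecture: the expert functions, the gate scores, and the value of `k` are all invented for illustration, and in a real model the gate scores would come from a learned router rather than being passed in by hand.

```javascript
// Toy Mixture-of-Experts layer: a gate scores the experts, only the top-k run,
// and their outputs are blended using renormalized gate weights.
// All experts, scores, and k below are illustrative assumptions.

function softmax(scores) {
  const max = Math.max(...scores); // subtract the max for numerical stability
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

function moeForward(x, experts, gateScores, k = 2) {
  const probs = softmax(gateScores);
  // Pick the k experts with the highest gate probability.
  const topK = probs
    .map((p, i) => ({ p, i }))
    .sort((a, b) => b.p - a.p)
    .slice(0, k);
  const norm = topK.reduce((acc, e) => acc + e.p, 0);
  // Weighted sum over the selected experts only; the others are never evaluated.
  const output = topK.reduce((acc, e) => acc + (e.p / norm) * experts[e.i](x), 0);
  return { output, used: topK.map((e) => e.i) };
}

const toyExperts = [(x) => x + 1, (x) => x * 2, (x) => x * x, (x) => -x];
const moeResult = moeForward(3, toyExperts, [0.1, 2.0, 1.5, -1.0], 2);
console.log(moeResult.used, moeResult.output); // experts 1 and 2 are selected
```

Because only `k` of the experts run for each input, the compute per token stays well below what the total parameter count would suggest, which is the usual argument for the efficiency of this design.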

DeepSeek-R1 Vs o3-Mini Vs Qwen2.5-Max

I could not resist comparing them with DeepSeek-R1, the Chinese large model that has swept the entire LLM community.

This time, we will verify the following items.

Coding


writing a p5.js script to simulate 100 colored balls bouncing inside a sphere.
Each ball should leave a fading trail showing its most recent path. 
The container sphere should rotate slowly. 
Make sure to implement proper collision detection to keep the ball 
inside the sphere.

DeepSeek-R1

Link : https://chat.deepseek.com/a/chat/s/47d12184-5459-48d1-bc2b-9e7a520d9091

let balls = [];
let numBalls = 100;
let sphereRadius = 300;
let rotationAngle = 0;

function setup() {
  createCanvas(800, 800, WEBGL);
  for (let i = 0; i < numBalls; i++) {
    balls.push(new Ball());
  }
  noStroke();
}

function draw() {
  background(0);
  rotateX(rotationAngle);
  rotateY(rotationAngle * 0.5);

  // Draw the container sphere (transparent)
  push();
  fill(0, 0, 0, 0);
  stroke(255);
  sphere(sphereRadius);
  pop();

  // Update and display balls
  for (let ball of balls) {
    ball.update();
    ball.display();
  }

  // Slowly rotate the sphere
  rotationAngle += 0.005;
}

class Ball {
  constructor() {
    this.pos = p5.Vector.random3D().mult(sphereRadius * 0.8);
    this.vel = p5.Vector.random3D().mult(2);
    this.color = color(random(255), random(255), random(255), 200);
    this.trail = [];
    this.trailLength = 20;
  }

  update() {
    // Update position
    this.pos.add(this.vel);

    // Collision detection with the sphere
    let distance = this.pos.mag();
    if (distance > sphereRadius) {
      let normal = this.pos.copy().normalize();
      this.vel = p5.Vector.reflect(this.vel, normal);
      this.pos = normal.mult(sphereRadius * 0.99); // Slightly inside the sphere
    }

    // Add current position to the trail
    this.trail.push(this.pos.copy());
    if (this.trail.length > this.trailLength) {
      this.trail.shift();
    }
  }

  display() {
    // Draw the trail
    for (let i = 0; i < this.trail.length; i++) {
      let alpha = map(i, 0, this.trail.length, 50, 255);
      fill(this.color.levels[0], this.color.levels[1], this.color.levels[2], alpha);
      noStroke();
      push();
      translate(this.trail[i].x, this.trail[i].y, this.trail[i].z);
      sphere(5);
      pop();
    }

    // Draw the ball
    fill(this.color);
    noStroke();
    push();
    translate(this.pos.x, this.pos.y, this.pos.z);
    sphere(10);
    pop();
  }
}

o3-Mini

Link: https://chatgpt.com/share/67a1fc52-e1f4-800f-91fc-8a6142d5b2d6

// Global parameters
let balls = [];
const numBalls = 100;
let containerRadius;
const ballRadius = 5;      // radius for each ball
const maxTrailLength = 30; // number of positions to store for the fading trail

function setup() {
  createCanvas(windowWidth, windowHeight, WEBGL);
  // Set the container sphere radius relative to the canvas size.
  containerRadius = min(width, height) * 0.4;

  // Create the balls with random positions (inside the sphere) and random velocities.
  for (let i = 0; i < numBalls; i++) {
    // Get a random point that is guaranteed to be inside the container
    let pos = randomPointInSphere(containerRadius - ballRadius);
    // Give a random 3D direction with a random speed between 1 and 3.
    let vel = p5.Vector.random3D().mult(random(1, 3));
    // Random color for the ball.
    let col = color(random(255), random(255), random(255));
    balls.push(new Ball(pos, vel, col));
  }

  // Optional: set a stroke weight for trails
  strokeWeight(2);
}

function draw() {
  // Draw a semi-transparent black background so that previous frames slowly fade.
  // (Adjust the second parameter to change how quickly the trails fade.)
  background(0, 20);

  // Apply a slow rotation to the entire scene.
  rotateY(frameCount * 0.005);
  rotateX(frameCount * 0.003);

  // Update and display each ball.
  for (let ball of balls) {
    ball.update();
    ball.display();
  }

  // Draw the container sphere (as a semi-transparent wireframe)
  noFill();
  stroke(255, 50);
  sphere(containerRadius);
}

// Returns a random point uniformly distributed within a sphere of given max radius.
function randomPointInSphere(maxRadius) {
  let v = p5.Vector.random3D();
  // Multiply by a random number between 0 and maxRadius to get a point inside the sphere.
  v.mult(random(0, maxRadius));
  return v;
}

// Ball class definition
class Ball {
  constructor(pos, vel, col) {
    this.pos = pos;
    this.vel = vel;
    this.col = col;
    this.trail = []; // to store previous positions
  }

  update() {
    // Move the ball.
    this.pos.add(this.vel);

    // Collision detection: If the ball goes outside the container sphere,
    // reflect its velocity. (We use the ball's radius so that the entire sphere stays in.)
    let d = this.pos.mag();
    if (d + ballRadius > containerRadius) {
      // Get the outward normal from the sphere's center.
      let normal = this.pos.copy().normalize();
      // Reflect the velocity using the reflection formula:
      //    v = v - 2*(v dot normal)*normal
      let dot = this.vel.dot(normal);
      this.vel.sub(p5.Vector.mult(normal, 2 * dot));

      // Optionally reposition the ball to exactly be inside the container.
      this.pos = normal.mult(containerRadius - ballRadius);
    }

    // Add the current position to the trail.
    this.trail.push(this.pos.copy());
    // Limit the length of the trail.
    if (this.trail.length > maxTrailLength) {
      this.trail.shift();
    }
  }

  display() {
    // Draw the fading trail.
    noFill();
    beginShape();
    // Draw a vertex at each point in the trail.
    // Older positions are drawn with lower opacity.
    for (let i = 0; i < this.trail.length; i++) {
      let pos = this.trail[i];
      // Map the opacity so that the oldest point is more transparent.
      let alpha = map(i, 0, this.trail.length - 1, 50, 255);
      stroke(red(this.col), green(this.col), blue(this.col), alpha);
      vertex(pos.x, pos.y, pos.z);
    }
    endShape();

    // Draw the ball.
    push();
    translate(this.pos.x, this.pos.y, this.pos.z);
    noStroke();
    fill(this.col);
    sphere(ballRadius);
    pop();
  }
}

Qwen2.5-Max

Link: https://chat.qwenlm.ai/c/339b97c8-c1f4-4dd4-9833-159ee2adc387

let balls = [];
let numBalls = 100;
let sphereRadius = 200;
let angleX = 0;
let angleY = 0;

function setup() {
  createCanvas(600, 600, WEBGL);
  colorMode(HSB, 360, 100, 100, 100);

  // Create balls
  for (let i = 0; i < numBalls; i++) {
    balls.push(new Ball());
  }
}

function draw() {
  background(0);

  // Rotate the container sphere
  rotateX(angleX);
  rotateY(angleY);
  angleX += 0.005;
  angleY += 0.01;

  // Draw the container sphere
  noFill();
  stroke(255, 50);
  sphere(sphereRadius * 2);

  // Update and display balls
  for (let ball of balls) {
    ball.update();
    ball.display();
    ball.checkCollision();
  }
}

class Ball {
  constructor() {
    this.position = createVector(
      random(-sphereRadius, sphereRadius),
      random(-sphereRadius, sphereRadius),
      random(-sphereRadius, sphereRadius)
    );
    this.velocity = p5.Vector.random3D().mult(random(1, 3));
    this.color = color(random(360), 80, 100, 100);
    this.history = [];
    this.maxHistory = 20;
  }

  update() {
    this.position.add(this.velocity);

    // Store position history for the trail
    this.history.push(this.position.copy());
    if (this.history.length > this.maxHistory) {
      this.history.shift();
    }
  }

  display() {
    // Draw the trail
    noFill();
    stroke(this.color);
    beginShape();
    for (let i = 0; i < this.history.length; i++) {
      let pos = this.history[i];
      vertex(pos.x, pos.y, pos.z);
    }
    endShape();

    // Draw the ball
    noStroke();
    fill(this.color);
    push();
    translate(this.position.x, this.position.y, this.position.z);
    sphere(5);
    pop();
  }

  checkCollision() {
    // Check if the ball is outside the sphere
    if (this.position.mag() > sphereRadius) {
      // Reflect the velocity vector
      this.velocity.reflect(this.position.normalize());
    }
  }
}

None of the generated code is fully correct. DeepSeek-R1 has a problem with the 100 balls: if you look at the GIF, you will see that its balls are missing from inside the sphere. o3-mini generates the 100 balls, but the container sphere is hard to see against the grey background. Qwen2.5-Max does well at drawing the sphere and the 100 balls but has a size issue that makes them look bigger than expected.

In conclusion, I would say that all these models require human intervention, as they cannot generate correct code with a single prompt.
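
For comparison, the core of the collision step can be sketched in a few lines of plain JavaScript. This is not taken from any of the three outputs and does not claim to fix them; the function name and the `{x, y, z}` vector objects are assumptions made for this example. One detail worth watching in p5.js is that `normalize()` changes the vector it is called on, so the normal should be computed from a copy, or by hand as here.

```javascript
// Plain-JavaScript sketch of bouncing a ball off the inside of a sphere
// centered at the origin. Vectors are {x, y, z} objects; names are illustrative.

function bounceInsideSphere(pos, vel, containerRadius, ballRadius) {
  const dist = Math.hypot(pos.x, pos.y, pos.z);
  const limit = containerRadius - ballRadius; // farthest the ball's center may go
  if (dist <= limit || dist === 0) {
    return { pos, vel }; // still inside: nothing to do
  }
  // Outward unit normal at the contact point (pos itself is not modified).
  const n = { x: pos.x / dist, y: pos.y / dist, z: pos.z / dist };
  const dot = vel.x * n.x + vel.y * n.y + vel.z * n.z;
  // Reflect only when moving outward: v' = v - 2 (v . n) n
  const newVel = dot > 0
    ? { x: vel.x - 2 * dot * n.x, y: vel.y - 2 * dot * n.y, z: vel.z - 2 * dot * n.z }
    : vel;
  // Move the ball back onto the allowed shell so it cannot escape.
  const newPos = { x: n.x * limit, y: n.y * limit, z: n.z * limit };
  return { pos: newPos, vel: newVel };
}

const bounced = bounceInsideSphere({ x: 198, y: 0, z: 0 }, { x: 3, y: 1, z: 0 }, 200, 5);
console.log(bounced.pos, bounced.vel);
```

Subtracting the ball radius from the container radius keeps the whole ball inside, and reflecting only when the velocity points outward avoids a ball getting stuck flipping back and forth at the boundary.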

Mathematics

Factorize the number 1757051 and explain why it is a good example
to test a human's ability to factorize this.

DeepSeek-R1

Link : https://chat.deepseek.com/a/chat/s/86b104f1-523b-4de5-a125-d8f4020182ee

o3-Mini

Link : https://chatgpt.com/share/67a1fe14-33d0-800f-83dc-a9fd9ffc0572

Qwen2.5-Max

Link : https://chat.qwenlm.ai/s/8aa733dc-c5d8-4bf9-bd77-848959823638

DeepSeek-R1 and o3-mini both generate correct answers, but o3-mini is not as good at reasoning as DeepSeek-R1. Even when its answer is correct, o3-mini leaves out some formulas and misses important details. Qwen2.5-Max fails to generate the correct answer. Personally, if I have a complex math question, I would prefer to use DeepSeek-R1.
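
For reference, 1757051 = 1291 × 1361, and both factors are prime. They sit close to √1757051 ≈ 1325.5, so the difference-of-squares trick finds them on the first try: 1326² − 1757051 = 1225 = 35². That is presumably why the number makes a good test for a human, though this reading is a guess rather than something taken from the linked answers. A minimal sketch of Fermat's method (the function names are mine):

```javascript
// Fermat's factorization: search for a, b with n = a^2 - b^2 = (a - b)(a + b).
// Works for odd n and is fast when the two factors are close together.
function fermatFactor(n) {
  if (n % 2 === 0) return [2, n / 2];
  let a = Math.ceil(Math.sqrt(n));
  while (true) {
    const b2 = a * a - n;
    const b = Math.round(Math.sqrt(b2));
    if (b * b === b2) return [a - b, a + b]; // b2 is a perfect square
    a += 1;
  }
}

// Simple trial-division primality check, enough for factors this small.
function isPrime(m) {
  if (m < 2) return false;
  for (let d = 2; d * d <= m; d++) {
    if (m % d === 0) return false;
  }
  return true;
}

const [smallFactor, largeFactor] = fermatFactor(1757051);
console.log(smallFactor, largeFactor, isPrime(smallFactor), isPrime(largeFactor));
// prints: 1291 1361 true true
```

Trial division by hand would need to test primes all the way up to 1291, whereas spotting the nearby square takes a single step, so the number rewards choosing the right method over brute force.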

Final Thoughts :

o3-mini is still far behind DeepSeek. DeepSeek-R1 is better at deep reasoning and complex task processing, and it offers not only strong performance but also a lower price.

The chat version of R1 is free and unlimited, while the chat version of o3-mini is available to free users but severely limited; it is effectively a trial version.

The API price of o3-mini is $1.10 per million input tokens and $4.40 per million output tokens, while R1 costs $0.55 and $2.19. o3-mini is almost exactly twice the price of R1.
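
To see what the price gap means for a concrete workload, the sketch below applies the prices quoted above, assuming they are per one million tokens. The helper name and the token counts are invented for the example.

```javascript
// Prices in USD per one million tokens, as quoted in the text
// (assumption: the quoted figures are per-million-token rates).
const PRICES = {
  "o3-mini": { input: 1.1, output: 4.4 },
  "deepseek-r1": { input: 0.55, output: 2.19 },
};

// Hypothetical helper: cost in USD of a workload with the given token counts.
function apiCost(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

// Example workload: 2M input tokens and 500k output tokens.
const o3Cost = apiCost("o3-mini", 2_000_000, 500_000);
const r1Cost = apiCost("deepseek-r1", 2_000_000, 500_000);
console.log(o3Cost, r1Cost, o3Cost / r1Cost);
```

The input price is exactly double, while the output price is very slightly more than double (4.40 / 2.19 ≈ 2.01), so the overall ratio lands just above 2 for any mix of input and output tokens.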

Even though o3-mini is better at coding, I will not use it for coding.

Personally, I think the winners are DeepSeek-R1 (math) and Claude (coding), who used ❤ to represent life.


