Today, DeepSeek released a 19-page paper titled DeepSeekMath-V2. At this particular juncture, reading it evokes a sense of solemnity and composure.
Foreign media have reported that, to get around the ban on Nvidia's high-end chips, several Chinese tech giants are moving their AI training compute to overseas data centers. This is a "computing power exodus" for survival.
Yet under such a suffocating computing-power siege, DeepSeek's paper demonstrates another possibility, that of "looking inward": if external computing power is restricted, then push reasoning efficiency to the extreme.
A score beyond the limits of human ability
Let's start with the results, because results are the best reward for long-termism.
In the recently concluded 2024 Putnam Mathematical Competition, DeepSeekMath-V2 scored 118 out of 120 points.
This is the most prestigious undergraduate mathematics competition in North America, known for its extreme difficulty.
The highest human score that year was only 90 points.
At both IMO 2025 (the International Mathematical Olympiad) and CMO 2024 (the Chinese Mathematical Olympiad), it achieved gold-medal-level results.
What is even more ironic is that, in comparative tests, it outperformed Google's Gemini 2.5 Pro, which enjoys top-tier computing-power support, as well as OpenAI's GPT-5-Thinking-High.
How did it master this art of "sparring with itself"?
Why is DeepSeekMath-V2 so powerful?
Don't just give me the answer. Show me the process.
Traditional AI training is like training a puppy: you give it an arithmetic problem, and as long as the final answer is correct, you give it a bone.
But there is a major flaw here: a correct answer does not mean true understanding.
The AI might have simply guessed, or arrived at the right number through broken logic.
DeepSeekMath-V2 no longer obsesses over the final result, even on open-ended problems with no standard answer.
It introduces an extremely strict grader who scrutinizes every derivation step under a magnifying glass and scores accordingly: 1 point for rigorous logic, 0.5 points for minor flaws, and 0 points for fabricated content.
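To make this concrete, here is a minimal sketch of that rubric as a reward function. The 1 / 0.5 / 0 levels come from the description above; the StepVerdict enum, the verifier object, and its judge method are hypothetical names, not the paper's actual interface.

```python
# A minimal sketch of rubric-based proof grading, assuming a hypothetical
# `verifier` model that labels each derivation step. Only the 1 / 0.5 / 0
# rubric comes from the article; all names here are illustrative.
from enum import Enum

class StepVerdict(Enum):
    RIGOROUS = 1.0    # the step follows logically from what came before
    MINOR_FLAW = 0.5  # right idea, but a gap or sloppy justification
    FABRICATED = 0.0  # an unjustified claim or invented fact

def proof_reward(steps: list[str], verifier) -> float:
    """Score a proof by its weakest step, not by its final answer."""
    verdicts = [verifier.judge(step) for step in steps]
    # Take the minimum rather than the average: one fabricated step
    # should sink the whole proof, which is the strict grader's stance.
    return min((v.value for v in verdicts), default=0.0)
```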
Who supervises the graders?
But here is an even more interesting question: the grader is itself an AI. What if it pretends to understand when it doesn't, or flags problems that aren't really there?
To solve this, DeepSeek developed Meta-Verification, which is essentially a check on the checker itself.
This is like assigning the graders a supervisor. The supervisor's job is not to solve the problems, but to check whether the grading itself is sound:
"You pointed out that this step is wrong. Is it really wrong?"
"You gave a perfect score, but there were obvious jumps in your performance!"
Split-personality self-reflection
What's even more interesting is that DeepSeek adjusted the reward mechanism: even if you get the problem wrong, you can still earn a high score, as long as you accurately point out where you went wrong!
This forces the model to run many rounds of internal derivation, self-rejection, and correction before it outputs a final answer.
It no longer acts with blind confidence, but has learned to be skeptical.
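As a rough illustration of that adjusted reward: a wrong proof can still earn partial credit if the model honestly flags its own unresolved flaws. The base score is the 1 / 0.5 / 0 rubric from earlier; the 0.5 bonus weight and the set-based flaw matching are my own illustrative assumptions, not the paper's numbers.

```python
# A minimal sketch of rewarding honest self-assessment. The 0.5 bonus
# weight and the flaw-matching logic are assumptions for illustration.

def final_reward(base_score: float,
                 self_reported_flaws: set[str],
                 confirmed_flaws: set[str]) -> float:
    """Blend proof quality with honesty about remaining weaknesses."""
    if base_score == 1.0:
        return 1.0  # a fully rigorous proof needs no confession
    # Fraction of the real flaws the model caught in its own work.
    honesty = len(self_reported_flaws & confirmed_flaws) / max(len(confirmed_flaws), 1)
    # Honest self-criticism recovers part of the lost reward, so admitting
    # "step 3 is unjustified" beats bluffing with false confidence.
    return base_score + 0.5 * (1.0 - base_score) * honesty
```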
"Problem Solvers" and "Mathematicians" on the Road to AGI
Why does DeepSeekMath-V2 outperform despite limited computing power? Because it fundamentally changed the way the AI thinks.
Consider an analogy. The evolution of AI currently follows two paths:
The top-performing problem solvers
Previous large models (including many of today's state-of-the-art ones) were essentially top-notch problem solvers.
Such a model is like the student with the best memory in the class, carrying a vast question bank in its head. Faced with a problem, it quickly retrieves a match and answers instantly through pattern matching.
Its limitation: it can only solve problems it has seen before, or close variants of them. It consumes existing knowledge. Confronted with a genuinely new problem outside the question bank, it starts to fabricate.
A True Mathematician
DeepSeekMath-V2 forced itself to evolve into a serious mathematician.
Mathematicians do not rely on rote memorization. Confronted with an unknown conjecture, a mathematician draws on the meta-capability of learning and reasoning.
The model likewise pauses to think carefully, deduces step by step, and in the process creates new knowledge that does not exist in its training data.
The paper reveals the source of this ability: Self-Verification.
DeepSeek trained an extremely rigorous verifier that not only scores the result but also meticulously scrutinizes the derivation process for flaws. Three roles cooperate (a minimal loop combining them is sketched after the list):
Proof Generator
Responsible for proposing bold solutions to problems.
Proof Verifier
Plays the stern mentor, coldly pointing out: "This step contains a logical leap. Start over."
Meta-Verification
Makes the model reflect even on its own criticism: "Was the error I just flagged really an error?"
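Putting the three roles together, here is a minimal sketch of how such a generate-review-revise loop might run. Every class and method name (generator.attempt, verifier.review, meta_verifier.confirms, generator.revise) is a hypothetical placeholder; the paper describes the roles, not this interface.

```python
# A minimal sketch of the three-role loop: generate, verify, meta-verify,
# revise. All objects and methods here are hypothetical placeholders.

def prove(problem: str, generator, verifier, meta_verifier,
          max_rounds: int = 8) -> str:
    proof = generator.attempt(problem)
    for _ in range(max_rounds):
        critique = verifier.review(problem, proof)
        # Keep only criticisms that survive meta-verification, so the
        # generator is not misled by a nitpicking or careless grader.
        valid_issues = [c for c in critique
                        if meta_verifier.confirms(proof, c)]
        if not valid_issues:
            return proof  # no confirmed flaws remain
        proof = generator.revise(problem, proof, valid_issues)
    return proof  # best effort once the revision budget is spent
```

The design point is that revision is driven only by confirmed criticism: a noisy verifier cannot send the generator chasing phantom errors.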
Great things often emerge from scarcity.
I have always believed that resource scarcity is the catalyst for innovation, while abundant resources tend to breed mediocrity.
While the outside world debated how to ship GPUs and circumvent the ban, DeepSeek chose to train its AI in the uncharted territory of algorithms, teaching it to question itself and to think slowly.
DeepSeekMath-V2 is not only a mathematical model; it is also a metaphor.
It tells us that there is more than one path to AGI. If the wide road paved with NVIDIA graphics cards is blocked, then perhaps, like a mathematician armed with nothing but rigorous logic and reason, one can carve out a narrow path through the wilderness and travel even further.
In this era of rapid change, learning to think slowly like a mathematician is not only the evolutionary direction of AI, but also the proper attitude that every technology professional should adopt.
Want to learn more about putting DeepSeek and cutting-edge AI tools into practice? Click here to board the first-class cabin to the AI future: https://www.deployai365.com/



