<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Chenrui Hu</title>
    <description>The latest articles on DEV Community by Chenrui Hu (@wholiver).</description>
    <link>https://dev.to/wholiver</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3937223%2F965ef86b-e3d3-439e-b745-afc34be452d6.png</url>
      <title>DEV Community: Chenrui Hu</title>
      <link>https://dev.to/wholiver</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/wholiver"/>
    <language>en</language>
    <item>
      <title>Your AI Sucks at Math. Fix It With One Command.</title>
      <dc:creator>Chenrui Hu</dc:creator>
      <pubDate>Sun, 31 May 2026 06:25:03 +0000</pubDate>
      <link>https://dev.to/wholiver/your-ai-sucks-at-math-fix-it-with-one-command-2f98</link>
      <guid>https://dev.to/wholiver/your-ai-sucks-at-math-fix-it-with-one-command-2f98</guid>
      <description>&lt;p&gt;You've seen this before.&lt;/p&gt;

&lt;p&gt;You ask your AI agent: &lt;strong&gt;"Find ∫ x·e^x dx"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It confidently replies: &lt;strong&gt;&lt;code&gt;e^x + C&lt;/code&gt;&lt;/strong&gt;, complete with a plausible-looking derivation. You nod. Then you check — the correct answer is &lt;code&gt;(x−1)·e^x + C&lt;/code&gt;. It was wrong by a mile, and you almost shipped it.&lt;/p&gt;
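
&lt;p&gt;You can catch this one yourself in a few lines. The sketch below is an illustration, not Math.skill's actual code: the function names are made up for this example, and it simply differentiates each candidate numerically and compares the result with &lt;code&gt;x·e^x&lt;/code&gt; at a few sample points.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math


def integrand(x):
    """The function being integrated: x * e^x."""
    return x * math.exp(x)


def wrong_answer(x):
    """The hallucinated antiderivative: e^x."""
    return math.exp(x)


def right_answer(x):
    """The correct antiderivative: (x - 1) * e^x."""
    return (x - 1) * math.exp(x)


def is_antiderivative(candidate, f, points=(-1.0, 0.5, 2.0), h=1e-6):
    """Return True if the candidate's derivative matches f at every sample point."""
    for x in points:
        # Central difference approximates the derivative of the candidate.
        slope = (candidate(x + h) - candidate(x - h)) / (2 * h)
        if not math.isclose(slope, f(x), rel_tol=1e-5, abs_tol=1e-7):
            return False
    return True


print(is_antiderivative(wrong_answer, integrand))  # False
print(is_antiderivative(right_answer, integrand))  # True
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The sample points deliberately skip &lt;code&gt;x = 1&lt;/code&gt;, where the wrong answer's derivative happens to equal the integrand.&lt;/p&gt;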

&lt;p&gt;This is the fundamental problem with AI math today: &lt;strong&gt;LLMs can talk, but they can't verify their own work.&lt;/strong&gt; They sound confident even when they are badly wrong, and the more complex the problem, the more plausible the hallucination looks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Math.skill&lt;/strong&gt; changes that. It's an open-source mathematical reasoning skill for AI agents — install it, and your agent stops guessing and starts verifying.&lt;/p&gt;




&lt;h2&gt;What Makes It Different&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Typical AI Math Plugin&lt;/th&gt;
&lt;th&gt;Math.skill&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Workflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Prompt → LLM → answer&lt;/td&gt;
&lt;td&gt;Prompt → 7-step pipeline → ≥2 verifications → answer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Verification&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Answer &lt;strong&gt;blocked&lt;/strong&gt; if verification fails&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Open problems&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Might hallucinate a "solution"&lt;/td&gt;
&lt;td&gt;Honestly says "this is unsolved"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Error recovery&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No mechanism&lt;/td&gt;
&lt;td&gt;Auto-backtrack, fix, recompute, re-verify&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The core differentiator: a &lt;strong&gt;verification engine&lt;/strong&gt; that runs at least 2 of 11 independent checks on every answer. No answer leaves the pipeline unverified. Period.&lt;/p&gt;




&lt;h2&gt;The 7-Step Pipeline&lt;/h2&gt;

&lt;p&gt;Every problem flows through this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;What Happens&lt;/th&gt;
&lt;th&gt;Why It Matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1. Parse&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Extract conditions, goals, variables, implicit domain constraints&lt;/td&gt;
&lt;td&gt;Catches misread problems before they waste your time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2. Model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Build formal representation: equation, function, matrix, probability space, etc.&lt;/td&gt;
&lt;td&gt;Prevents building the wrong mathematical structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;3. Select&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Choose the optimal method from 30+ strategies&lt;/td&gt;
&lt;td&gt;Avoids brute-forcing when elegance exists&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4. Solve&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Step-by-step with mathematical justification at every transformation&lt;/td&gt;
&lt;td&gt;Full traceability — nothing hidden&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;5. Verify&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Apply ≥2 of 11 independent verification methods&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;The differentiator&lt;/strong&gt; — catches what LLMs miss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;6. Correct&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;If verification fails: backtrack to last known-good step, fix, recompute, re-verify&lt;/td&gt;
&lt;td&gt;No "doubling down" on wrong answers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;7. Deliver&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Exact answer (not approximate), domain conditions, verification summary&lt;/td&gt;
&lt;td&gt;You know it's right, and you know why&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;The Verification Engine: 11 Independent Methods&lt;/h2&gt;

&lt;p&gt;This is the heart of Math.skill. Each method catches a different class of errors:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;ID&lt;/th&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;What It Catches&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;A&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Back-substitution&lt;/td&gt;
&lt;td&gt;Extraneous roots, sign errors — plug the answer back in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;B&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Domain check&lt;/td&gt;
&lt;td&gt;Division by zero, negative radicands, log(0), arcsin(2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;C&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Boundary analysis&lt;/td&gt;
&lt;td&gt;Missed interval endpoints, parameter edge cases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;D&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reverse derivation&lt;/td&gt;
&lt;td&gt;Irreversible step errors — work backwards from answer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;E&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Numerical sampling&lt;/td&gt;
&lt;td&gt;Coefficient drift, off-by-factor — test with specific values&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;F&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dimensional analysis&lt;/td&gt;
&lt;td&gt;Unit mismatches and impossible magnitudes such as P &amp;gt; 1 or variance &amp;lt; 0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;G&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Limits &amp;amp; special cases&lt;/td&gt;
&lt;td&gt;Degenerate behavior as parameters approach 0 or ∞&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;H&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cross-validation&lt;/td&gt;
&lt;td&gt;Solve with a &lt;strong&gt;completely different independent method&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;I&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Counterexample search&lt;/td&gt;
&lt;td&gt;Disprove false universal claims by construction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;J&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Formal logic check&lt;/td&gt;
&lt;td&gt;∀∃ order errors, necessary vs. sufficient, circular reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;K&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Computational consistency&lt;/td&gt;
&lt;td&gt;det(A−λI) = 0, total probability = 1, trace = sum of eigenvalues&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;At least two methods per problem.&lt;/strong&gt; The engine selects which ones based on the problem type. You don't have to think about it — it just works.&lt;/p&gt;
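
&lt;p&gt;To make that concrete, here is a minimal sketch of two of those checks (A and B) working together on one equation. It is a hypothetical illustration, not the skill's implementation: the function names and the example equation &lt;code&gt;log(x) + log(x - 3) = log(4)&lt;/code&gt; are assumptions chosen for the demo.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math


def lhs(x):
    """Left side of the equation log(x) + log(x - 3) = log(4)."""
    return math.log(x) + math.log(x - 3)


def domain_check(x):
    """Method B: both logarithm arguments must be positive."""
    try:
        lhs(x)  # math.log raises ValueError for non-positive input
    except ValueError:
        return False
    return True


def back_substitution(x):
    """Method A: plug x back into the original equation."""
    return domain_check(x) and math.isclose(lhs(x), math.log(4))


def verified_roots(candidates):
    """Keep only candidates that pass both checks; block the rest."""
    checks = (domain_check, back_substitution)
    return [x for x in candidates if all(check(x) for check in checks)]


# Combining the logs gives x*(x - 3) = 4, so x**2 - 3*x - 4 = 0,
# whose roots are 4 and -1. The value 5.0 stands in for an arithmetic slip.
print(verified_roots([4.0, -1.0, 5.0]))  # [4.0]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The domain check rejects &lt;code&gt;-1&lt;/code&gt; because the logarithm is undefined there, and back-substitution rejects &lt;code&gt;5&lt;/code&gt; because it doesn't satisfy the equation. Only &lt;code&gt;4&lt;/code&gt; gets through.&lt;/p&gt;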




&lt;h2&gt;34 Math Categories. One Skill.&lt;/h2&gt;

&lt;p&gt;Math.skill's categories run from arithmetic to research-level problems, and each has its own verification protocol and common-error checklist. They include:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Arithmetic · Algebra · Equations/Inequalities · Functions
Geometry · Trigonometry · Sequences · Combinatorics
Probability/Statistics · Limits · Differentiation · Integration
Multivariable Calculus · Linear Algebra · ODEs
Complex Analysis · Real Analysis · Abstract Algebra
Topology · Number Theory · Discrete Math · Optimization
Mathematical Modeling · Proofs · Counterexamples
Solution Checking · Problem Generation · Research-Level Problems
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Not one-size-fits-all.&lt;/strong&gt; Each category gets targeted handling.&lt;/p&gt;




&lt;h2&gt;It Won't Lie About Unsolved Problems&lt;/h2&gt;

&lt;p&gt;Ask it to "prove the Riemann Hypothesis" and you won't get a hallucinated Fields Medal-worthy breakthrough. You'll get:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"This is a known open problem. Here's what I can provide: partial results, known bounds, and why this remains unsolved."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Honesty is the baseline.&lt;/strong&gt; If a problem is open, it says so. If it can only give partial results, it clearly labels what's proven vs. conjectured.&lt;/p&gt;




&lt;h2&gt;Preemptive Error Prevention: 8 Guard Categories&lt;/h2&gt;

&lt;p&gt;The most common AI math failures are blocked before they happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Algebra&lt;/strong&gt;: Check division by zero &lt;em&gt;before&lt;/em&gt; dividing. Verify roots after squaring. Re-expand after factoring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inequalities&lt;/strong&gt;: Sign reversal on multiply-by-negative. Case analysis for variable expressions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functions&lt;/strong&gt;: Find domain first. Distinguish critical points from extrema. Check non-differentiable points.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Probability&lt;/strong&gt;: Reject P ∉ [0,1]. Reject negative variance. Verify total probability = 1.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calculus&lt;/strong&gt;: Verify L'Hôpital conditions. State Taylor remainder order. Add &lt;code&gt;+C&lt;/code&gt; to every indefinite integral. Check improper integral convergence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linear Algebra&lt;/strong&gt;: Check matrix dimensions. Verify Av = λv. Verify A = PDP⁻¹.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geometry&lt;/strong&gt;: Don't rely on visual intuition. State theorem conditions explicitly. Explain auxiliary constructions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Abstract Math&lt;/strong&gt;: Verify all definition components. Check quantifier order (∀ε∃δ ≠ ∃δ∀ε). Verify well-definedness.&lt;/li&gt;
&lt;/ul&gt;
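
&lt;p&gt;As an example of what one of these guards can look like, here is a small sketch of the Linear Algebra check "Verify Av = λv", plus the trace consistency check from method K. This is an assumption about how such a guard might be written, not code from Math.skill, and every name in it is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math


def mat_vec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in a]


def is_eigenpair(a, lam, v):
    """Guard: accept (lam, v) only if A v equals lam * v and v is nonzero."""
    if not any(v):
        return False  # the zero vector is never an eigenvector
    av = mat_vec(a, v)
    return all(math.isclose(p, lam * q, abs_tol=1e-9) for p, q in zip(av, v))


def trace_matches(a, eigenvalues):
    """Consistency check: trace(A) equals the sum of the eigenvalues."""
    trace = sum(a[i][i] for i in range(len(a)))
    return math.isclose(trace, sum(eigenvalues), abs_tol=1e-9)


A = [[2.0, 1.0], [1.0, 2.0]]

print(is_eigenpair(A, 3.0, [1.0, 1.0]))   # True
print(is_eigenpair(A, 1.0, [1.0, -1.0]))  # True
print(is_eigenpair(A, 2.0, [1.0, 0.0]))   # False
print(trace_matches(A, [3.0, 1.0]))       # True
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The third call is the kind of claim the guard exists to block: &lt;code&gt;2&lt;/code&gt; looks like an eigenvalue because it sits on the diagonal, but &lt;code&gt;A·(1, 0) = (2, 1)&lt;/code&gt;, which is not a multiple of &lt;code&gt;(1, 0)&lt;/code&gt;.&lt;/p&gt;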




&lt;h2&gt;One Command to Install&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add Wholiver/Math.Skill
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;That's it.&lt;/strong&gt; No config. No API keys. No dependencies to wrestle with.&lt;/p&gt;

&lt;p&gt;Works with: &lt;strong&gt;Claude Code · GitHub Copilot · Cursor · Windsurf · Codex · OpenCode&lt;/strong&gt; — any AI agent that supports &lt;a href="https://skills.sh" rel="noopener noreferrer"&gt;skills.sh&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MIT Licensed.&lt;/strong&gt; Free to use. Free to modify. Free to ship with your product.&lt;/p&gt;




&lt;h2&gt;Who Is This For?&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Students&lt;/strong&gt; — homework help with verified solutions. Learn the &lt;em&gt;how&lt;/em&gt; and the &lt;em&gt;why&lt;/em&gt;, not just the answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Teachers&lt;/strong&gt; — generate well-posed problems with full solutions. Check student answers against verified references.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Researchers&lt;/strong&gt; — quickly validate intermediate derivations. Catch errors before they propagate into your paper.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developers&lt;/strong&gt; — if your AI coding agent touches math, stop it from hallucinating incorrect calculations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Everyone who's been burned by AI math&lt;/strong&gt; — you know the feeling. This is the antidote.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;The Bottom Line&lt;/h2&gt;

&lt;p&gt;Your AI agent is brilliant at many things. Math isn't one of them — unless you give it the right tools.&lt;/p&gt;

&lt;p&gt;Math.skill gives your agent what it's missing: a mathematician's discipline. Parse, model, select, solve, verify, correct, deliver. Every time. No exceptions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"One question. A verified answer."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add Wholiver/Math.Skill
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/Wholiver/Math.Skill" rel="noopener noreferrer"&gt;GitHub → Wholiver/Math.Skill&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>opensource</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
