<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Krishi Attri</title>
    <description>The latest articles on DEV Community by Krishi Attri (@kattri).</description>
    <link>https://dev.to/kattri</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4011651%2Fb6b65d0a-943b-4aec-b925-8b4c646f524b.png</url>
      <title>DEV Community: Krishi Attri</title>
      <link>https://dev.to/kattri</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kattri"/>
    <language>en</language>
    <item>
      <title>Zero-shot point-cloud registration actually transfers: BUFFER-X inside splatreg</title>
      <dc:creator>Krishi Attri</dc:creator>
      <pubDate>Thu, 02 Jul 2026 05:04:13 +0000</pubDate>
      <link>https://dev.to/kattri/zero-shot-point-cloud-registration-actually-transfers-buffer-x-inside-splatreg-5d55</link>
      <guid>https://dev.to/kattri/zero-shot-point-cloud-registration-actually-transfers-buffer-x-inside-splatreg-5d55</guid>
      <description>&lt;p&gt;&lt;a href="https://github.com/Archerkattri/splatreg" rel="noopener noreferrer"&gt;splatreg&lt;/a&gt; registers 3D Gaussian splats: two 3DGS scans in, one SE(3)/Sim(3) transform out, optionally one fused splat. Its coarse-init stage seeds a Levenberg–Marquardt refine, and until recently the practical default for real scans was a classical FPFH+RANSAC seed. This post is about what happened when I swapped in &lt;a href="https://github.com/MIT-SPARK/BUFFER-X" rel="noopener noreferrer"&gt;BUFFER-X&lt;/a&gt; (ICCV 2025), a &lt;em&gt;zero-shot&lt;/em&gt; learned registration model — and, since it's probably the more useful part, the exact recipe for building its 2023-era CUDA extensions on a 2026 stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why zero-shot matters for a splat registrar
&lt;/h2&gt;

&lt;p&gt;Per-dataset-trained backbones like PSReg and DiffusionPCR top the 3DMatch leaderboard at 95%+ registration recall. But a splat registrar should not require training a per-scene or per-sensor model to align two captures someone made with a phone and a drone. So splatreg deliberately keeps a generalist seed: BUFFER-X is a single pretrained model that claims to register across sensors and scales with no per-dataset tuning. The question was whether the claim survives contact with the official benchmarks when wired in as a real seed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;I ran the complete official &lt;code&gt;gt.log&lt;/code&gt; pair sets — not a curated subset — with a pair counted as recalled at RRE &amp;lt; 15° and RTE &amp;lt; 0.3 m:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;3DMatch&lt;/strong&gt; (8/8 scenes, n=1619): BUFFER-X seed &lt;strong&gt;0.962&lt;/strong&gt; recall, median RRE 1.46°, vs &lt;strong&gt;0.630&lt;/strong&gt; / 2.12° for the classical FPFH seed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3DLoMatch&lt;/strong&gt; (the hard 10–30% overlap split, n=1781): &lt;strong&gt;0.777&lt;/strong&gt; / 2.77° vs &lt;strong&gt;0.122&lt;/strong&gt; / 103.4°.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That 3DLoMatch line is the story: 6.4× the recall, and the classical seed's median error of 103° means it isn't "less accurate" there — it's landing in random basins. BUFFER-X won every scene on both splits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The caveat that keeps these numbers honest:&lt;/strong&gt; both seeds were pushed through the &lt;em&gt;identical&lt;/em&gt; lighter &lt;code&gt;feature_align&lt;/code&gt; refine, so the comparison isolates the seed. These are not full-pipeline absolute numbers to lay next to leaderboard entries; they answer "which seed should splatreg trust on real scans," nothing more.&lt;/p&gt;

&lt;p&gt;Here is one real low-overlap pair (&lt;code&gt;7-scenes-redkitchen&lt;/code&gt; 35→46, ground-truth overlap 0.10) watched end to end — the classical seed slews the fragment into the wrong basin at 151.5° error, then BUFFER-X + refine locks on at 2.0°. Both transforms are actual library outputs; the animation interpolates between real estimates, nothing is hand-posed:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fboken54e76rysfu519fp.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fboken54e76rysfu519fp.gif" alt="Three-phase animation of a real low-overlap 3DMatch pair: unaligned, wrong classical FPFH+RANSAC basin at RRE 151.5 degrees, then BUFFER-X seed plus splatreg refine locking onto the target at RRE 2.0 degrees" width="700" height="700"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The build recipe (the part you actually came for)
&lt;/h2&gt;

&lt;p&gt;BUFFER-X ships native extensions written for an older stack. Getting them to build on CUDA 12.8 / RTX 5090 (sm_120) / torch 2.11 / numpy 2.4, without sudo, took a day of archaeology. The full recipe is in &lt;a href="https://github.com/Archerkattri/splatreg/blob/main/docs/BUFFERX_BUILD_MODERN_CUDA.md" rel="noopener noreferrer"&gt;&lt;code&gt;docs/BUFFERX_BUILD_MODERN_CUDA.md&lt;/code&gt;&lt;/a&gt;; these are the walls I hit:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;code&gt;pointnet2_ops&lt;/code&gt; hardcodes dead GPU architectures.&lt;/strong&gt; Its &lt;code&gt;setup.py&lt;/code&gt; sets &lt;code&gt;TORCH_CUDA_ARCH_LIST = "3.7+PTX;5.0;..."&lt;/code&gt;, and nvcc 12.8 flat-out rejects &lt;code&gt;compute_37&lt;/code&gt;. Patch that line to your real arch (&lt;code&gt;"12.0"&lt;/code&gt; for sm_120) and install with &lt;code&gt;pip install --no-build-isolation .&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The KPConv C++ wrappers use the numpy 1.x C-API.&lt;/strong&gt; numpy 2.x removed it. The port is mechanical once you know it: &lt;code&gt;NPY_IN_ARRAY&lt;/code&gt; → &lt;code&gt;NPY_ARRAY_IN_ARRAY&lt;/code&gt;, and cast the &lt;code&gt;PyObject*&lt;/code&gt; handles to &lt;code&gt;PyArrayObject*&lt;/code&gt; everywhere &lt;code&gt;PyArray_NDIM/DIM/DATA&lt;/code&gt; is called.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. You don't need &lt;code&gt;apt install libtbb-dev&lt;/code&gt;.&lt;/strong&gt; &lt;code&gt;pip install tbb tbb-devel&lt;/code&gt; drops &lt;code&gt;tbb/tbb.h&lt;/code&gt; under &lt;code&gt;&amp;lt;venv&amp;gt;/include&lt;/code&gt;; point &lt;code&gt;CPLUS_INCLUDE_PATH&lt;/code&gt; there (plus a &lt;code&gt;--depth 1&lt;/code&gt; clone of header-only Eigen) and the wrappers compile sudo-free.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Two CUDA deps don't deserve a build at all.&lt;/strong&gt; BUFFER-X only uses &lt;code&gt;knn_cuda.KNN&lt;/code&gt; with k=1 — that's &lt;code&gt;torch.cdist&lt;/code&gt; + &lt;code&gt;topk&lt;/code&gt;. And &lt;code&gt;torch_batch_svd&lt;/code&gt; is just &lt;code&gt;torch.linalg.svd&lt;/code&gt;, which batches natively now. Tiny pure-torch shim modules on the path replace both; they ship in &lt;code&gt;docs/bufferx_shims/&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. The silent killer: the pretrained checkpoints are full-model state dicts.&lt;/strong&gt; The keys are prefixed &lt;code&gt;Desc.&lt;/code&gt;/&lt;code&gt;Pose.&lt;/code&gt;. Load them into a &lt;em&gt;submodule&lt;/em&gt; with &lt;code&gt;strict=False&lt;/code&gt; and nothing matches — you get randomly initialized weights that produce garbage seeds with no error anywhere. If your zero-shot model performs like a random-pose generator, check this first.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;splatreg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;splatreg.api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;register&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;init&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bufferx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# zero-shot seed + LM refine
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the BUFFER-X weights or extensions are absent, &lt;code&gt;init="bufferx"&lt;/code&gt; logs a note and falls back to the classical robust seed — it never fails silently. Everything downstream (Sim(3) scale recovery, spherical-harmonic rotation via real-basis Wigner-D, pose covariance for pose graphs, merge + dedupe) is identical regardless of which seed you chose.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it doesn't do
&lt;/h2&gt;

&lt;p&gt;Zero-shot does not mean magic. Below roughly 40% retained overlap the rotation-disambiguating geometry is physically absent, and no seed fixes that — splatreg flags those cases as ambiguous rather than silently wrong-posing, and scale is unobservable under thin overlap no matter what. The 0.962/0.777 figures are seed-isolation numbers under one shared refine, not leaderboard entries; per-dataset-trained models still hold the absolute 3DMatch record and I say so in the README. The BUFFER-X path needs a real CUDA build (the recipe above) — CPU-only installs get the classical fallback. And splatreg itself registers &lt;em&gt;splats&lt;/em&gt;; if all you have are raw point clouds, BUFFER-X upstream serves you directly without any of my wrapping.&lt;/p&gt;

&lt;p&gt;Every number here has a reproduction path in &lt;a href="https://github.com/Archerkattri/splatreg/blob/main/RESULTS.md" rel="noopener noreferrer"&gt;RESULTS.md&lt;/a&gt;, and the figure/GIF generators live in &lt;code&gt;examples/&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>computervision</category>
      <category>cuda</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Your AI can stop hallucinating math: a real Lean kernel over MCP</title>
      <dc:creator>Krishi Attri</dc:creator>
      <pubDate>Thu, 02 Jul 2026 05:03:52 +0000</pubDate>
      <link>https://dev.to/kattri/your-ai-can-stop-hallucinating-math-a-real-lean-kernel-over-mcp-1eb4</link>
      <guid>https://dev.to/kattri/your-ai-can-stop-hallucinating-math-a-real-lean-kernel-over-mcp-1eb4</guid>
      <description>&lt;p&gt;I got tired of watching AI assistants confidently misquote theorems, so I built &lt;a href="https://github.com/Archerkattri/mathlas" rel="noopener noreferrer"&gt;mathlas&lt;/a&gt;: an MCP server that gives any agent a real Lean kernel, PSLQ, OEIS matching, and a 3.68M-document theorem index. No LLM inside, no API key, Apache-2.0.&lt;/p&gt;

&lt;p&gt;The premise is a strict division of labor: &lt;strong&gt;the AI is the brain, mathlas is the hands.&lt;/strong&gt; Every tool returns data — candidates, verdicts, checklists — and the agent does the judging. No tool inside mathlas ever calls an LLM, which means no tool inside mathlas can hallucinate.&lt;/p&gt;

&lt;h2&gt;
  
  
  The discipline: airtight or nothing
&lt;/h2&gt;

&lt;p&gt;Every verdict-producing tier follows one rule: return an independently-checkable fact, or an honest "nothing." Never a plausible guess.&lt;/p&gt;

&lt;p&gt;Here is what that looks like in practice — real in-process tool outputs, captured from the live server:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fexxz2ybuq4alrji3qusl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fexxz2ybuq4alrji3qusl.gif" alt="A real mathlas tool session: verify_formal returns VERIFIED_PROOF, then REFUTED with the kernel's verbatim error, then REJECTED for a sorry hole, all from the real Lean 4.31.0 kernel; identify_constant recovers pi**2/6 to 50 digits via PSLQ" width="800" height="761"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;verify_formal&lt;/code&gt; runs the &lt;strong&gt;actual Lean 4.31.0 kernel&lt;/strong&gt;. Hand it a proposition and your Lean 4 proof and you get one of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;VERIFIED_PROOF&lt;/code&gt; — the kernel typechecked the full declaration;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;REFUTED&lt;/code&gt; — with the kernel's error message verbatim, so the agent can repair and retry;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;REJECTED&lt;/code&gt; — for &lt;code&gt;sorry&lt;/code&gt;/&lt;code&gt;admit&lt;/code&gt; holes (Lean itself exits 0 on a sorried proof; mathlas scans the source &lt;em&gt;and&lt;/em&gt; the kernel's &lt;code&gt;sorryAx&lt;/code&gt; diagnostics, so you can't sneak a hole past it);&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;UNDETERMINED&lt;/code&gt; — when the toolchain is missing, an import can't resolve, or the 60 s cap hits. An honest shrug, never a fake verdict.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On the numeric side, &lt;code&gt;identify_constant&lt;/code&gt; and &lt;code&gt;verify_numeric&lt;/code&gt; use PSLQ plus an independent high-precision re-evaluation (50–51 digits). Type in &lt;code&gt;1.6449340668482264...&lt;/code&gt; and it hands back &lt;code&gt;pi**2/6&lt;/code&gt;, re-verified — or nothing at all. Measured false-positive rate across every tier (numeric, sequence, Ramanujan-style relations): &lt;strong&gt;zero&lt;/strong&gt;. Structureless inputs produce zero false hits, 8 out of 8 times we tried to bait it. Full tables with reproduction commands live in &lt;a href="https://github.com/Archerkattri/mathlas/blob/main/RESULTS.md" rel="noopener noreferrer"&gt;RESULTS.md&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Twelve tools, one pipeline
&lt;/h2&gt;

&lt;p&gt;The tools compose into a workflow the agent drives:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;search_existing_math → applicability_checklist / mapping_scaffold → (AI judges) → verify_numeric / verify_formal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The retrieval side is &lt;code&gt;search_existing_math&lt;/code&gt;, served from a 3,683,428-document dense + BM25 + RRF index (the text side is &lt;a href="https://huggingface.co/datasets/kattri15/mathlas-corpus" rel="noopener noreferrer"&gt;open on Hugging Face&lt;/a&gt;). &lt;code&gt;applicability_checklist&lt;/code&gt; is the tool I'm proudest of and the one nobody else ships: it decomposes a theorem's hypotheses into atomic preconditions the AI verifies one by one — the guardrail against applying Banach's fixed-point theorem to an incomplete space. Then there's &lt;code&gt;identify_sequence&lt;/code&gt; (exact OEIS term-match), &lt;code&gt;search_formal_math&lt;/code&gt; (proxies Loogle + LeanSearch for mathlib declarations, provenance-labeled), &lt;code&gt;conjecture_relation&lt;/code&gt; (Ramanujan Machine-style PSLQ over a rich basis), a sandboxed &lt;code&gt;funsearch&lt;/code&gt; harness, and &lt;code&gt;add_finding&lt;/code&gt;, which matters for the benchmark below.&lt;/p&gt;

&lt;h2&gt;
  
  
  The benchmark, with its caveat up front
&lt;/h2&gt;

&lt;p&gt;On TheoremSearch's own 110 human-written queries, mathlas with its self-augmenting web loop scores &lt;strong&gt;59.1% theorem-level Hit@20 (65/110)&lt;/strong&gt; against TheoremSearch's 45.0%. Sounds great. Here is the part you should read before quoting it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is a loop-vs-static comparison, not a corpus-vs-corpus one.&lt;/strong&gt; Corpus-only, mathlas's baseline on this benchmark is &lt;strong&gt;10.0%&lt;/strong&gt; — TheoremSearch withheld ~85% of their private 9.2M corpus (the non-redistributable arXiv-licensed papers), so 95 of the 110 target papers are unreachable for &lt;em&gt;any&lt;/em&gt; open system. What the 59.1% measures is the &lt;code&gt;add_finding&lt;/code&gt; loop: the agent web-finds each missing statement, embeds it with the same Qwen3-Embedding-8B, and fuses it into the live index at runtime. TheoremSearch's 45.0% is a static system answering from its full private corpus. The honest headline is "a validated writeback loop repairs an open system's coverage gap at runtime," not "my index beats theirs." The math domain is the right place for such a loop precisely because the write-back candidate can be deterministically checked (&lt;code&gt;verify_numeric&lt;/code&gt; / &lt;code&gt;verify_formal&lt;/code&gt;) before it's trusted.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flffiommoev4zmksqqpqc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Flffiommoev4zmksqqpqc.png" alt="Theorem Hit@20 on TheoremSearch's 110-query benchmark: mathlas + self-augmenting web loop 59.1, TheoremSearch 45.0, Google 37.8 (paper-level), Gemini 3 Pro 27.0, ChatGPT 5.2 19.8, mathlas corpus-only baseline 10.0" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Reproduce it with &lt;code&gt;benchmarks/webaug_110_bench.py&lt;/code&gt; in the repo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it in one line
&lt;/h2&gt;

&lt;p&gt;With &lt;a href="https://docs.astral.sh/uv/" rel="noopener noreferrer"&gt;uv&lt;/a&gt; installed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add mathlas &lt;span class="nt"&gt;--&lt;/span&gt; uvx mathlas-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it — Claude Code now sees twelve tools. Plain pip works too (&lt;code&gt;pip install mathlas-mcp&lt;/code&gt;), Cursor or any MCP client can point at the same stdio command, and if the official &lt;code&gt;mcp&lt;/code&gt; SDK isn't installed the server falls back to a dependency-free stdio JSON-RPC implementation, so it always runs. It's also on the official MCP registry as &lt;code&gt;io.github.Archerkattri/mathlas&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it doesn't do
&lt;/h2&gt;

&lt;p&gt;mathlas does not write proofs — the generator/verifier split is absolute, so the kernel checks &lt;em&gt;your&lt;/em&gt; proof and reports exactly why it fails, but the repair is on you (or your agent). Corpus-only retrieval will not beat TheoremSearch on their benchmark; the 10.0% baseline is a licensing-bounded floor and I report it as such. Two tools degrade without optional local data: &lt;code&gt;identify_sequence&lt;/code&gt; wants a local OEIS copy and &lt;code&gt;verify_formal&lt;/code&gt; wants a Lean toolchain — without them you get a clear "not available," never a fake answer. The full-quality index needs the Qwen3-Embedding-8B encoder, which is not laptop hardware; there are measured quantized and 0.6B tiers that trade 7–9 points of recall for running on 4 CPU threads, documented with their exact costs. And it's not a CAS — if you want to symbolically massage an expression you gave it, sympy is the right tool; mathlas finds, scopes, and verifies &lt;em&gt;existing&lt;/em&gt; math.&lt;/p&gt;

&lt;p&gt;Code, benchmarks, and every number's reproduction command: &lt;a href="https://github.com/Archerkattri/mathlas" rel="noopener noreferrer"&gt;github.com/Archerkattri/mathlas&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>math</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
