Note: This article documents work performed with AI assistance (Claude Sonnet 4.6 via Claude Code), including the original bug analysis, the pre-submission review that prompted the path change, and the PR that was ultimately submitted. All technical claims are verified against the ONNX source tree and the public PR.
The hook: a real bug that was never going to ship as an advisory
The bug was straightforward once I saw it. In `onnx/utils.py`, a helper function called `_tar_members_filter` uses a plain `str.startswith()` call to validate that a tar archive member lives inside the intended extraction directory:
```python
# onnx/utils.py (simplified)
abs_base = os.path.abspath(base)
abs_member = os.path.abspath(member_path)
if not abs_member.startswith(abs_base):  # <-- no os.sep guard
    raise RuntimeError("traversal detected")
```
The problem is that `startswith` is a string operation, not a path operation. Given a base directory of `/home/user/.onnx/models`, a crafted archive member resolving to `/home/user/.onnx/models_evil/pwned.txt` passes the check: the string `"models_evil"` begins with the string `"models"`. A separator guard, `startswith(abs_base + os.sep)`, closes the gap. Without it, files can be written outside the extraction directory on Python 3.10 and 3.11, the versions where the fallback filter is active.
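The arithmetic of the bypass fits in a few lines; a minimal standalone sketch (POSIX paths, values illustrative):

```python
import os

base = "/home/user/.onnx/models"
inside = "/home/user/.onnx/models/resnet/model.onnx"
sibling = "/home/user/.onnx/models_evil/pwned.txt"

# Plain string check: the sibling directory passes, because
# "models_evil" begins with "models" as a string.
print(inside.startswith(base))            # True
print(sibling.startswith(base))           # True  <-- the bypass

# Separator guard: only genuine children of the base directory pass.
print(inside.startswith(base + os.sep))   # True
print(sibling.startswith(base + os.sep))  # False <-- rejected
```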
I found this via static analysis. The fix was one line. I had a working proof-of-concept. My first instinct was to head straight to a bug bounty platform and file an advisory.
I paused instead — and that pause changed what happened next.
The reviewer that changed my mind
Before submitting anything, I ran a pre-submission review pass using an LLM agent configured for deep, adversarial analysis. I think of this as a second opinion before publication: give it the same evidence I have, tell it to find holes, and treat the output seriously.
The technical verdict came back positive. The root cause — startswith missing + os.sep — was confirmed correct. The fix was confirmed correct. The proof-of-concept logic was independently traced and verified.
Then the report pivoted:
> Two essentially identical reports already exist on the bounty platform (both self-closed by the reporters), and a third was marked Duplicate.
The reviewer had checked the platform's listing for this repository. Three prior reports — same function, same root cause, same string-comparison bug — filed in March 2026. Two were self-closed by the reporters, almost certainly after maintainer feedback. One was marked Duplicate.
The reviewer's framing was precise:
> Submitting a third (fourth) variant of the same finding carries serious duplicate-judgement and reputation risk that outweighs the (real but low-reach) underlying flaw.
It also identified weaknesses I had not fully interrogated: the vulnerable code path is only reachable through the backend test runner (not any production inference path), the affected Python versions (3.10 and 3.11) represent a shrinking window, and the impact framing in my draft was more aggressive than the actual constraint arithmetic supported.
This is the part of AI-assisted research that I find most useful and most underrated: not the initial discovery, but the disciplined second-pass that asks "is this the right thing to do with the finding?"
The answer was clearly no. Filing a fourth variant of a finding that three other reporters had already abandoned, on a platform that was already under scrutiny for low-quality reports, was not a good use of anyone's time.
The real lesson: finding a bug does not mean you should file an advisory about it. The path from "this is technically wrong" to "I should submit this through a bounty platform" has several gates, and the duplicate-risk gate is one of the most expensive to fail.
The pivot: direct PR instead of advisory
Once the bounty-submission path was off the table, the decision was simple. The fix itself was uncontroversially correct: rejecting a member only when it fails `startswith(abs_base + os.sep)` and is not `abs_base` itself is the canonical pattern. Python's own PEP 706 was written to address this class of tarball extraction vulnerability. The code needed the fix regardless of how many people had noticed the gap.
A direct pull request solves the actual problem without any of the duplicate risk. The PR gets reviewed on the merits of the code change. Credit is attached to the commit. The maintainer relationship is positive rather than transactional.
The principles I reached for in this decision:
- If the fix is uncontroversial, the PR is the highest-EV path. Advisory platforms add value when a finding needs coordinated disclosure, embargo, or cross-maintainer coordination. A one-line correctness fix to a fallback filter does not meet that bar.
- Duplicate risk on advisory platforms is a reputation cost, not just a process cost. Prior closed reports signal that maintainers have already processed this class of finding. Submitting again without materially new evidence is noise.
- The goal is fixed software, not credited findings. If the code gets patched, the outcome is correct whether or not a bounty number is attached.
With that settled, I moved to the contribution workflow.
Pre-fork due diligence with my own tool
This is the part I find satisfying to document: I used the tool I wrote about in the first article in this series to scan the ONNX repository before I forked it.
gh-pr-trust-scan is a Python CLI that checks a repository for automated trust-gate workflows, explicit AI-ban policies in contribution documentation, and rejection-signal labels. The question it answers is: "Will this project reject an AI-assisted PR on policy grounds before anyone reviews the code?"
Running it against onnx/onnx takes a few seconds:
```console
$ gh-pr-trust-scan onnx/onnx
Scanning onnx/onnx ...

Repo: onnx/onnx
Verdict: SAFE

Findings:
  [LOW ] No explicit AI ban label found

Stats:
  Last commit: 1 day ago
  Open PRs: 34
  Closed-no-merge PRs (last 30): 7
```
Verdict: SAFE. One LOW finding — absence of any rejection-signal label, which is informational rather than a warning. No automated trust-gate workflows, no AI-ban policy language in CONTRIBUTING.md or the PR template, no labels associated with automated rejection.
This is exactly the confirmation I needed before investing time in the implementation. ONNX is a well-maintained project with active CI, a clear DCO sign-off requirement, and no explicit policy against AI-assisted contributions. The scan took less time than reading CONTRIBUTING.md manually.
This is what "dogfooding" looks like in practice. I built the tool to avoid wasted contribution effort, and using it before my own fork means I'm running the same workflow I recommend to others. The SAFE verdict also gave me a calibration data point: the tool correctly identified ONNX's DCO sign-off requirement without misclassifying it as a rejection signal.
The PR
Branch: `fix/tar-traversal-separator-guard`. I named it around the mechanical fix rather than the vulnerability class — advisory-flavored branch names tend to set an adversarial frame before anyone reads the code.
The change itself is minimal:
```python
# onnx/utils.py — before
if not abs_member.startswith(abs_base):

# onnx/utils.py — after
if not abs_member.startswith(abs_base + os.sep) and abs_member != abs_base:
```
The `and abs_member != abs_base` clause handles the edge case where the member path resolves to exactly the base directory itself, which should be allowed.

I also moved `abs_base = os.path.abspath(base)` outside the loop. The original code recomputed the same value on every iteration — a minor performance fix that also makes the intent clearer.
Three regression tests cover the cases that matter:
```python
# onnx/test/test_tar_members_filter.py (representative cases)
def test_normal_member_allowed(self):
    # tar member inside base → passes filter
    ...

def test_sibling_prefix_rejected(self):
    # "../models_evil/pwned.txt" style bypass → raises RuntimeError
    ...

def test_exact_base_dir_allowed(self):
    # member resolving exactly to abs_base → allowed
    ...
```
The PR description includes an explicit AI disclosure ("researched and drafted with AI assistance via Claude Code"), zero use of advisory-escalating language (no "CVE," "exploit," "RCE," or "vulnerability" in the PR title or summary), and a DCO sign-off on every commit as ONNX's CONTRIBUTING.md requires. The fix is framed as what it is: a correctness improvement to the path-containment check in a fallback branch.
The PR is at: https://github.com/onnx/onnx/pull/7948
Lessons
The full loop in this session looked like this:
```text
static analysis → bug found
        ↓
proof-of-concept → bug confirmed
        ↓
DEEP_REVIEW → duplicate risk identified, advisory path closed
        ↓
pivot decision → direct PR
        ↓
gh-pr-trust-scan onnx/onnx → SAFE verdict, fork with confidence
        ↓
implementation + tests → PR submitted
```
The whole loop, from initial static analysis to PR submission, happened inside a single working session. The only reason it closed cleanly was the pre-submission review step: without it, I would have filed an advisory that duplicated three prior reports and likely produced no useful outcome.
A few things I am taking forward:
Finding a bug and deciding what to do about it are different skills. Static analysis and proof-of-concept work are pattern recognition problems. Deciding whether to file an advisory, open a PR, or do nothing requires understanding the platform dynamics, the prior report history, and the realistic impact of the finding. These are judgment calls that benefit from a structured second-opinion process.
Pre-submission review is cost-effective at any scope. The review pass took less than the time I would have spent polishing the advisory before submission. Catching duplicate risk at the review stage costs essentially nothing; catching it after submission costs reputation.
Using a tool on your own work produces better feedback than testing it on examples. Running gh-pr-trust-scan against onnx/onnx gave me more concrete signal about edge cases (how the tool handles DCO requirements without flagging them as AI-hostile) than any synthetic test scenario would have. Running it on a real, active OSS project confirmed that the severity calibration is sensible.
What comes next: monitor the PR for maintainer feedback. If the review surfaces a preference for the `pathlib.Path.relative_to()` formulation over `startswith` + `os.sep`, that goes into the code and serves as a useful style data point for future contributions.
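Should maintainers prefer that formulation, it would look roughly like this (a sketch, not the submitted code; `is_within` is a hypothetical name):

```python
from pathlib import Path

def is_within(base: str, member_path: str) -> bool:
    """Path-aware containment check (hypothetical helper)."""
    abs_base = Path(base).resolve()
    abs_member = Path(member_path).resolve()
    try:
        # relative_to() raises ValueError unless abs_member is abs_base
        # itself or a descendant: no string-prefix pitfall to guard.
        abs_member.relative_to(abs_base)
        return True
    except ValueError:
        return False

print(is_within("/home/user/.onnx/models",
                "/home/user/.onnx/models/resnet/model.onnx"))   # True
print(is_within("/home/user/.onnx/models",
                "/home/user/.onnx/models_evil/pwned.txt"))      # False
```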
This article was researched and drafted with AI assistance (Claude Sonnet 4.6 via Claude Code). Tool behavior described matches the gh-pr-trust-scan codebase as of May 2026 (commit d9b365a). The ONNX PR linked above is real and submitted under the same author identity (taiman724 on GitHub, with DCO sign-off).