DEV Community: Charlie Tonneslan

A pagination bug that returned zero rows, and the saturating add that fixes it

Charlie Tonneslan — Tue, 19 May 2026 09:53:52 +0000

Somebody filed a bug on cosmos-sdk a few months back (#25006) that's the kind of thing I love finding. It's reproducible against a public RPC endpoint, the symptom is unambiguous, and the cause is two lines of code that any of us could have written.

The repro is one URL:

https://cosmos-rest.publicnode.com/cosmos/slashing/v1beta1/signing_infos?pagination.offset=12&pagination.limit=0xFFFFFFFFFFFFFFFF

That returns zero results. Change the limit to anything small, like 0x12, and you get rows back. The offset is the same in both cases. The data is the same. The only thing that changed is asking for "give me a lot of rows starting at row 12" instead of "give me a few rows starting at row 12."

If you've been writing software long enough you can already smell what this is.

The code

Paginate lives in types/query/pagination.go. It's the function that backs every paginated REST endpoint in the SDK, including signing_infos. The interesting part of the body:

end := pageRequest.Offset + pageRequest.Limit

for ; iterator.Valid(); iterator.Next() {
    count++

    if count <= pageRequest.Offset {
        continue
    }
    if count <= end {
        err := onResult(iterator.Key(), iterator.Value())
        ...
    } else if count == end+1 {
        nextKey = iterator.Key()
        ...
    }
    ...
}

Offset and Limit are both uint64 fields on the gRPC request type. They come from the user. So when somebody asks for offset=12 and limit=0xFFFFFFFFFFFFFFFF, the computation on line 1 is

end = 12 + 0xFFFFFFFFFFFFFFFF
    = 0x1_0000_0000_0000_000B  (in real arithmetic)
    = 0x0000_0000_0000_000B    (after the high bit wraps)
    = 11

So end is now 11. The loop walks the iterator, increments count each step, and falls into one of three branches:

count <= 12: skip (we're still inside the offset)
count <= 11: emit (we're inside the page), but this can never be true once count > 12, and only counts above 12 get past the first branch
count == 12 (that is, end+1): would set nextKey once, but count == 12 is still swallowed by the skip branch above, so this never fires either

So you skip the first 12 rows, then on row 13 both inner conditions are false, and the loop just spins through the rest of the iterator doing nothing. The response is an empty array of rows and an empty next-key, and the user gets back zero results when they asked for everything past row 12.
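To watch the wrap happen, here is a stripped-down sketch of the loop. It keeps only the skip and emit branches from the excerpt above, and the 100-row store and the emitted helper are made up for illustration; this is not the SDK function.

```go
package main

import "fmt"

// emitted simulates the skip and emit branches over a store of `total` rows
// and returns how many rows would reach onResult. It mirrors the excerpt
// above, not the full SDK function.
func emitted(offset, limit, total uint64) uint64 {
	end := offset + limit // wraps silently when the sum exceeds MaxUint64

	var count, out uint64
	for i := uint64(0); i < total; i++ {
		count++
		if count <= offset {
			continue // still inside the offset
		}
		if count <= end {
			out++ // inside the page
		}
	}
	return out
}

func main() {
	fmt.Println(emitted(12, 0x12, 100))               // 18
	fmt.Println(emitted(12, 0xFFFFFFFFFFFFFFFF, 100)) // 0
}
```

Same offset, same data, and the larger limit is the one that comes back empty.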

The bug is not that the user did something silly. The bug is that integer overflow turned a sensible "give me everything past the offset" query into something the loop interprets as "give me nothing." There's no error, no warning, no rate limit hit. Just an empty list.

The fix is small

Saturating addition. If the sum would overflow, clamp to the max value the type can hold:

end := pageRequest.Offset + pageRequest.Limit
if end < pageRequest.Offset {
    end = math.MaxUint64
}

That's the entire fix. a + b < a is the standard Go test for unsigned overflow. If it's true, you've wrapped, so reset end to the largest possible value. The loop now treats "huge limit" the same way it would treat "asked for everything," which is exactly what the caller said they wanted.

The else if count == end+1 branch has a small adjacent worry. If end is now MaxUint64, then end+1 wraps to 0. But count starts at 0 and increments before any comparison, so count == 0 can never fire. The branch silently never executes, which is fine — it just means we don't set nextKey, and the response has no next-page pointer. For a query that's asking for the entire tail of the dataset, no next page is the correct answer.

So one new branch, one stays a no-op, and we're done. The whole patch is six new lines plus a math import.

The regression test

The existing test for Paginate uses a bank-store fixture with 235 entries. Adding the overflow case to it was the most useful place to put a regression test, because anyone reading the test file then sees the limit-0xFFFF... case sitting next to "verify paginate with offset and limit" and "verify paginate with offset greater than total results." It reads like a documented edge case rather than a one-off.

s.T().Log("verify offset+limit overflow returns the page instead of nothing")
pageReq = &query.PageRequest{Offset: 12, Limit: 0xFFFFFFFFFFFFFFFF, CountTotal: false}
request = types.NewQueryAllBalancesRequest(addr1, pageReq, false)
res, err = queryClient.AllBalances(gocontext.Background(), request)
s.Require().NoError(err)
s.Require().Equal(res.Balances.Len(), numBalances-12)

numBalances - 12 = 223. Before the fix this test fails with res.Balances.Len() == 0. After the fix it passes. Cheap to write, will catch any regression to this code path forever.

What this generalizes to

The reason I'm writing about this is that I've seen the same shape of bug enough times that it's worth pulling out as a pattern.

Whenever you have

a counter or position that you advance one step at a time, and
a stopping point computed by adding two user-supplied numbers,

you have to think about overflow. The naive a + b works fine for the obvious "user picks a sane number" case. It also works fine for the obvious "user picks a number bigger than what's stored" case, because most stores have far fewer than MaxUint64 rows. The case it doesn't handle is the one where the user picks a number large enough that a + b wraps inside the integer type.

The Cosmos endpoint had no incentive to defend against this because there's no public attacker payoff. The worst thing a bad actor can do here is "get zero results when they should have gotten lots." But if you're a third-party indexer that's paginating across a long tail and your client library passes through a huge limit hoping to grab everything at once, you might quietly miss thousands of rows on certain endpoints with no error. That's the kind of data-quality bug you find six months later via a discrepancy report, and by then you have to rebuild your indexes.

Three places I'd look for similar bugs:

Any RPC handler in any chain that takes offset + limit. Cosmos isn't the only one with this pattern, and the same fix applies wherever the math is done in uint64.
Anywhere a Go service computes start + length for byte-range reads. HTTP Range: headers do this constantly. Saturating arithmetic is the right answer there too, although in practice most HTTP servers reject silly ranges before they hit your code.
Any place that pre-computes the upper bound of a for i = 0; i < end; i++ loop with end := base + n. If base and n come from anywhere a user can touch, you need to know what overflow does to the comparison.

The Go standard library has math.MaxUint64, and since Go 1.12 there's math/bits.Add64 if you want to detect the overflow explicitly. Both are fine. The check if end < pageRequest.Offset is the most common in the SDK style and was what I went with.

What landed

The PR is #26430. Six lines of code, one regression test, sat in the queue for a week, picked up by the cosmos-sdk team and merged into the main paginate path. The fix backports cleanly to v0.50 and v0.47 too, but those are separate PRs.

The thing I want anyone reading this to take away is the pattern, not the specific bug. Every gRPC paginate handler I've looked at since has the same shape, and the same one-line guard works on all of them. Worth pasting into your own code review checklist.

What it took to put six cities' affordable housing data on one map

Charlie Tonneslan — Sun, 17 May 2026 17:16:35 +0000

I had a screen open with NYC's HPD pipeline dataset on the left and San Francisco's MOHCD dataset on the right, and I was trying to answer what should have been a simple question. Who's building more housing for low-income renters per capita right now, New York or SF.

The columns didn't match. NYC's records have a "borough" and an "income tier" with five buckets. SF's records have a "neighborhood" and an "AMI bracket" with three. NYC tracks construction type as "preservation" vs "new construction." SF calls it "rehab" vs "ground-up." Both have a unit count, but NYC bundles rental and homeownership into one column and SF splits them. The two cities are notionally measuring the same thing. The column-by-column overlap is maybe forty percent.

That afternoon turned into the project. I have a working version now: six cities (NYC, SF, LA, DC, Chicago, Philadelphia), about 6,500 affordable housing projects on one map, shared filters across cities, a real PostGIS-backed gap analysis, and the answer to my original question, which I'll get to. The repo's at github.com/c-tonneslan/groundwork. This is what it actually took.

The honest part first. There is no canonical "affordable housing" schema. Every city's housing department made up their own, on their own timeline, for their own internal reasons. NYC's HPD has been collecting unit-level data since 1987 and the schema reflects three decades of policy changes. SF's MOHCD has done the same but with different priorities. LA's HCID rolls things up differently again. DC publishes a tidy table that throws away half the detail. Chicago publishes a list of projects with no completion dates at all. (I'll come back to that one.)

So normalization is the entire project, basically. You pick a target schema, you write a loader per city, you accept that some columns are going to be null for some cities. My target schema lives in a projects table with the columns you'd expect: name, address, lat, lng, units, unit_mix, income_tier, construction_type, start_date, completion_date, funding_source, city_id, external_id. The loaders are one Node script per city in scripts/load-*.mjs. Each one maps that city's API onto the shared shape, fills the columns it can, leaves the rest null, and upserts on (city_id, external_id) so re-running it doesn't duplicate.

The mapping work itself is mostly boring. This city's borough becomes our area_id. This city's tot_units becomes our units. The interesting stuff is where the mappings don't exist. NYC tracks income tier in five bins, SF in three, LA in something else again. There's no faithful translation. So I picked the loosest common denominator (extremely low, very low, low, moderate, middle, other) and forced each city's bins into the nearest match, with an income_tier_original column that preserves the source's exact label so you can audit. The choropleth on the map uses the normalized column. The detail page shows both.
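The shape of that normalization, sketched in Go for illustration (the real loaders are Node scripts). The source labels in sfTiers are invented, and so are the type and function names; only the six target buckets and the keep-the-original-label idea come from the text above.

```go
package main

import (
	"fmt"
	"strings"
)

// Tier is one bucket of the shared income_tier column.
type Tier string

const (
	ExtremelyLow Tier = "extremely_low"
	VeryLow      Tier = "very_low"
	Low          Tier = "low"
	Moderate     Tier = "moderate"
	Middle       Tier = "middle"
	Other        Tier = "other"
)

// sfTiers is a hypothetical per-city lookup table. The labels are made up
// for this sketch and are not the city's real bracket names.
var sfTiers = map[string]Tier{
	"<=50% ami":   VeryLow,
	"51-80% ami":  Low,
	"81-120% ami": Moderate,
}

// Normalized carries both the shared bucket and the untouched source label,
// so the forced mapping can be audited later.
type Normalized struct {
	Tier     Tier
	Original string
}

// normalize looks the label up case-insensitively and falls back to Other.
func normalize(table map[string]Tier, label string) Normalized {
	tier, ok := table[strings.ToLower(strings.TrimSpace(label))]
	if !ok {
		tier = Other
	}
	return Normalized{Tier: tier, Original: label}
}

func main() {
	n := normalize(sfTiers, "51-80% AMI")
	fmt.Println(n.Tier, "|", n.Original)              // low | 51-80% AMI
	fmt.Println(normalize(sfTiers, "Workforce").Tier) // other
}
```

Unknown labels land in other rather than failing the load, and the original string survives either way.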

Two things from that surprised me. The bigger one was that admitting what's missing matters more than getting everything right. Every page on the live site has a data-quality footnote saying when this city's dataset was last updated, what's missing, and what assumptions the normalization made. A reader who actually cares about housing policy in DC versus LA will trust a tool that admits it forced three income bins into five. The reader who doesn't care isn't reading footnotes anyway.

The smaller surprise, which I almost dropped to keep the data tidy, was that the city with the worst data is sometimes the most useful one to include. Chicago's affordable rental inventory doesn't ship completion dates. None of the production-over-time charts work for it. Including Chicago anyway, and being upfront about the limitation, is more useful than dropping it. A reader in Chicago can still use the map and the per-project detail. A reader doing a national comparison gets to see how big the gap is between cities that publish good data and cities that don't.

Most of groundwork is plumbing. But there's one query that does the thing I built the project to do, which is to surface where the supply-demand mismatch is worst. For every census tract in a city, count the rent-burdened households (renters paying more than 30% of income on housing, from ACS 5-year), count the affordable units within 1 km of the tract centroid, order by the ratio. Worst-served tracts at the top.

In PostGIS this is one query:

SELECT
  t.tract_id,
  t.name,
  t.rent_burdened_households,
  -- affordable units (not project rows) within 1 km of the tract centroid
  COALESCE(SUM(p.units) FILTER (
    WHERE ST_DWithin(t.centroid::geography, p.geom::geography, 1000)
  ), 0) AS nearby_units,
  -- NULLIF turns "no units nearby" into NULL instead of a divide-by-zero
  t.rent_burdened_households::float / NULLIF(
    SUM(p.units) FILTER (
      WHERE ST_DWithin(t.centroid::geography, p.geom::geography, 1000)
    ), 0
  ) AS burden_per_unit
FROM civic.tracts t
LEFT JOIN civic.projects p ON p.city_id = t.city_id
WHERE t.city_id = $1
GROUP BY t.tract_id, t.name, t.rent_burdened_households, t.centroid
ORDER BY burden_per_unit DESC NULLS LAST
LIMIT 25;

ST_DWithin with a geography cast does the meters-native radius check. The FILTER clause applies the spatial constraint inside the aggregate itself, so the same join feeds both output columns without shuttling rows out of Postgres to filter in Node. The whole thing runs in about forty milliseconds on six cities' worth of data.

What I'd want a developer who's never used PostGIS to take from this is that the spatial filter has to happen at the database, not in your application code. The temptation is always to pull all the projects, pull all the tracts, and do the within-radius check in a for-loop in your service layer. That works for two cities. It doesn't work for six. It really doesn't work the moment you put a 1 km radius slider on the page and the user starts dragging it.

The other thing I had to figure out, which civic-data tutorials rarely touch, is that you can't compare across cities until you've normalized to population. The first version of the map ranked tracts by raw rent-burdened household count. NYC's outer boroughs dominated the top. So did LA County. Of course they did, they're huge. So I added a population column on tract (ACS 5-year totals), a per-10k field on the API responses, and a toggle on the map between raw and per-capita. Per-capita re-ranks everything. Larger wealthier neighborhoods drop off the top. Dense smaller neighborhoods rise.
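The per-10k arithmetic is small enough to show in full. This is a sketch in Go with invented tract numbers; the perTenK name and the handling of a zero-population tract are my assumptions, not the project's API.

```go
package main

import "fmt"

// perTenK scales a raw count to a rate per 10,000 residents. The second
// return value is false when the tract has no recorded population, so the
// caller can show "n/a" instead of dividing by zero.
func perTenK(count, population int) (float64, bool) {
	if population <= 0 {
		return 0, false
	}
	return float64(count) * 10000 / float64(population), true
}

func main() {
	// Invented numbers for two tracts, for illustration only.
	big, _ := perTenK(600, 40000)  // large tract, more burdened households
	small, _ := perTenK(150, 3000) // small tract, fewer in raw terms
	fmt.Println(big, small)        // 150 500
}
```

The large tract wins on raw count, 600 to 150, and loses on rate, 150 to 500 per 10k, which is the re-ranking the toggle exposes.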

The thing nobody mentions, that I had to figure out the hard way, is that per-capita on residential population has a problem of its own. Some places (the Loop in Chicago, downtown DC, midtown Manhattan) have small residential populations but huge daytime populations of workers, tourists, hospital patients. A per-capita-by-residents metric makes them look fine. They aren't fine. The Loop has almost no affordable housing because almost nobody lives there full time. Per-capita on residential population is correct for who-lives-there questions and wrong for who-needs-it questions. I lean on the residential version and note the caveat on the methodology page, but the right answer is to use both.

What about the question I started with, NYC versus SF per capita? I'll let people who want to load the data look for themselves. Two things I noticed, though. The first is that per-capita is rarely the same answer as raw. The second is that the gap between cities that publish complete data and cities that don't is bigger than the gap between cities themselves. NYC looks bigger than SF in raw numbers, of course it does. But Chicago's missing dates are a bigger missing piece than any of the headline city-vs-city numbers ever show.

The reason I built this isn't that I want everyone to use my specific tool. It's that comparing across cities should be possible from any laptop and most of the time it isn't, and that's a worse problem than the tool is. The work of normalizing is unglamorous and it's the whole project. The PostGIS query is one query. The data normalization is the rest of the year. If you're a junior councillor's staffer the day before a hearing trying to spot-check a number your boss is about to quote, the tool you wanted was someone else's normalization work. That's what civic data is. Most of it is making other people's work possible.

Code's at github.com/c-tonneslan/groundwork. The Philadelphia-only sibling project (same PostGIS schema, deeper on a single city: council district briefs, displacement signals from L&I demolition permits, email alerts on new projects within a saved radius) lives at civic-philly.vercel.app.

Building a linter for the bugs AI agents actually make

Charlie Tonneslan — Sun, 17 May 2026 17:12:03 +0000

I lost an hour last Tuesday to a function that didn't exist. The agent had written what looked like fine Postgres code, db.QueryRowContext with a context and a query string and a couple of args. It looked like it would compile. It didn't. Took me forty minutes to work out it had used db.QueryRow (no context, different signature) inside something it called QueryRowContext, and was handing five things to a function that wanted three. The build error was clear enough, in hindsight. What wasted my hour was that it looked like a hundred other build errors I'd seen, and I kept reading it as a typo I could fix in two seconds.

There's a number that keeps making the rounds, that a majority of developers now say they spend more time debugging code their AI assistant wrote than debugging code they wrote themselves. I'd argue with the methodology if I didn't feel it in my own week.

Sitting with the Tuesday bug, I started cataloging. It had a shape, and so did most of the bugs I'd been hitting. Hallucinated method names. Right name, wrong arity. Right arity, wrong types. A constant that got renamed three versions back and now exists only in the agent's training data. They cluster. They're not random. So I went looking for a Go linter that catches them, and when I couldn't find one I wrote it.

It's a Go CLI called vouch. The first thing it does, the thing that's working as of this week, is read the output of go build and tell you whether your failure looks like an AI hallucination or a normal-person bug. That distinction matters more than I'd expected, and the way I built the detector is dumber than I'd expected, so this is about both.

Go already has good linters. staticcheck is sharp, golangci-lint bundles two dozen analyzers and is the standard at every company I know. They catch real bugs. What they don't catch is "your AI assistant called db.WithTimeout() and that method doesn't exist." That's a build failure, not a lint failure, and by the time the linter runs the compiler has already given up. For a human writing Go, build failures are usually typos. You fix them in five seconds and barely register them as bugs. For AI-written Go, build failures are the most common bug class by a wide margin, and they cluster into the four shapes above. You can see all of them in a single go build output, sitting next to each other, indistinguishable from a missing import. What vouch does is pull them out and label them.

I wanted the first detector to be useful without being clever, so it isn't. It's a screen scraper. It shells out to go build ./..., captures stderr, parses each line against a small set of regular expressions, and bins the error into one of four categories: undefined-symbol, undefined-method, arity-mismatch, type-mismatch. No language server, no AST walking, no model in the loop. go build and regex. The regex was ninety minutes of work. Most of the day went into the test fixtures, which is the same proportion every tool I've built has settled into.
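To give a sense of how little machinery that takes, here is a sketch of a four-bucket classifier. The patterns are my guesses based on the diagnostics recent Go compilers print, not vouch's actual regexes, and the sample lines in main are invented.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule pairs a category with a pattern matched against the message part of
// a compiler diagnostic. These patterns are assumptions, not vouch's own.
type rule struct {
	category string
	re       *regexp.Regexp
}

var rules = []rule{
	{"undefined-method", regexp.MustCompile(`undefined \(type .+ has no field or method `)},
	{"undefined-symbol", regexp.MustCompile(`^undefined: `)},
	{"arity-mismatch", regexp.MustCompile(`^(not enough|too many) arguments in call to `)},
	{"type-mismatch", regexp.MustCompile(`^cannot use .+ as .+ value in `)},
}

// diag splits a line of the form "path/file.go:LINE:COL: message".
var diag = regexp.MustCompile(`^(.+\.go):(\d+):(\d+): (.*)$`)

// classify returns "file:line" and a category for lines it recognizes.
// Package headers and the indented have/want lines fall through as !ok.
func classify(line string) (loc, category string, ok bool) {
	m := diag.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	for _, r := range rules {
		if r.re.MatchString(m[4]) {
			return m[1] + ":" + m[2], r.category, true
		}
	}
	return "", "", false
}

func main() {
	// Invented sample of go build stderr.
	out := []string{
		"# example.com/app/internal/store",
		"internal/store/user.go:42:27: not enough arguments in call to context.WithTimeout",
		"\thave (time.Duration)",
		"\twant (context.Context, time.Duration)",
		"internal/store/user.go:57:5: undefined: maxRetries",
	}
	for _, line := range out {
		if loc, cat, ok := classify(line); ok {
			fmt.Printf("%s: %s\n", loc, cat)
		}
	}
}
```

Message wording shifts between Go releases, which is part of why the fixtures end up taking longer than the regexes.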

The piece that actually matters is the --diff flag. It narrows the report to lines you changed:

$ vouch check . --diff main
internal/store/user.go:42: arity-mismatch
  context.WithTimeout(5 * time.Second) called with 1 arg, expected 2
  func WithTimeout(parent Context, timeout Duration) (Context, CancelFunc)

That's what turns vouch from "tell me everything wrong with this codebase" into "tell me what my agent just broke." Without the diff scope, it's noise. With it, it's a five-second pre-PR check.
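One way the diff scoping could work, sketched under assumptions: take the text of git diff -U0 against the base branch, read the hunk headers to learn which line ranges are new, and keep only diagnostics that land inside them. This is not vouch's implementation; the names and the sample diff are invented.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// span is an inclusive range of line numbers in the new version of a file.
type span struct{ first, last int }

// hunk matches "@@ -old[,n] +new[,n] @@" and captures the new-side numbers.
var hunk = regexp.MustCompile(`^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@`)

// changedLines reads unified-diff text (assumed to come from git diff -U0,
// so hunks carry no context lines) and returns the changed ranges per file.
func changedLines(diff string) map[string][]span {
	out := map[string][]span{}
	file := ""
	sc := bufio.NewScanner(strings.NewReader(diff))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "+++ ") {
			file = strings.TrimPrefix(strings.TrimPrefix(line, "+++ "), "b/")
			continue
		}
		m := hunk.FindStringSubmatch(line)
		if m == nil || file == "" {
			continue
		}
		start, _ := strconv.Atoi(m[1])
		count := 1 // the count is omitted in the header when it is 1
		if m[2] != "" {
			count, _ = strconv.Atoi(m[2])
		}
		if count == 0 {
			continue // pure deletion: nothing on the new side
		}
		out[file] = append(out[file], span{start, start + count - 1})
	}
	return out
}

// touched reports whether file:line falls inside a changed range.
func touched(spans map[string][]span, file string, line int) bool {
	for _, s := range spans[file] {
		if line >= s.first && line <= s.last {
			return true
		}
	}
	return false
}

// sampleDiff is an invented two-hunk diff for the demo below.
const sampleDiff = `--- a/internal/store/user.go
+++ b/internal/store/user.go
@@ -41,0 +42,2 @@ func load() {
+	ctx, cancel := context.WithTimeout(5 * time.Second)
+	defer cancel()
@@ -60 +62 @@
-	return nil
+	return err
`

func main() {
	spans := changedLines(sampleDiff)
	fmt.Println(touched(spans, "internal/store/user.go", 42)) // true
	fmt.Println(touched(spans, "internal/store/user.go", 50)) // false
}
```

A diagnostic at line 42 survives the filter; one at line 50, in code the branch never touched, is dropped.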

I want to head off the obvious counterargument. A lot of the AI-code-review tooling I've seen does the obvious thing, throws the diff at a model and asks the model what's wrong. Sometimes that works. It also costs money per invocation, takes a few seconds per file, and gives you a different answer every time you run it. Deterministic checks are free, instant, and reproducible. If you've called context.WithTimeout(5 * time.Second) with one argument I don't need a frontier model to tell me you forgot a parent context. I need go build and a regex. The plan from here is to layer gopls on top for the cases the compiler alone can't catch (wrong arg order on signatures that happen to type-check, deprecated APIs), and only reach for a model at the very end, narrowed to a region the cheap checks already flagged. That's the inverse of how most of this tooling is shaped today, and the inverse is right.

The bug that actually pushed me from thinking about building vouch to building it wasn't even in the four-bucket bin. I was helping an agent put together a small Go service, and it produced this:

db, err := sql.Open("postgres", dsn)
if err != nil {
    return err
}
defer db.Close()

rows, err := db.QueryContext(ctx, "SELECT id FROM users")
if err != nil {
    return err
}
defer rows.Close()

Compiles. Runs. Wrong in a way I didn't catch for thirty minutes, because there's no db.Ping() after sql.Open. sql.Open only validates its arguments and may not open a connection at all, so the first failure mode isn't a connection error at startup, it's an error coming back from QueryContext, far from the line that caused it. Classic. Mirrors a thousand Stack Overflow examples but skips the half they leave implicit. vouch doesn't catch this one yet. It's a pattern-incompleteness bug and it lives further up the difficulty curve. I started with the four-bucket detector because building real coverage on the easy class first is how you find out if the rest is worth chasing.

Which gets to the part nobody else is doing. If I tell you my AI linter catches AI bugs, you should ask me how often. Precision, recall, false positive rate against an off-the-shelf staticcheck pass on the same code. Every "I built an AI code reviewer" project I've come across skips that question. They show one screenshot of one bug. They don't tell you how often the tool cries wolf. The next thing I'm building isn't the next detector, it's a real eval harness, fifty-plus real-world AI-authored PRs pulled from public GitHub history (you can find them by searching for the Co-Authored-By: Claude trailer, Cursor's metadata, Devin's PR titles, or Sweep's signature), labeled for whether they introduced bugs, and a detection-rate number to put in the README.

I had a moment last week where I almost started the api-shape detector before I'd ever run vouch against a real codebase. Would have been a mistake. The thing that earns a tool the right to keep growing is showing that its first claim is actually true. Code's at github.com/c-tonneslan/vouch.

What I learned opening my first sixty open source pull requests

Charlie Tonneslan — Sun, 17 May 2026 17:11:27 +0000

Twelve days ago my GitHub account had zero contributions on it. Not zero this year, zero ever, because I'd deleted my old account in a fit of housekeeping and started fresh from a new email. Today there are about sixty pull requests open or merged across twenty-something repos in Go, Rust, Python, and TypeScript. About half are merged. A few got rejected (one for reasons I'll get to). The rest are in review.

The version of this I want to write is the version I wish I'd read at the start. Not "fork the repo, read CONTRIBUTING.md, be respectful," you already know that. The version where I tell you about the things I got wrong, often, in the order they happened.

The first thing I got wrong was assuming the "good first issue" tab was a green field. It isn't. Day one, I opened the Tailscale repo, filtered "good first issue" to oldest first, picked the first one with a clean repro, and had a fix in an hour. Went to push the PR, ran gh pr list, and found there was already an open PR for it dated six weeks earlier, sitting in review. This kept happening. A third of the issues I picked off that tab had a draft PR somewhere. The ones that didn't have a PR often had a maintainer comment buried in the thread saying "actually we don't want to fix this" or "blocked on another redesign, don't bother."

What saved me was a single command I now run before touching anything:

gh pr list --repo OWNER/REPO --state all --search "<keyword from issue>"

If there's already a PR you'll see it. If there's a closed PR with maintainer comments you'll learn the project's actual opinion on the issue, which is usually more useful than reading the issue itself. I skipped this once on a chi PR and ate a "duplicate of #1085, which was submitted first and merged today" comment within an hour. Annoying. Avoidable.

Read CONTRIBUTING.md before writing a line

The first PR I rushed without reading CONTRIBUTING.md was to a project with a mandatory PR template I hadn't filled out. Their github-actions bot auto-closes any PR that leaves required fields blank. It auto-closed mine in five minutes. I wasn't even on the page when it happened.

Same week, I got a PR rejected from rs/zerolog for a more subtle reason. I'd added a method that accepted a Go context.Context and made it available on subsequent log events from that logger chain. It looked like the obvious convenience; every other modern Go logging library has something similar. The owner replied within a few hours: this is semantically incorrect, the method name implies the receiver context is being decorated, your change actually modifies the receiver, that can break code that depends on the immutability of the chain. Closed. The code was correct. The test passed. The change lived inside a single function. None of that mattered, because I'd altered the contract of a public method in a stable library that other people depend on.

Two lessons from that week. Read CONTRIBUTING.md and the PR template before writing a line. Templates exist to save the maintainer's time, fill them out completely. And don't change the semantic contract of a public method in a stable library just because the change feels internally consistent. If you think a public API needs to evolve, file an issue first and let the maintainers decide. The cost of asking is small. The cost of having your refactor closed because nobody asked you to is bigger than just the time you spent.

The zerolog rejection is the one that really changed how I think about API stability. I now check whether a small refactor actually changes user-observable behavior, even if it doesn't break the test suite.

Reading code with no agenda finds better bugs than the issue tracker

There's an obvious play. Open the issue tracker, pick a labeled issue, fix it. That works, and it's what I did the first few days. It's also competitive, and limits you to bugs the maintainers have already triaged.

The PRs I'm most proud of came from reading code with no agenda. Two examples. The first was a stale-context bug in pgx, the Postgres driver for Go. I was reading the connection-fallback path because I wanted to understand how target_session_attrs=prefer-standby actually worked. Halfway through I noticed a ctx variable being shadowed inside a for-loop and then reused in a fallback branch where its deadline had already burned. Nobody had filed an issue. The maintainer merged the fix in a few hours. The second was a nil panic in Wails, the Go-to-frontend desktop framework. I was reading their app-startup path, saw that Application.Quit dereferenced an inner pointer that didn't get assigned until Run, wrote a five-line program that triggered the panic, and shipped a one-line fix.

Both were hard to find from the issue tracker because nobody knew they existed. They came from reading. That's the angle I'd push to anyone starting out. Pick a project whose code you actually use. Read one of its packages end to end without trying to fix anything. Take notes on everything that surprises you. Some of those will be bugs.

The other thing I had to retrain myself on was tests. Three early PRs sat untouched for days, and in every case the reviewer's first comment was "can you add a test?" So I started doing it by default. Even for a one-line fix, mirror the existing test pattern in the package and add a case that fails before the fix and passes after. Don't introduce a new test framework or assertion library to a file that's been using t.Errorf for years; that's an instant flag. And a regression test on its own is sometimes acceptable even when the fix needs more thought. I had a sqlx PR where the maintainer asked me to split the test out as a separate commit, and that landed first.

Commit messages in the project's voice

I read a hundred-ish commits from golang/go, tailscale, and a couple of Charm repos before writing my first PR, just to internalize the rhythm. Go projects almost always use package/path: short verb-first description, lowercase after the colon, no trailing period, around fifty characters. The body, when there is one, explains why rather than what, in plain prose paragraphs (not bullets), wrapped at seventy-two characters. First-person is normal. Opinions are normal.

Here's a real Tailscale commit body I think reads well:

The Engine watchdog wrapped every wgengine.Engine method call in a goroutine with a 45s timeout and crashed the process on timeout. It was added years ago to surface deadlocks during development, but the underlying deadlocks have long since been fixed, and even when it did fire it produced obscure stack traces (from inside the watchdog goroutine, not the original caller) without buying much.

Notice the personal history ("added years ago"), the opinion ("without buying much"), no bullet list even though it has multiple reasons. None of those things are hard on their own. Getting all of them in the same paragraph reliably is the part that takes practice.

Twenty PRs in, I noticed the quality of attention I was getting was uneven. Some maintainers reviewed within an hour. Some sat on PRs for a week. A few projects auto-closed mine before a human ever saw it. So I started keeping a list per-repo of "is this worth the next PR." A repo earns its way on by responding within a few days, having maintainers who comment substantively rather than just merging or closing, and having an issue tracker that isn't a graveyard. The names that consistently delivered fast useful review for me: Tailscale, the Charm projects (huh, log, bubbles, lipgloss), pgx. A handful of others I quietly stopped trying after a closure or two; enough of a pattern for me, not worth the activation energy to push back. Your list will look different. The point is to have one.

After fifty-something PRs, the ones that merged in under a day had basically the same shape. One logical change, not "fixed this and also cleaned up some imports while I was there." A diff under fifty lines, ideally under twenty. A test that proves the bug. A commit message that explains the why in a paragraph of plain prose. A PR body that references the issue number and adds a sentence or two about how I found the bug. No reformatting of surrounding code. The PRs that sat for a week missed at least two of those. The ones that got rejected missed all of them.

If you're sitting at zero contributions and feeling like the gap is too wide, the thing I'd say is what I wish someone had said to me. Pick one or two projects whose code you already use. Read one package end to end without trying to fix anything. Take notes. When you find something that's actually wrong, file an issue first if the project asks for one, then write the smallest possible fix with a test, then write a commit message that sounds like you've worked on the project for years. Do that ten times before you let yourself think about volume. Almost every shortcut version of this (find good-first-issues, fix typos in docs, run a script that opens fifty drive-by PRs) has either been done by someone else or is the kind of thing that doesn't teach you anything. The thing that teaches you and the thing that catches a maintainer's attention are the same thing: showing that you read the code.