Grace

Posted on May 21

Rethinking Open Source Contribution in the Age of AI Agents, featuring vLLM Core Maintainer Roger Wang at MLSys'26

#vllm #ai #machinelearning #llm

Roger is a software engineer focused on machine learning research and systems. He is a core maintainer of vLLM and the lead maintainer of vLLM-Omni, where he works on infrastructure for large multimodal and omni-modality models. He also recently co-founded Inferact, a startup focused on making AI inference cheaper and faster while helping grow vLLM as a major AI inference engine.

The talk was useful because Roger was speaking from direct experience maintaining open source AI infrastructure at a time when AI coding agents are changing how people contribute.

One of the clearest points was that the maintainer's inbox has changed. For vLLM, the number of new pull requests per week rose substantially from early 2025 to 2026, with visible spikes around major model and coding agent releases. More people can now generate code and open pull requests quickly.

That creates a real burden for maintainers. The hard part is not only reviewing whether the code works. Maintainers also need to understand whether the contributor understands the system, whether the change solves the right problem, and whether the person will stay involved after the pull request is opened.

A line from the talk captured this well:

"Talk and code is cheap, show me you really care."

That was my main takeaway. In open source, a useful contribution is more than a code diff. It requires reading the codebase, understanding the project direction, explaining the design clearly, and taking responsibility for the change.

Roger highlighted a few things contributors should focus on:

Understand the system
Pick the right problem at the right scale
Communicate clearly
Have a sense of ownership and responsibility

If AI agents make it easier to submit code, then contributors need to show stronger signals that they understand what they are changing.

For maintainers of critical infrastructure, Roger also pointed to a few changes that matter. Standards are higher now. Projects need clearer non-goals, including when "fork it as a plugin" is a valid path. Design decisions should be reviewable, not just the code diff. Reliability matters more, which means projects need to invest more in continuous integration. Reviewer time should also be spent carefully, especially on contributors who are learning the system and likely to keep contributing.

Another point that stood out was the changing pipeline of open source talent. If the new on-ramp to open source is "prompt an agent," maintainers may see more contributors who have not deeply read the codebase. That creates risk. It also creates an opportunity to be clearer about what good contribution looks like.

The bar for human contribution is getting higher and clearer. Writing plausible code is less of a signal than it used to be. The stronger signals are system understanding, good judgment, clear communication, and trust.

I left the talk thinking that AI agents will change open source contribution, but they will not remove the human part. If anything, the human part becomes more important. The best contributors will be the people who can understand the system, communicate with maintainers, and take responsibility for the work beyond the first pull request.

For anyone trying to contribute to open source now, especially in AI infrastructure, the practical advice is simple:

Ask real questions.
Meet people outside your usual circle.
Support other contributors, especially during poster sessions and discussions.
Read the codebase before asking an agent to change it.
Show that you care about the project, not just the pull request.

Top comments (6)

Vadym Arnaut • May 21

What worked for us on a tiny OSS LMS: making the "we want contributors, not just PRs" expectation visible in CONTRIBUTING.md before anyone opens one. The first community PR (a scroll-to-top button) wasn't valuable because it was complex. It was valuable because the contributor took a real code review, pushed clean follow-ups, and stayed in the issue thread afterward. The "show me you really care" framing maps almost exactly to that.

The signal isn't the diff size; it's whether the person is still around two weeks later.

Grace • May 22

Yes, this is a great way to frame it. The strongest signal is often not the size of the first PR, but whether someone responds well to review, follows through, and keeps engaging after the initial contribution. Making that expectation visible upfront in CONTRIBUTING.md is a practical way to set the tone early.

VoltageGPU • May 22

It's great to see more focus on optimizing inference throughput—this is crucial as we push ML workloads to the edge and into real-time systems. At VoltageGPU, we've seen how tightly coupling scheduling with hardware-specific memory hierarchies can yield surprising gains, especially when dealing with dynamic batch sizes.

Grace • May 22

Thanks for sharing this. I agree that inference throughput becomes much more important as ML workloads move closer to real-time and edge use cases. The point about coupling scheduling with hardware-specific memory hierarchies is interesting too, especially for dynamic batching where small implementation details can have a big impact on utilization. Would be curious to learn more about what patterns you’ve seen work well at VoltageGPU.

Mininglamp • May 27

vLLM's growth proves that high-quality agent contributions work when they're scoped correctly. The pattern that scales: agents handle the mechanical parts (test coverage, doc updates, format fixes) while humans own the architectural decisions. Projects that figure out this division of labor early will outpace the ones still debating whether to allow agent contributions at all.

Harjot Singh • May 31

Rethinking OSS contribution in the age of agents is a timely topic: maintainers are about to face a flood of agent-generated PRs, and the bottleneck is shifting from writing code to reviewing it with trust. The tension is real. Agents lower the bar to contributing, which is great for volume and accessibility, but they also lower the bar to plausible-but-wrong contributions: code that looks right, passes a superficial glance, and quietly isn't. That is exactly the kind of thing that burns maintainer time.

So the contribution norms that will matter are the ones that let a maintainer trust-but-verify efficiently. A PR should carry its provenance and its evidence: what the agent did, why, and proof that it works (tests, the reasoning, the session). Review then becomes checking verifiable claims rather than re-deriving intent from a diff. The healthy version isn't agents replacing contributors. It's agents doing the breadth (drafts, tests, triage) while humans own the judgment and the merge, so the irreversible step stays gated by a person. Agents raise contribution volume; the answer is making contributions auditable so review can keep up.

That make-the-work-verifiable-so-trust-scales instinct is core to how I think about Moonshift. From Roger's maintainer view, is the bigger concern the volume of agent PRs, or the difficulty of trusting and verifying them at review time?