anon1

Posted on Jul 2

No LLM Code in Dependencies

#llm #nlp #ai

TL;DR — The integration of Large Language Models into the software supply chain has introduced a new class of risk: opaque, unreviewed, and potentially legally precarious code hidden within third-party dependencies. Recent efforts by maintainers of critical open-source projects, such as git-annex, to audit and purge LLM-generated code highlight the fragility of current development practices. This movement underscores a growing disconnect between the perceived efficiency of AI-assisted coding and the rigorous demands of security, copyright compliance, and long-term maintainability. As the "tide" of automated generation rises, developers and businesses must recognize that reliance on unvetted AI output erodes trust in the open-source ecosystem and introduces significant, often invisible, liabilities.

Why This Matters in 2026

The year 2026 marks an inflection point in software engineering, defined less by the capability of the models themselves than by the ubiquity of their output in the foundational layers of our digital infrastructure. For years, the narrative around AI in coding has been dominated by promises of productivity gains, often summed up in the hyperbolic label "10x developer." That narrative has now collided with the reality of dependency management. When an LLM generates code, it does not merely solve a problem; it injects its own biases, stylistic quirks, and potential legal ambiguities into the codebase. The concern is no longer just whether the code works, but also its provenance, intent, and ownership.

The scale of this issue is exacerbated by the sheer volume of dependencies modern applications rely upon. A single enterprise application might transitively depend on thousands of libraries. If even a small percentage of these libraries contain code generated by black-box AI models without proper attribution or review, the aggregate risk becomes unmanageable. We are seeing a shift where the "cost" of development is no longer measured in human hours alone, but in the computational cost of verification and the legal cost of potential infringement. The recent backlash against unreviewed AI contributions signals that the industry is beginning to pay the price for unchecked acceleration.

A few concrete numbers illustrate the gravity of the situation: a single commit with an incoherent 1,489-line commit message and 10,000 lines of changes, made to a relatively modest 26,000 LOC code base. In other words, one commit changed the equivalent of roughly 38% of the project, and its explanation was too long and too disjointed to review. That combination is a red flag for any seasoned engineer. It suggests that the code was not crafted with intent or clarity but was dumped into the repository en masse, possibly via an automated pipeline or by an over-eager contributor relying heavily on generative tools. Such opacity makes auditing nearly impossible and turns every update into a potential minefield for security researchers and maintainers alike.

The Background

To understand the urgency of the "No LLM Code in Dependencies" movement, one must look at the gradual erosion of traditional code review norms. For decades, open-source projects have operated on a social contract built around transparency, peer review, and incremental improvement. Contributors submit patches, reviewers analyze the logic, and maintainers merge changes after ensuring they align with the project’s standards. This process, while slow, ensures accountability. However, the advent of easy-to-use AI coding assistants has disrupted this flow. Developers can now generate entire modules, refactor complex functions, or add configuration files with a simple prompt, bypassing the deep understanding that comes from manual implementation.

This disruption was not immediate. Initially, AI-generated code was viewed as a helpful assistant for boilerplate tasks. But as models became more capable, the boundary between "assistance" and "automation" blurred. Projects began receiving pull requests that were technically correct but stylistically alien, lacking the contextual nuance expected by human reviewers. In some cases, these changes were accepted because they passed tests, masking deeper issues regarding code quality and origin. The lack of explainability in AI-generated code means that when bugs arise, tracing their root cause becomes exponentially more difficult.

"The ease of generating code has outpaced the community's ability to vet it. We are seeing a situation where the volume of contributions is increasing, but the signal-to-noise ratio is plummeting. Maintainers are spending more time undoing damage than building features." — A Senior Open Source Infrastructure Engineer

The background of this crisis also involves the corporate landscape. Many large technology companies have integrated AI tools into their internal development workflows to speed up delivery. While this may yield short-term gains, it often results in codebases that are difficult for external contributors to navigate or maintain. When this internal code is eventually open-sourced or shared via dependencies, the lack of rigor propagates outward. The "No LLM Code" stance is thus a defensive measure, an attempt to preserve the integrity of shared resources in an era where the incentives for rapid, low-quality generation are high.

What Actually Changed

The primary change driven by the "No LLM Code in Dependencies" initiative is a fundamental shift in the criteria for accepting contributions to critical open-source projects. Specifically, projects like git-annex have adopted a strict policy of rejecting or reverting code identified as generated by Large Language Models, particularly when such code lacks clear human authorship or understanding. This is not merely a stylistic preference; it is a security and legal safeguard. The maintainer of git-annex recently invested approximately 100 hours of work over the course of a single month to audit the project’s dependency tree, ensuring that none of the upstream packages it relies on contained LLM-generated code.

This effort revealed several alarming trends in the broader ecosystem. First, there is a tendency for large, unexplained changes to slip through automated testing pipelines. Second, the legal risks associated with AI training data are becoming tangible. Code generated by LLMs may inadvertently replicate copyrighted material from its training set, leading to potential infringement issues that the original project did not intend to assume. Third, the quality of such code is often inconsistent, characterized by incoherence and lack of adherence to established conventions.

Key changes and findings from this audit include:

Reversion of Unexplained Large Changes: Significant commits involving thousands of lines of code were identified and reverted in subsequent releases without detailed explanations, indicating a reactive rather than proactive approach to maintaining code integrity.
Identification of Copyright Risks: Instances were found where LLM prompts explicitly instructed the model to copy code from other projects. While some instances avoided legal trouble due to luck or minor variations, the precedent sets a dangerous pattern of potential intellectual property violation.
Audit of Dependency Trees: Maintainers are now actively scrutinizing not just direct dependencies but the entire transitive closure of libraries to ensure that the "cleanliness" of their own code is not compromised by tainted upstream sources.
Shift in Community Norms: There is a growing recognition among serious maintainers that "AI-assisted" does not mean "AI-generated." The former implies human oversight and modification, while the latter suggests automation that bypasses critical thinking. The distinction is becoming a litmus test for contribution acceptance.

These changes represent a hardening of the perimeter against automated noise. By refusing to integrate code that cannot be traced to human intent, projects are forcing a return to fundamentals: understanding the code you write, owning the decisions you make, and respecting the legal and ethical boundaries of intellectual property.
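The dependency-tree audit described above comes down to walking the transitive closure of a project's dependencies. The following is a minimal Python sketch of that walk, not a description of how git-annex was actually audited. It assumes the dependency graph is already available as a plain dictionary and that the set of flagged packages was compiled by hand; the package names, the `GRAPH` data, and both function names are hypothetical.

```python
from collections import deque


def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, ()))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        # Follow this package's own dependencies as well.
        queue.extend(graph.get(pkg, ()))
    seen.discard(root)  # guards against dependency cycles back to root
    return seen


def flagged_dependencies(graph, root, flagged):
    """Return the flagged packages that root depends on, directly or not."""
    return sorted(transitive_dependencies(graph, root) & set(flagged))


# Hypothetical dependency graph: package -> direct dependencies.
GRAPH = {
    "app": ["web", "utils"],
    "web": ["http", "utils"],
    "utils": ["strings"],
    "http": [],
    "strings": [],
}

# "strings" is only an indirect dependency of "app", yet it is still reported.
print(flagged_dependencies(GRAPH, "app", {"strings", "unrelated"}))
```

In practice the graph would come from a package manager's lock file or resolver output, and deciding which packages belong in the flagged set remains manual review work.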

Impact on Developers

For individual developers, the implications of this shift are profound. The romanticized image of the "10x developer" who can prompt an LLM to "add fourmolu config and restyled neat format a module" and commit the result is being dismantled. While such actions may seem efficient in the moment, they carry significant downstream costs. Developers who rely heavily on blind AI generation risk producing code that is brittle, poorly documented, and legally ambiguous. When a bug arises in such code, the developer may not fully understand the underlying logic, making debugging a nightmare.

Moreover, blindly copying code via AI prompts introduces severe copyright risks. If a developer uses an LLM to regenerate code from a proprietary library, even inadvertently, they may be distributing infringing material. This exposes both the developer and their employer to legal liability. The case uncovered during the audit, where a prompt explicitly asked the model to copy code from another project, serves as a cautionary tale. It highlights the need for developers to exercise due diligence and verify the origin of any code they incorporate, whether written manually or generated by AI.

Consider the following example of a risky workflow versus a safe one:

Risky Workflow (LLM-Generated):

# Prompt: "Copy the configuration file from project X and adapt it for Y"
# Result: Generated config.yaml containing snippets from Project X's proprietary setup
# Outcome: Potential copyright infringement, lack of understanding of config parameters

Safe Workflow (Human-Verified):

# Action: Developer reviews Project X's public documentation, understands the requirements
# Implementation: Manually writes config.yaml for Project Y, citing sources if necessary
# Outcome: Clear ownership, full understanding, reduced legal risk

Developers must also consider how their contributions will be received. Projects that adopt strict "No LLM Code" policies may reject pull requests that appear AI-generated, regardless of their technical merit. This forces developers to demonstrate that they understand the code they submit. It discourages "prompt and pray" approaches and encourages deeper engagement with the codebase. For junior developers, this can mean a steep learning curve, but it is essential for long-term growth. For senior developers, it reinforces the importance of mentorship and rigorous code review.

Impact on Businesses

For businesses, the rise of unvetted LLM-generated code in dependencies presents a strategic risk that extends beyond mere technical debt. Supply chain security is already a top concern for CTOs and CISOs, but the addition of AI-generated code complicates this picture further. Traditional security audits focus on known vulnerabilities (CVEs) and malicious intent (typosquatting). However, LLM-generated code introduces a new vector: subtle logical errors, hallucinated APIs, and potential IP violations that do not fit neatly into existing security taxonomies.

The financial impact of these risks can be significant. Remediation costs for reverting large, incoherent commits can run into thousands of engineering hours. Legal fees associated with defending against copyright claims can be even higher. Furthermore, the reputational damage of being associated with a project that distributes infringing or low-quality code can erode customer trust. Companies that prioritize rapid deployment over code integrity may find themselves facing a backlog of maintenance issues that stifle innovation rather than accelerate it.

"The illusion of speed provided by AI coding tools is a false economy. When dependencies are tainted with unreviewed AI code, the cost of maintenance, security auditing, and legal compliance skyrockets. Businesses need to invest in governance frameworks that treat AI output as a second-class citizen until it is thoroughly verified." — A Technology Risk Consultant

Strategically, businesses must decide how to handle AI in their development pipelines. Some may choose to ban LLM-generated code entirely, mirroring the stance of git-annex. Others may implement strict verification protocols, requiring developers to provide detailed explanations and diffs for any AI-assisted changes. The latter approach is more feasible for large organizations but requires significant investment in tooling and training. Regardless of the path chosen, the key is awareness. Ignoring the presence of LLM-generated code in dependencies is no longer an option; it is a liability.
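One way to picture a verification protocol of this kind is a check on commit messages. The sketch below is purely illustrative: the `AI-Assisted` and `Explanation` trailers are a hypothetical convention invented for this example, not an established standard or anything git-annex uses, and the ten-word threshold is an arbitrary assumption.

```python
def parse_trailers(message):
    """Collect 'Key: value' lines from the final paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    trailers = {}
    if len(paragraphs) < 2:
        return trailers  # a subject line alone carries no trailers
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and " " not in key:
            trailers[key.lower()] = value.strip()
    return trailers


def check_commit(message, min_explanation_words=10):
    """Return a list of policy problems; an empty list means the commit passes."""
    trailers = parse_trailers(message)
    # Hypothetical convention: contributors declare AI assistance explicitly.
    if trailers.get("ai-assisted", "no").lower() != "yes":
        return []
    explanation = trailers.get("explanation", "")
    if len(explanation.split()) < min_explanation_words:
        return ["AI-assisted change lacks a sufficiently detailed Explanation trailer"]
    return []


message = (
    "Add retry logic\n\n"
    "Retries failed requests up to three times.\n\n"
    "AI-Assisted: yes\n"
    "Explanation: too short"
)
print(check_commit(message))
```

A check like this relies on contributors declaring AI assistance honestly, so it can support human review but cannot replace it.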

Practical Examples

To illustrate the tangible effects of the "No LLM Code in Dependencies" movement, let us examine three concrete scenarios that reflect the challenges faced by maintainers and developers today.

Example 1: The 1489-Line Commit Ambiguity

In a mid-sized open-source library with approximately 26,000 lines of code, a contributor submitted a pull request that modified 10,000 lines across dozens of files. The commit message ran to 1,489 incoherent lines and lacked any usable technical detail or rationale. Automated tests passed, which made the maintainer initially hesitant to reject it. Suspecting AI generation, however, the maintainer performed a manual audit. They discovered that the changes were largely stylistic, reformatting code in ways that broke existing conventions without adding value. More critically, the changes made no semantic sense in several places, suggesting the LLM had hallucinated syntax. The pull request was rejected, and the maintainer spent hours documenting the reasons for the rejection to educate the contributor. This incident highlighted the need for clearer guidelines on acceptable change sizes and for human review of large commits.
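A guideline on acceptable change sizes can be approximated with a simple heuristic. The sketch below is a hypothetical example rather than any project's actual policy: the `CommitStats` type, the `review_flags` function, and both thresholds (10% of the code base, 100 message lines) are assumptions chosen for illustration, applied here to the figures from this example.

```python
from dataclasses import dataclass


@dataclass
class CommitStats:
    lines_changed: int
    codebase_loc: int
    message_lines: int


def review_flags(stats, max_change_fraction=0.10, max_message_lines=100):
    """Return human-readable reasons why a commit needs closer manual review."""
    if stats.codebase_loc <= 0:
        raise ValueError("codebase_loc must be positive")
    flags = []
    fraction = stats.lines_changed / stats.codebase_loc
    if fraction > max_change_fraction:
        flags.append(f"changes touch {fraction:.0%} of the code base")
    if stats.message_lines > max_message_lines:
        flags.append(f"commit message is {stats.message_lines} lines long")
    return flags


# Figures from the example: 10,000 changed lines, 26,000 LOC, 1,489 message lines.
example = CommitStats(lines_changed=10_000, codebase_loc=26_000, message_lines=1_489)
for flag in review_flags(example):
    print(flag)
```

With these thresholds the commit is flagged on both counts. A heuristic like this cannot tell whether code was generated by an LLM; it only identifies changes that are too large or too poorly explained to merge without careful human review.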

Example 2: The Copyright Trap in Configuration Files

A developer working on a frontend framework needed to configure fourmolu, a Haskell code formatter. Instead of reading the documentation, they prompted an LLM to "add fourmolu config and restyled neat format a module." The LLM generated a configuration file that included snippets copied from a popular proprietary style guide, which happened to be under a restrictive license. The developer committed this change, and it was merged into a widely used template. Months later, the owner of the proprietary guide discovered the infringement and issued a takedown notice. The template maintainer had to urgently revert the changes, issue an apology, and implement a new review process for all configuration files. This example demonstrates how easy it is to violate intellectual property rights by accident when relying on AI for boilerplate tasks, and how lasting the consequences can be for a project’s reputation.

Example 3: The Silent Revert in Dependency Chains

A major web framework relied on a utility library that had recently updated its dependencies. One of these dependencies had incorporated code from a smaller, niche project. The niche project’s maintainer, unaware of the incorporation, discovered that their code had been copied verbatim into the larger library’s dependency tree. Upon investigation, it was found that the larger library’s maintainer had used an AI tool to migrate code from the niche project, assuming it was public domain due to its small size. The larger library had to perform a silent revert in the next release, without public explanation, to avoid legal entanglements. This incident caused friction within the community, as users were confused by the sudden removal of functionality. It underscored the opacity of AI-assisted migrations and the difficulty of tracking code provenance in complex dependency graphs.

Common Misconceptions

As the debate over LLM-generated code intensifies, several misconceptions have emerged that hinder productive discourse. Addressing these myths is crucial for fostering a healthy development environment.

Myth: AI-generated code is inherently superior to human-written code because it is faster to produce.
Reality: Speed does not equate to quality. AI-generated code often lacks context, leading to inefficient algorithms or inappropriate design patterns. Furthermore, the time saved in generation is frequently offset by the time required for debugging and reviewing opaque code. Human-written code, especially when reviewed by peers, tends to be more robust and maintainable in the long run.
Myth: Using AI for boilerplate tasks is harmless because it doesn't affect core logic.
Reality: Boilerplate code, such as configuration files, tests, and documentation, is essential for the stability and usability of a project. Errors in these areas can lead to security vulnerabilities, broken builds, or confusing user experiences. Additionally, as seen in the copyright trap example, boilerplate generation can inadvertently introduce legal liabilities.
Myth: If the code compiles and passes tests, it is safe to merge.
Reality: Tests only verify that the code behaves as expected under specific conditions. They do not verify the origin, intent, or legality of the code. AI-generated code can pass tests while still containing logical flaws, security vulnerabilities, or copyrighted material. Human review is necessary to assess the broader implications of a change.
Myth: The "No LLM Code" policy is anti-innovation.
Reality: The policy is pro-integrity. It aims to preserve the quality and trustworthiness of open-source projects. Innovation thrives in environments where contributors feel confident that the codebase is stable, secure, and legally sound. By filtering out low-quality, unverified contributions, maintainers can focus their efforts on meaningful improvements and feature development.

5 Actionable Takeaways

Verify Provenance: Always trace the origin of any code snippet you incorporate, whether written manually or generated by AI, to ensure it is properly licensed and attributable.
Enforce Human Review: Implement mandatory code review processes that require developers to explain the logic behind their changes, preventing the merge of opaque, AI-generated patches.
Limit AI Scope: Restrict the use of LLMs to lower-risk tasks such as drafting documentation or unit tests, review that output as carefully as any other change, and avoid using LLMs for core business logic or dependency management.
Audit Dependencies Regularly: Conduct periodic audits of your dependency tree to identify and remove any libraries or versions that contain unverified AI-generated code.
Educate Teams: Train developers on the legal and technical risks of AI-assisted coding, emphasizing the importance of understanding and owning the code they contribute.

What's Next

The trajectory of software development is currently being reshaped by the tension between efficiency and integrity. As LLMs become more sophisticated, the temptation to rely on them for complex coding tasks will only increase. However, the resistance seen in projects like git-annex suggests a counter-movement is forming. This movement is likely to gain momentum as more high-profile incidents of copyright infringement and security breaches related to AI-generated code come to light. We can expect to see the emergence of standardized tools for detecting and flagging AI-generated code, as well as new licensing frameworks that address the unique challenges of machine-authored content.

Furthermore, the role of maintainers will evolve. They will increasingly act as gatekeepers of quality and legality, investing significant time in auditing dependencies and enforcing strict contribution guidelines. This may lead to a fragmentation of the open-source ecosystem, with some projects becoming highly curated and exclusive, while others remain open but chaotic. Developers will need to adapt by cultivating deeper technical skills and a stronger understanding of legal compliance. The era of the "10x developer" based on prompt engineering may give way to the era of the "1x developer" based on deep expertise and careful stewardship.

Conclusion

The push for "No LLM Code in Dependencies" is not merely a technical preference; it is a moral and legal imperative for the sustainability of the open-source ecosystem. The 100 hours of work invested by a single maintainer to audit dependencies is a testament to the scale of the challenge and the dedication required to uphold code integrity. As we navigate this new landscape, we must remember that software is not just about function; it is about trust, community, and responsibility.

The dominoes are falling, and the industry is waking up to the hidden costs of unchecked automation. While holding back the tide of AI-generated noise may seem like a futile effort, it is a necessary one. Without it, the foundation of our digital infrastructure could become unstable, fraught with legal risks and technical debt. Let this be a call to action for developers, businesses, and maintainers to prioritize quality over speed, and integrity over convenience. In doing so, we ensure that the software we build remains reliable, secure, and worthy of our collective trust.
