Francesca

Posted on Jun 6

Frequency Bias in LLM Coding Assistants: Fairness Risks for Software Development.

#ai #aibias #llm #writing

Thinking about how AI coding assistants shape developer choices, and the hidden biases influencing programming decisions. >_<
Cross-posted from Medium.

This report examines fairness risks across large language model (LLM) coding assistants, such as GitHub Copilot and Claude Code, which are increasingly used to supplement and support programming tasks.

Such AI assistive tools influence developers’ choices of programming languages and libraries across software projects. A key fairness concern here is the frequency bias introduced into the selection of programming languages. Because coding assistants generate code based on patterns learned from large training datasets, they tend to favour languages and tools that appear most frequently across those datasets. As a result, developers may be steered toward widely used technologies even when alternative tools would be more appropriate for a given task.

This bias may create several harms, including the introduction of ‘technical debt’ through suboptimal tool choices and the reduced visibility of emerging technologies to newer programmers. This report addresses these risks and concludes with recommendations for improving transparency around how coding tools select languages and libraries.

I. Overview

A. LLM Coding Assistants in Software Development

AI coding assistants are increasingly regarded as “powerful tools that aid in the software life cycle and improve developer productivity” (Perry et al., 2023). Such systems generate code by predicting sequences based on patterns previously learned from large datasets of source code and technical documentation.

Moreover, these coding assistants have the “potential to lower the barrier of entry for programming” while simultaneously “increasing developer productivity” (Tabachnyk and Nikolov, 2023). However, because these systems generate code by reproducing learned patterns, their outputs may also reflect biases embedded within those datasets. Understanding how such biases influence the programming choices suggested by AI systems is therefore essential to assessing fairness in AI-assisted software development.

B. The Emergence of “Vibe Coding”

Innovations in AI-assisted programming have contributed to the emergence of “vibe coding,” a process in which developers “express their intention using plain speech and the AI transforms that thinking into executable code” (Harkar, 2025). Through vibe coding, responsibility is delegated to coding assistants, which then influence major programming decisions. Such decisions can carry coding bias “shaped by factors such as the prevalence of particular libraries and programming languages in open-source repositories” (Wang et al., 2024). As a result, widely used languages are disproportionately favoured by coding assistants, even in contexts where alternative languages might provide more appropriate solutions.

II. Sources of Bias in LLM Code Generation

Bias can emerge both from the behaviour of models during code generation and from the composition and structure of their training datasets. Because these systems rely on patterns learned from large training corpora, they are likely to reproduce imbalances already present in those datasets.

A. Language and Library Preferences in Generated Code

Evidence of language bias was observed in a 2025 study examining LLM language preferences. Twist et al. found that popular coding assistants often favour widely used languages and libraries even when those choices are suboptimal for the task.

In project initialisation tasks designed for high performance and low latency, the models disproportionately selected Python: “Even for tasks where high performance and memory safety are critical — and therefore Python is suboptimal — Python remains the dominant choice… in 58% of cases, while Rust is not used a single time” (Twist et al., 2025). Similar patterns were observed in library selection, with “Flask appearing in 88% of generated web-server implementations, whereas FastAPI (despite offering improved performance) was used in only 9% of cases” (Twist et al., 2025). These findings suggest that coding assistants favour familiar technologies rather than selecting tools based on task-specific suitability.

B. Training Data Frequency Bias

LLMs generate code by predicting sequences of tokens based on patterns learned during training. As a result, programming languages and libraries that appear more frequently across training datasets are more likely to be generated by an AI coding co-pilot. Many LLM training datasets are “mainly collected from publicly available sources such as open-source repositories” (Majdinasab et al., 2024), and if certain programming languages dominate these sources then, naturally, LLMs will favour those languages. The 2024 Octoverse report is consistent with this, noting a spike in Python usage, with “Python recently recording the highest number of contributors on GitHub” (GitHub Staff, 2024), suggesting that its widespread presence in developer ecosystems may be influencing LLM generation.
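
To make the mechanism concrete, here is a minimal toy sketch in Python. It is not how any real assistant works internally: it assumes a "model" that picks a language purely in proportion to how often that language appears in its corpus. The counts in `CORPUS_COUNTS` and the function names are invented for illustration.

```python
import random
from collections import Counter

# Hypothetical file counts per language in a training corpus.
# These numbers are invented for illustration, not measured.
CORPUS_COUNTS = {"python": 600, "javascript": 300, "rust": 100}


def frequency_prior(counts):
    """Turn raw corpus counts into selection probabilities."""
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}


def sample_language(counts, rng):
    """Pick a language in proportion to how often it appears in the corpus.

    The task itself is never consulted: that is the bias being illustrated.
    """
    langs = list(counts)
    weights = [counts[lang] for lang in langs]
    return rng.choices(langs, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
picks = Counter(sample_language(CORPUS_COUNTS, rng) for _ in range(1000))
print(frequency_prior(CORPUS_COUNTS))
print(picks.most_common())
```

Because `sample_language` never looks at the task, a request for a memory-safe, low-latency service gets the same language distribution as a request for a data-analysis script. Real models condition on the prompt, so this is only the limiting case of the concern described above.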

C. Efficiency Trade-offs in AI Coding Assistants

Whilst teams using AI assistants have reportedly “seen a 40% reduction in onboarding time for new engineers” (Ryz Labs, 2026), there is a clear trade-off between short-term efficiency and long-term robustness. By producing working interfaces rapidly, AI coding assistants can instil a false sense of competence in users who do not understand the underlying programming decisions. As a result, newer developers may become blindly reliant on the default languages and tools suggested by the system. The productivity gains offered by coding assistants must therefore be weighed against the risk of developers defaulting to the standard, ‘safe’, familiar tools, thus concentrating usage around a small number of already established languages.

III. Impacted Groups and Harms

A. Developers and Software Engineering Practices

One group directly affected by bias in LLM-generated code is software developers themselves. A trend “evident across vibe-coding scenarios is that coders may increasingly lack the expertise to evaluate which programming languages are most appropriate for a given task and therefore rely on the default choices suggested by LLMs” (Smith, 2025). If coding assistants systematically favour certain languages, then developers may unintentionally adopt suboptimal tools that “introduce technical debt and security risks” (Zhu et al., 2025) into their code.

B. Startups and Early-Stage Developers

Vibe coding has significantly influenced how startups build and validate products. AI-assisted coding allows for “faster experimentation and reduces development costs”. As a result, many “early-stage AI-powered companies are reaching later funding stages more quickly” due to the accelerated experimentation enabled by these tools (J.P. Morgan, 2025). However, startups that rely heavily on AI-generated code may lack the expertise to critically evaluate these technical decisions over time. This risk is compounded by developers’ limited ability to audit AI-generated code, with early-stage developers admitting to “struggling with understanding and debugging their generated code” (Perry et al., 2023).

C. Open-Source Ecosystems

Bias toward “well-established languages can actively block newer languages and open-source tools” (Twist et al. 2025). This affects open-source developers whose projects rely on visibility and adoption within developer communities. When coding assistants recommend a narrow set of dominant tools, newer technologies may encounter a “discoverability barrier”. This form of “language and library favouritism” narrows the “diversity of the global codebase and disadvantages smaller open-source communities” (Addanki, 2026) whose tools are less represented in training data.

IV. Interventions & Recommendations

As we integrate LLMs into our software development lifecycles, we must move beyond passive “vibe-coding”, acknowledging that AI coding tools are not completely neutral architects.

A. Evaluation Benchmarks for Coding Assistants

Developers of these coding assistants must introduce evaluation benchmarks that assess whether models display systematic preferences for particular programming languages. By testing model outputs across a range of programming tasks, researchers can identify when and where certain technologies are favoured. Establishing benchmarks would make biases in code generation more visible and allow developers to refine systems accordingly.
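
As a rough idea of what such a benchmark could measure, the sketch below tallies which language an assistant chose for each task and how often that choice fell outside a reviewer-defined set of suitable languages. Everything in `RESULTS` is hypothetical, including the task names and the suitability judgements; a real benchmark would need far more tasks and a defensible way of deciding what counts as suitable.

```python
from collections import Counter

# Hypothetical benchmark results: each record is
# (task id, languages a human reviewer judged suitable, language chosen).
RESULTS = [
    ("low-latency-service", {"rust", "go", "c++"}, "python"),
    ("data-analysis-notebook", {"python", "r"}, "python"),
    ("cli-tool", {"go", "rust", "python"}, "python"),
    ("embedded-driver", {"c", "rust"}, "c"),
]


def language_shares(results):
    """Fraction of tasks for which each language was chosen."""
    chosen = Counter(choice for _, _, choice in results)
    total = len(results)
    return {lang: n / total for lang, n in chosen.items()}


def mismatch_rate(results):
    """Fraction of tasks where the chosen language was not judged suitable."""
    misses = sum(1 for _, suitable, choice in results if choice not in suitable)
    return misses / len(results)


shares = language_shares(RESULTS)
print(shares)                  # share of tasks per chosen language
print(mismatch_rate(RESULTS))  # share of tasks with an unsuitable choice
```

Reporting the two numbers side by side matters: a high share for one language is only a fairness concern when it coincides with a high mismatch rate.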

B. Multi-Solution Generation

Coding co-pilots should generate multiple implementation options rather than a single solution. When an assistant suggests alternative approaches to the same task, developers are encouraged to actively compare them before making technical decisions. This supports a more deliberate tool selection process.
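
One way this could look from the tooling side is sketched below. The `generate` callable is a hypothetical stand-in for whatever assistant call is available; it is injected as a parameter so that no real API is assumed, and `fake_generate` exists only to make the sketch runnable.

```python
from typing import Callable, Dict, List


def collect_alternatives(
    generate: Callable[[str, str], str],
    task: str,
    languages: List[str],
) -> Dict[str, str]:
    """Request one implementation per language so options can be compared.

    `generate(task, language)` is a hypothetical stand-in for an assistant
    call; it is passed in so this sketch assumes no particular API.
    """
    options: Dict[str, str] = {}
    for language in languages:
        if language in options:  # skip duplicates in the requested list
            continue
        options[language] = generate(task, language)
    return options


def fake_generate(task: str, language: str) -> str:
    """Placeholder generator used only to make the sketch runnable."""
    return f"// {language} solution for: {task}"


options = collect_alternatives(
    fake_generate, "parse a 10 GB log file", ["python", "rust", "go", "rust"]
)
for language, code in options.items():
    print(language, "->", code)
```

The point of the design is that the developer, not the assistant, decides which languages are on the table before any code is produced.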

C. Transparent Prompting in Code Generation

Providing clearer documentation on the composition of training datasets would allow developers to evaluate whether coding co-pilots are favouring particular programming languages. Prompting strategies that encourage models to “explain why a particular language or library is selected” (Kruspe, 2024) would also improve transparency in code generation, allowing developers to distinguish between recommendations based on technical suitability and recommendations simply shaped by training-data frequency.
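
A prompting strategy of this kind can be as simple as appending a justification request to every task. The sketch below is one possible wording, not a prompt taken from the cited work; `RATIONALE_SUFFIX` and `with_rationale` are names made up for this example.

```python
# Assumed wording for illustration; adjust to the assistant being used.
RATIONALE_SUFFIX = (
    "Before writing any code, state which language and libraries you chose, "
    "why they suit this task, and name at least two alternatives you rejected "
    "along with the reason for rejecting each."
)


def with_rationale(task: str) -> str:
    """Wrap a task description in a request for an explicit justification."""
    return f"{task.strip()}\n\n{RATIONALE_SUFFIX}"


prompt = with_rationale("  Build a low-latency web server for sensor data.  ")
print(prompt)
```

An explanation produced this way is still model output and may be a rationalisation after the fact, so it is best treated as a prompt for the developer's own review rather than as proof of suitability.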

V. Limitations and Uncertainty

In examining frequency bias in LLM coding assistants, this report assumes that training data composition can negatively influence the code these assistants suggest. However, the datasets used to train many coding assistants are not publicly disclosed, making it difficult to verify the extent to which dataset composition drives language preferences. Because of this, the analysis relies partly on indirect evidence from developer ecosystems and on patterns researchers have observed across model outputs. Additionally, assistant outputs may be influenced by other factors, such as prompt design or reinforcement learning, which this report does not examine. These limitations suggest that the relationship between training data frequency and model outputs may be more complex than training-data patterns being applied wholesale to ‘vibe coding’ prompts.

VI. Conclusion

As the practice of “vibe coding” continues to enmesh itself across the software development landscape, biases embedded in training data risk shaping which programming languages coding copilots recommend in future. At worst this could result in a ‘Python monolith’, with newer, more innovative tools and languages being crowded out by dominant ones. Reducing this risk requires better curation and documentation of training data alongside greater discernment from developers when adopting AI-generated code.

Bibliography:

Addanki, S. (2026) Your AI coding assistant has a favorite language — and it’s not always the right one. LinkedIn.

GitHub Staff (2024) Octoverse: AI leads Python to top language as the number of global developers surges. GitHub Blog. https://github.blog/news-insights/octoverse/octoverse-2024

Harkar, S. (n.d.) Vibe coding. IBM Think.

J.P. Morgan (2025) Vibe coding: A guide for startups and founders. https://www.jpmorgan.com/insights/technology/artificial-intelligence/vibe-coding-a-guide-for-startups-and-founders

Kruspe, A. (2024) Towards detecting unanticipated bias in large language models. arXiv. https://arxiv.org/abs/2404.02650

Majdinasab, V., Nikanjam, A. and Khomh, F. (2024) Trained Without My Consent: Detecting Code Inclusion in Language Models Trained on Code. arXiv. https://arxiv.org/abs/2402.09299

Perry, N., Srivastava, M., Kumar, D. and Boneh, D. (2023) Do users write more insecure code with AI assistants?

Ryz Labs (2026) 5 unexpected benefits of using AI coding assistants that no one talks about. Ryz Labs Learn.

Smith, M. (2025) AI vibe coding: Engineers’ secret to fast development. IEEE Spectrum.

Tabachnyk, M. and Nikolov, S. (2023) ML-enhanced code completion improves developer productivity. Google Research Blog.

Twist, J., et al. (2025) A study of LLMs’ preferences for libraries and programming languages.

Wang, C., et al. (2024) Exploring multi-lingual bias of large code models in code generation. arXiv. https://arxiv.org/abs/2404.19368

Zhu, X., et al. (2025) How to synthesize text data without model collapse? arXiv. http://arxiv.org/abs/2412.14689