Stack Overflowed
What Large Language Model Does GitHub Copilot Use?

If you use GitHub Copilot regularly, you’ve probably wondered what’s actually powering it.

When it finishes your function before you do, it feels like magic. When Copilot Chat explains a complex class hierarchy in clean, structured language, it feels like you’re talking to a senior engineer. And when it occasionally produces something that almost works but not quite, you’re reminded that there’s a probabilistic engine underneath the surface.

So what large language model does GitHub Copilot use?

The honest answer is this: Copilot started with OpenAI Codex, a GPT-3–based model optimized for code, and has since evolved into a multi-model system that incorporates GPT-4–class models for advanced reasoning features.

It does not run on one single static model. It uses different large language models depending on what you’re asking it to do.

If you want to understand Copilot properly, you need to understand that layered architecture.

Let’s break it down clearly and in depth.

What an LLM actually does in a coding assistant

Before diving into Copilot’s specific models, it helps to ground yourself in what an LLM does in a development environment.

A large language model is a neural network trained to predict the next token in a sequence. In a coding assistant, that sequence might mix natural-language comments, code syntax, and documentation strings, with patterns learned from many repositories during training.

The model doesn’t truly understand your code the way you do. It doesn’t run it. It doesn’t test it. It predicts what likely comes next based on patterns learned during training.
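The next-token idea is easy to demonstrate with a toy bigram model. This is a minimal sketch, not how a real LLM works: actual models are neural networks over subword tokens, not frequency tables, but the prediction loop has the same shape.

```python
from collections import Counter, defaultdict

# "Train" a toy bigram model: count which token follows which.
corpus = "for i in range ( n ) : total += i".split()
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(token):
    """Greedy prediction: the most frequent successor seen in training."""
    candidates = follows.get(token)
    return candidates.most_common(1)[0][0] if candidates else None

# Generate a continuation token by token, starting from "for".
token, generated = "for", []
while token is not None and len(generated) < 8:
    generated.append(token)
    token = predict_next(token)

print(" ".join(generated))  # → for i in range ( n ) :
```

The model never "understands" the loop it emits; it only reproduces statistically likely continuations, which is exactly why suggestions can look plausible yet be wrong.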

The more advanced the LLM, the better it handles longer context, multi-step reasoning, and nuanced relationships between code and comments.

So when you ask what LLM Copilot uses, you’re really asking what level of reasoning and contextual awareness is available behind your editor.

The beginning: Copilot and OpenAI Codex

When GitHub Copilot launched in 2021, it was powered by OpenAI Codex.

Codex was a descendant of GPT-3, but heavily fine-tuned on publicly available source code. It was designed specifically to generate programming code across multiple languages. Compared to vanilla GPT-3, Codex was dramatically better at producing syntactically valid and contextually relevant code.

At launch, Copilot focused primarily on inline autocomplete. You could write a comment describing a function, and Codex would generate the implementation. You could start a loop or a conditional, and Copilot would fill it out.

Here’s how early Copilot aligned with its model:

| Feature | Model Used | Model Family |
| --- | --- | --- |
| Inline code completion | OpenAI Codex | GPT-3 derivative |
| Comment-to-code generation | Codex | GPT-3 derivative |
| Basic test scaffolding | Codex | GPT-3 derivative |
| Multi-line suggestions | Codex | GPT-3 derivative |

Codex was powerful, but it had limits. Long-range reasoning across files was inconsistent. Deep architectural suggestions were hit-or-miss. Context windows were relatively constrained.

That’s where newer GPT-4–class models changed the game.

The evolution toward GPT-4–class models

When OpenAI released GPT-4, it marked a substantial leap in reasoning ability, contextual depth, and structured output quality.

GPT-4 demonstrated stronger multi-step logical consistency, improved code reasoning, and better handling of complex prompts. Naturally, GitHub began integrating these advancements into Copilot.

But Copilot didn’t simply switch from Codex to GPT-4 overnight.

Instead, it evolved into a multi-model system. Today, Copilot dynamically routes tasks to different large language models depending on the feature, complexity, and subscription tier.

In simplified terms, here’s how it works:

| Copilot Capability | Model Type Typically Used |
| --- | --- |
| Real-time inline autocomplete | Optimized code-specific LLM |
| Copilot Chat | GPT-4–class model |
| Code explanation | GPT-4–class model |
| Refactoring assistance | GPT-4–class or hybrid |
| Repository-level reasoning (Enterprise) | GPT-4–class with extended context |
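Conceptually, this kind of routing is just a dispatch table. The sketch below is purely illustrative: the task names, model labels, and tier rule are assumptions for demonstration, not Copilot's actual backend configuration.

```python
# Hypothetical routing rules mirroring the capability table above.
ROUTES = {
    "inline_completion": "code-optimized-llm",
    "chat": "gpt-4-class",
    "explain": "gpt-4-class",
    "refactor": "gpt-4-class",
    "repo_reasoning": "gpt-4-class-extended-context",
}

def route(task: str, tier: str = "individual") -> str:
    """Pick a model for a task; repo-level reasoning is gated by tier."""
    if task == "repo_reasoning" and tier != "enterprise":
        raise PermissionError("Repository-level reasoning requires Enterprise.")
    # Unknown tasks fall back to the fast, cheap model.
    return ROUTES.get(task, "code-optimized-llm")

print(route("inline_completion"))             # → code-optimized-llm
print(route("repo_reasoning", "enterprise"))  # → gpt-4-class-extended-context
```

The design point is that the cheapest model that can do the job handles the request, and expensive models are reserved for requests that genuinely need them.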

This architecture allows Copilot to remain fast for typing while still being powerful for deeper reasoning.

Why Copilot doesn’t use GPT-4 for everything

It’s tempting to assume that Copilot now runs entirely on GPT-4. From a user perspective, that sounds ideal.

From a systems perspective, it would be inefficient.

GPT-4–class models are computationally expensive and slower than smaller models. If every keystroke triggered GPT-4 inference, latency would increase, and costs would rise dramatically.

Instead, Copilot uses orchestration.

Smaller, optimized models handle rapid token-by-token prediction for inline suggestions. When you invoke Copilot Chat or request a complex explanation, the system calls more advanced GPT-4–class models.

This layered strategy balances responsiveness with intelligence.

You get near-instant autocomplete when writing boilerplate. You get deeper reasoning when asking higher-level questions.

Copilot Chat: Where GPT-4 becomes visible

If you want to see GPT-4–level reasoning inside Copilot, use Copilot Chat.

Copilot Chat transforms the experience from predictive typing into interactive problem solving. You can highlight code and ask for explanations. You can request refactoring suggestions. You can generate unit tests for entire files.

These tasks require multi-step reasoning and long context awareness. GPT-4 excels in exactly those areas.

Here’s a comparison that highlights the difference between lighter inline models and GPT-4–class reasoning:

| Capability | Inline Completion Model | GPT-4–Class Model |
| --- | --- | --- |
| Token prediction speed | Very fast | Slower |
| Multi-file reasoning | Limited | Advanced |
| Detailed explanations | Basic | Structured and nuanced |
| Complex algorithm design | Inconsistent | More reliable |
| Long context handling | Moderate | Strong |

When you feel Copilot “thinking harder,” GPT-4–class models are likely involved.

Does Copilot still use Codex today?

Codex was the original foundation, but OpenAI’s model ecosystem has evolved significantly since 2021.

OpenAI deprecated the standalone Codex API in 2023, and modern Copilot implementations likely use successors built on GPT-3.5–class and GPT-4–class architectures fine-tuned specifically for coding.

In practical terms, Copilot has outgrown its original Codex identity. It now operates within a broader GPT-based ecosystem.

So if someone asks whether Copilot still runs on the original Codex model, the more accurate answer is that it has evolved into a more advanced, multi-model system.

How subscription tiers influence the LLM experience

Not all Copilot users interact with identical backend capabilities.

Copilot Individual, Business, and Enterprise tiers unlock different levels of context awareness and governance features. Enterprise users, in particular, benefit from deeper repository indexing and extended context windows.

Those enhanced capabilities are typically powered by GPT-4–class models.

Here’s a conceptual breakdown:

| Plan | Inline Model | Chat / Advanced Reasoning Model |
| --- | --- | --- |
| Individual | Optimized code LLM | GPT-4–class |
| Business | Optimized code LLM | GPT-4–class |
| Enterprise | Advanced code LLM | GPT-4–class with extended context |

Exact backend configurations may change over time, but the layered model strategy remains consistent.

Copilot vs ChatGPT: Similar foundation, different integration

Because both Copilot and ChatGPT can use GPT-4–class models, you might assume they are interchangeable.

They are not.

Copilot is deeply integrated into your IDE. It understands your file context. In enterprise settings, it can reason across repositories. It is optimized specifically for coding workflows.

ChatGPT, unless connected through APIs or plugins, does not have native awareness of your codebase.
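Much of that difference comes down to who assembles the context. Here is a hypothetical sketch of how an IDE-integrated assistant might pack the editor's open files into a prompt before calling a GPT-4–class model; the function name and prompt layout are illustrative assumptions, not Copilot's real internals.

```python
def build_ide_prompt(question: str, open_files: dict[str, str]) -> str:
    """Bundle the editor's open files into the prompt, as an IDE
    integration can do automatically. A standalone chat client would
    receive only the bare question unless the user pastes code in."""
    context = "\n\n".join(
        f"# File: {path}\n{source}" for path, source in open_files.items()
    )
    return f"{context}\n\nQuestion: {question}"

prompt = build_ide_prompt(
    "Why does parse_config return None?",
    {"config.py": "def parse_config(path):\n    ..."},
)
print(prompt.startswith("# File: config.py"))  # → True
```

Same model family on the backend, very different inputs: the IDE integration sees your code by default, while a general chat interface sees only what you type.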

Here’s how they compare at a high level:

| Feature | GitHub Copilot | ChatGPT |
| --- | --- | --- |
| Inline code completion | Yes | No |
| IDE integration | Deep | None natively |
| GPT-4 support | Yes (feature dependent) | Yes (plan dependent) |
| Repository awareness | Enterprise tier | Not native |
| Primary use case | Developer productivity | General-purpose AI |

They share model families, but the product experience differs significantly.

Why GitHub doesn’t publish exact model versions

You may notice that GitHub rarely specifies exact model version numbers for Copilot.

That’s because Copilot uses dynamic routing. Different requests may be sent to different models depending on complexity. Backend systems evolve frequently. Publishing fixed version numbers would quickly become outdated.

Copilot is a managed cloud service. Improvements roll out continuously without requiring you to upgrade anything locally.

The lack of specific version numbers does not mean GPT-4 isn’t involved. It reflects the fluid nature of AI infrastructure.

What this means for you as a developer

Understanding what LLM Copilot uses changes how you use it.

When you’re generating simple boilerplate, optimized code models handle the task quickly and efficiently. When you’re debugging or asking for architectural advice in Copilot Chat, GPT-4–class reasoning likely steps in.

That means you should adjust expectations based on the task.

Copilot is strongest when you provide clear context and treat it as a collaborator. It accelerates your workflow, but it does not replace engineering judgment.

Knowing the layered LLM architecture helps you use it more strategically.

The future of Copilot’s model stack

The question is not just what model Copilot uses today, but how it will evolve.

OpenAI continues releasing new model generations. Microsoft integrates those advancements across its ecosystem. Copilot benefits from these upgrades because it runs as a cloud-based service.

The LLM powering Copilot next year may not be the same as today’s.

What remains consistent is the orchestration strategy: combining fast code-optimized models with deeper GPT-4–class reasoning models to balance speed and intelligence.

Final answer

GitHub Copilot originally launched using OpenAI Codex, a GPT-3–derived model fine-tuned for programming tasks. Today, Copilot uses a layered architecture that incorporates GPT-4–class large language models for advanced reasoning features, especially in Copilot Chat and enterprise-level capabilities.

It does not rely on a single static LLM. Instead, it dynamically combines optimized code-specific models for rapid autocomplete with GPT-4–level models for deeper analysis and conversation.

If you want the most accurate answer, it’s this: GitHub Copilot uses a combination of code-specialized LLMs and GPT-4–class models, orchestrated intelligently depending on what you’re asking it to do.

And once you understand that, you understand not just the model name, but the system behind the experience.
