Maxim Saplin

Posted on Feb 18 • Edited on Apr 1

Ran out of Cursor tokens and switched to GitHub Copilot: Side-by-Side

#programming #ai #productivity #githubcopilot

Update, April 1 (and this is not a joke). The Insider Preview version is way more usable and capable as of now. Throughout February and March I have seen a flow of updates, and most of the concerns I bring up below are now fixed. I noticed a few Microsoft employee views on my LinkedIn in Feb; could it be this blog post turned into a backlog? :)

DISCLAIMER! The best AI coding tool is the one available to you, that gives you the best model and reasonable token limits. From the text below it might look like GitHub Copilot is a horrible product - it's not. I use Copilot and I'm productive. It's just an irritating experience when I switch from Cursor.

The banner is a screenshot from my Cursor 2025 retrospective with almost 1T tokens used - I guess one might call me a heavy user. I've been using it since 2023 and it happens to be my favourite VSCode fork. I tried different AI assisted IDEs: Kiro, Antigravity, Windsurf, Project IDX; used VSCode extensions such as Continue, Cody.

Since my monthly token limit in Cursor ran out last December, I've been spending more time with GH Copilot (the Insider Preview version with the newest features). Before that I used Copilot occasionally and mostly followed its progress through media, posts and my colleagues' discussions. It's hard to miss a major AI coding assistant like Copilot. Since 2023 I have held the opinion that GH Copilot is an inferior product that lags Cursor by ~6 months. Recently the gap in new feature releases has narrowed, yet the execution is not great.

What I don't like about Copilot

Plan Mode is a gray piece of misery compared to Cursor's implementation. I use it a lot in Cursor but see no reason to use it in Copilot. When I tried it for the first time in GH I didn't even understand that the plan had been provided - it was just a few paragraphs of text produced by a subagent, and clicking the 'Proceed' button just switched the mode to 'Agent' and pasted the text 'Proceed' into chat. All of that seemed like a waste of tokens on a subagent that made many tool calls and provided a very generic response. In Cursor you get a detailed and structured .MD plan; there's a 'Build' button allowing you to spawn a new agent in a new dialog (with a different model of choice and a clean context); or you can proceed with implementing it in the same thread.

Dialog features are poor (and it's the core of UX). For example, you can't clone dialogs or branch out from certain messages in the middle - something I used a lot in Cursor to manage the ever growing threads and context overflows. There are a few more conveniences around overall UX that are missing in GH and keep the experience irritating (e.g., jumpy prompt input, adding a selected piece of a file to the dialog was not instantly apparent due to a faint animation, etc.)

There's no manual dialog summarisation, only automatic. Here's how I got trapped by this "feature"... In the middle of a chat (and I had no idea how big the chat was, since there was no token counter; otherwise I'd have branched it into a new thread) I typed "Proceed". The implementation started, I saw a few tool calls, and then summarisation kicked in; the agent got lost and asked "What do you want me to proceed with?".

Token counter missing for too long. Insider Preview added this feature at the end of January.
- The issue requesting the feature in Copilot has been sitting since April 2025 and has collected many reactions. Cursor has had the context window usage indicator since I can't remember when.
Shorter context windows. For example, the GPT-5 family has a 272K input limit and Anthropic's Claude models by default allow for 200K total context size. I had this perception that in Copilot my dialogs hit the summarisation threshold sooner than in Cursor - turns out there's a reason for that. Why have these low defaults?

Gemini 3 Pro instability. My favourite model of November randomly threw errors in longer dialogs - trying again didn't help; I had to drop those dialogs or switch models. I never noticed this instability in Cursor.
GitHub instructions look inferior to Cursor's rules. For example, there are no semantic rules - where an agent pulls relevant instructions automatically. I even had to do a small workaround for that handy feature. Recently Insider Preview added support of Agent Skills which does exactly that, yet
Piling-up legacy in prompts management. There are instructions, chat modes, different approaches to prompts - recently, when doing a cleanup in our team's repo where GH Copilot was used, there were a lot of questions around "how do I do my guardrails properly". A good example in my opinion is how Cursor dropped its Rules discipline, making Agent Skills the default choice, and instantly provided a migration path for existing Cursor rules/commands.
- This also gives another example of a half-baked feature in Copilot. Agent Skills in Copilot are automatic only - the model decides when the skill is pulled into the thread. And for some reason there's no way to explicitly reference the skill. We used /spec and /task slash commands for Spec-Driven development, and those are called explicitly. When introducing Agent Skills, Cursor added both options for using them - automatically or via slash commands.
Missing Multi-model parallel agents - Cursor allows you to pick several models to process a single prompt; each one creates a Git worktree and you can proceed working in the worktree you liked the most. Copilot has a Background agent feature allowing you to spin up a new GH Copilot CLI agent - while it also relies on a worktree it doesn't give the same convenience.

Getting newer models can be slow. GH announcements of model availability in Copilot come the same day the model is introduced. Yet it's often opt-in: Copilot subscription admins have to enable new models manually. In the case of Cursor I learn about new model releases from its model picker.
No choice of reasoning effort for models. For example, for GPT-5.2 there's only a single line in the picker, while in Cursor there are 8 options (low, medium, high, xhigh, and then the same four with the -fast suffix, which is twice as expensive but faster). Technically, one can switch reasoning effort to "High" for OpenAI models, though only under the experimental setting "Chat: Responses Api Reasoning Effort", which is a rather awkward and hard-to-reach feature.

Restoring checkpoints can be unreliable. I ended up with a broken solution a few times when going back in chat history. Frankly, it is not always reliable in Cursor either; sometimes agents tend to make changes bypassing standard edit tools. It just seems GH checkpoint restoring was less reliable.
System prompts seem awkward and less effective. For instance, in Copilot I often get the agent responding with a "Plan" section after it completes a long thread. Essentially it fills the top of its report with a scroll of what the plan was. Who cares when the job is done? Very confusing after switching from Cursor. Besides, when using Copilot in CLI it often gets the intent wrong and doesn't produce the right command, requiring further interaction.

The recent Cursor release of subagents is yet to be matched by Copilot. The UX is better; the whole orchestration seems more polished. See below how in Cursor I kicked off parallel agents in their own worktrees which in turn kicked off subagents - all in one click. Compare to the very simplistic GH variant:

Models in Copilot can't view image files - you can only paste an image into chat; this way they do see images, otherwise they are blind. Use case? Using ADB to take screenshots and saving them in PNG for further inspection - it took me hours running failing verification loops before I realized Copilot lacked that trivial ability. Cursor does this well.
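
For reference, the capture half of that ADB verification loop might look roughly like the Python sketch below. It relies on the standard `adb exec-out screencap -p` command, which streams a PNG to stdout; the helper names (`screencap_command`, `is_png`, `save_screenshot`) and the injectable `run` parameter are my own illustration, not anything Copilot or Cursor ships. Whether the agent can then actually open the saved PNG is exactly the part that depends on the tool.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Optional

# Every valid PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def screencap_command(serial: Optional[str] = None) -> List[str]:
    """Build the adb command that streams a PNG screenshot to stdout."""
    cmd = ["adb"]
    if serial:
        # -s targets a specific device when several are attached.
        cmd += ["-s", serial]
    cmd += ["exec-out", "screencap", "-p"]
    return cmd


def is_png(data: bytes) -> bool:
    """Cheap sanity check so error text from adb is not saved as an image."""
    return data.startswith(PNG_SIGNATURE)


def save_screenshot(
    path: str,
    serial: Optional[str] = None,
    run: Callable = subprocess.run,  # injectable so the helper can be tested without a device
) -> Path:
    """Capture a screenshot through adb and write it to `path`."""
    result = run(screencap_command(serial), capture_output=True, check=True)
    if not is_png(result.stdout):
        raise ValueError("adb did not return PNG data")
    out = Path(path)
    out.write_bytes(result.stdout)
    return out
```

With a device attached, `save_screenshot("screen.png")` writes the file the agent is then asked to inspect. The PNG signature check is there because a verification loop that silently saves an error message as an image fails in a confusing way.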

What I Like about Copilot

(Long awaited) Token counter gives a breakdown. It's curious to observe how agentic coding has recently leaped forward due to verification - you can easily check how much of the dialog is occupied by tool call results.

You can inspect prompts - under "Output > GitHub Copilot Chat" you can view very detailed LLM traces. For example, you can see what sort of prompts are used to wrap your interactions, which might be useful, especially if you like tinkering.

Open about standard tools - there's no UI in Cursor to control standard tool selection, only MCP ones. In Copilot, if you are up for tinkering, you can configure tool bundles and see their exact names. For example, I often explicitly ask GH to use the runSubagent tool to delegate to subagents - works like a charm for bigger tasks.

Kinda open-source - while the back-end part has not been open-sourced, the extension has been. Besides, many AI coding assistant features have been merged into vscode directly, making the creation of third-party extensions much easier. Though it's a pity that GH Copilot always requires a sign-in, which locks out true local LLM use - the ticket for that is very popular and has been sitting for almost a year.
Easier installation of MCP - I found the integration in GH easier (button click); with Cursor I had to update config files.
Ecosystem and integration with GitHub - you have Copilot integrated in the GH web app; you can easily assign issues to Cloud agents via your phone while browsing GitHub; the extension is accessible in plenty of IDEs (though people say non-VSCode IDEs struggle with feature parity). They have recently added support for Claude Code and Codex, allowing you to run other major coding agents through a GH subscription. The breadth and outreach of Copilot is great.

More tokens - it feels like GH's premium requests model allows for more usage compared to Cursor's token-based pricing. Unfortunately there's no user-facing dashboard in Copilot to draw a clear comparison.

From the Creators of SharePoint...

Pun intended. Corporate touch adds a certain flavour making software disgusting. SharePoint or Dynamics CRM are in my view classical examples - ugly UI, slow. The ".aspx" extensions in URLs are a reminder of the decades-old ASP.NET Web Forms used to build them.

Somehow GitHub Copilot follows in the steps of other corporate products... It often feels like software that is created by people who (a) don't use it and (b) don't care. A product built by a slideware company.

Just recently this "don't care" approach surfaced when a user discovered an exploit to bypass billing. That was hilarious! A vulnerability report was submitted privately to Microsoft Security Response Center; the folks there said that billing wasn't their responsibility and advised creating a ticket on a public GitHub repo - where everyone could see the exploit and free-ride on Microsoft's tokens. And even after that the GH issue got closed automatically by some AI bot. A few days later it was re-opened after the exploit received public attention and media coverage.

Copilot vs Others might be yet another Harvard Business School case study on how a large established company turns slow and loses touch with the market, while more nimble and energetic startups build better products.

Cursor's Apple Magic

"It just works" often comes to my mind when I use Cursor. There aren't that many options and toggles. They like building minimalist and refined UI (one of the reasons I don't like GitHub - because it's often ugly to my eye). A small example, Copilot in CLI:

Vs. Cursor:

There's a bit of closedness and secrecy at AnySphere. Take for example their Composer release where they compare their model to an unnamed best-on-the-market model and vaguely describe what they did - not even mentioning what the context window size for the new model is. Or how they implemented the "use your own API key" feature when they process all LLM requests on their back-end making use within a closed perimeter impossible.

Apple vs. Microsoft, iOS vs. Android, startup vs. enterprise - all those analogies sum up my impressions when comparing Cursor to Copilot.

Top comments (19)

Ben Halpern • Feb 18

Great writeup. Very relatable initial premise.

Mykola Kondratiuk • Feb 19

hit this exact wall. cursor's fast-apply mode eats tokens insanely fast on anything with large files - one decent refactor session and you're done for the day. copilot's subscription model removes that anxiety completely which is genuinely underrated. but i kept missing cursor's codebase awareness - the way it just knows what you're working on without you having to explain context every time. ended up going back and being way more deliberate about when i trigger expensive operations

Kai Alder • Feb 22

The config portability problem Ned mentioned is what kills me. I've been burned by this enough that I now keep a tool-agnostic AGENTS.md at the project root — both Cursor and Copilot pick it up, and if I need to bail to Claude Code or something else, the context carries over.

Your point about Copilot's plan mode being a "gray piece of misery" made me laugh. I tried it once, got a wall of text from a subagent that basically restated my prompt, clicked Proceed, and it just... started over. Never used it again. Cursor's structured .MD plan with the Build button is genuinely one of the best features in any coding tool right now.

One thing I'm curious about — with nearly 1T tokens used in 2025, what's your monthly Cursor bill looking like? I've been hesitant to go all-in on agentic loops because the token burn gets wild fast, especially with subagents. Do you find the productivity gain justifies it vs being more deliberate with prompts?

Maxim Saplin • Feb 25

Last year I had $150 monthly credit in Cursor and by the end of the year I started chronically hitting the limit... Now at $300 and still not enough to last even 2 weeks, relying on my GH Copilot subscription more due to that

MaxxMini • Feb 18

Interesting comparison! I hit the same token limit wall with Cursor and ended up going a completely different direction - CLI-based agents (Claude Code) orchestrating sub-agents for parallel tasks instead of relying on IDE integrations. The biggest win was automating the workflow around coding: draft generation, deployment verification, health checks, all chained together. Went from manually doing everything to 80+ automation scripts in a week. Curious if you have tried any non-IDE approaches? The Plan Mode frustration you described is exactly why I moved away from IDE-embedded AI - too many layers between intent and execution.

Maxim Saplin • Feb 19

I use OpenCode quite often, even created a skill to launch resumable subagents powered by OpenCode. CLI tools are great if you want to build your own pipelines for sure. In terms of general use, I see no major difference in capability, be it GUI or TUI. It's a question of preference and token availability

MaxxMini • Feb 19

Good point about GUI vs TUI being mostly preference — I agree there's no fundamental capability gap anymore. OpenCode looks interesting, hadn't seen the resumable subagent pattern before. How do you handle context window limits when your subagents are working on larger codebases? That's been my main pain point with CLI agents — they burn through tokens fast on big repos even with good .clinerules or AGENTS.md scoping.

Maxim Saplin • Feb 19

Frankly I have never seen subagents overflow context, that's not that easy. I made multiple attempts and am still exploring long-horizon subagent orchestration: github.com/maxim-saplin/hyperlink_...

So far the implementations I saw (runSubagent tool in GH and task in OpenCode) do simple request/response disposable dialogs where an agent session is started and then the main agent that spun up the subagent only cares about the final result.

Sam Winchester • Feb 24 • Edited

I’ve been debating making the switch myself because of the token limits. The breakdown of the 'Checkpoint' reliability is a huge factor I hadn't considered. Definitely leaning towards keeping both installed for different use cases now. 🚀 Great write-up!

Matthew Hou • Feb 20

The token anxiety is real. I've started thinking of Cursor credits the same way I think about AWS costs — you don't realize how much you're burning until the bill comes.

The interesting thing in your comparison: Copilot's advantage is the deep VS Code integration. When it works, it feels like the IDE itself understands what you're trying to do. But Cursor's context handling for multi-file changes is noticeably better. I end up using both depending on the task type.

What's your usage pattern — mostly completions or more chat-based generation?

Maxim Saplin • Feb 20

Don't do completions at all, mostly agentic loops and relying more on subagents to squeeze more scope into a single thread

Matthew Hou • Feb 21

Token cost tracking needs to be built in from day one, not added after the first surprise invoice. The asymmetry is that costs scale with task complexity, not user count — which breaks the mental model most people have from pricing cloud compute.

member_fc281ffe • Feb 22

The plan mode difference is the most telling signal in this comparison. Structured output with an explicit build step versus a paragraph response maps directly to how well each tool fits into an existing workflow versus expecting you to adapt to it. The deeper question is whether the IDE integration advantage compounds over time or whether the two converge as agents become more autonomous.

Ned C • Feb 20

switching between tools sucks because none of the config transfers. i've got .cursor/rules files tuned for how i work, and if i switch to copilot or claude code for a week, all of that context is gone. i have to re-explain my project conventions from scratch every session.

i built a linter for cursor rules partly because of this. wanted to at least know if my rules were valid before i invested time tuning them for a tool i might have to abandon next month. the whole ecosystem feels like it's one pricing change away from forcing a migration nobody's config is ready for.

Maxim Saplin • Feb 20

Cursor deprecated rules and proposed to migrate to agent skills; putting skills in .claude/skills makes those skills discoverable by major tools. Same goes for AGENTS.md as an alternative to copilot-instructions

Ned C • Feb 21

rules aren't deprecated though. cursor's docs still have a full active page for them with four types (project rules, user rules, team rules, AGENTS.md), and the v2.4 changelog explicitly positions skills as complementary: "compared to always-on, declarative rules, skills are better for dynamic context discovery and procedural how-to instructions."

there is a /migrate-to-skills command, but from what i can tell it converts dynamic rules and slash commands into skills, not all rules. the use cases are different: rules are declarative and always-on ("use TypeScript strict mode"), skills are procedural and on-demand ("here's how to deploy to AWS").

you're right about .claude/skills/ being cross-tool discoverable though. cursor auto-discovers .claude/skills/, .codex/skills/, and .cursor/skills/. that part of the ecosystem is converging. but for the kind of stuff most people put in .cursor/rules/ (coding conventions, style enforcement, framework patterns), rules are still the right tool.

Harjot Singh • May 30

Switching the moment you hit the token wall is the rational move, but notice the pattern - you didn't switch because Copilot was better, you switched because the meter forced you. That's reactive migration, and it costs you the relearning tax every time a provider changes limits or pricing.

The side-by-side is genuinely useful, but the durable answer is to not be locked to either: route work to whatever backend is cheapest/best per task, so "ran out of tokens on X" becomes "fall back to Y automatically" instead of a manual scramble and a new tool to learn. The comparison data you gathered is exactly what you'd feed into that routing decision. Curious whether Copilot actually held up on the harder tasks or just the everyday stuff?
