<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Xunxing Mao</title>
    <description>The latest articles on DEV Community by Xunxing Mao (@xunxing_mao_fac71e331fd4b).</description>
    <link>https://dev.to/xunxing_mao_fac71e331fd4b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3877609%2Fd1f0cdd3-e0f6-41ad-8164-f98be388e5a7.jpg</url>
      <title>DEV Community: Xunxing Mao</title>
      <link>https://dev.to/xunxing_mao_fac71e331fd4b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/xunxing_mao_fac71e331fd4b"/>
    <language>en</language>
    <item>
      <title>In the Age of AI, What Terminal Tools Should We Be Using?</title>
      <dc:creator>Xunxing Mao</dc:creator>
      <pubDate>Thu, 30 Apr 2026 01:32:36 +0000</pubDate>
      <link>https://dev.to/xunxing_mao_fac71e331fd4b/in-the-age-of-ai-what-terminal-tools-should-we-be-using-5ch6</link>
      <guid>https://dev.to/xunxing_mao_fac71e331fd4b/in-the-age-of-ai-what-terminal-tools-should-we-be-using-5ch6</guid>
      <description>&lt;p&gt;This article was originally published on &lt;a href="https://maoxunxing.com" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Today I want to talk about something every developer uses daily, but few people really think deeply about: the command line.&lt;/p&gt;

&lt;p&gt;On macOS, the classic setup has long been:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;iTerm2 + Oh My Zsh&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;iTerm2 gives you a better terminal experience—tabs, panes, profiles—while Oh My Zsh makes Zsh more usable with plugins, themes, and autocomplete.&lt;/p&gt;

&lt;p&gt;This setup worked extremely well.&lt;/p&gt;

&lt;p&gt;But in the age of AI, the role of the terminal is changing.&lt;/p&gt;

&lt;p&gt;It’s no longer just a place to “run commands.”&lt;/p&gt;

&lt;p&gt;It’s becoming a &lt;strong&gt;development entry point&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;You run projects, inspect logs, manage Git, launch AI agents, and even let AI modify your code—all inside the terminal.&lt;/p&gt;

&lt;p&gt;So the real question is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;In the AI era, what terminal tools should we actually be using?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The key insight is this:&lt;/p&gt;

&lt;p&gt;You shouldn’t think in terms of a single tool—you should think in layers.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Terminal Emulator&lt;/strong&gt; → iTerm2, Ghostty, Warp&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shell&lt;/strong&gt; → Zsh, Fish, Bash&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workflow Tools&lt;/strong&gt; → tmux, LazyGit, fzf&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Layer&lt;/strong&gt; → Warp AI, Claude Code, Codex CLI, Gemini CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What you’re really building is a &lt;strong&gt;terminal workflow system&lt;/strong&gt;, not picking a single app.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. iTerm2: Stable, Mature—but Not AI-Native
&lt;/h2&gt;

&lt;p&gt;If you’re on macOS, iTerm2 is still a very solid choice.&lt;/p&gt;

&lt;p&gt;Its strengths are obvious:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mature and stable&lt;/li&gt;
&lt;li&gt;Rich features (split panes, tabs, profiles, shortcuts)&lt;/li&gt;
&lt;li&gt;Highly customizable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But it follows a &lt;strong&gt;traditional model&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;It’s optimized for &lt;em&gt;humans typing commands&lt;/em&gt;, not &lt;em&gt;AI-assisted workflows&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;If your workflow is still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Running services&lt;/li&gt;
&lt;li&gt;SSH&lt;/li&gt;
&lt;li&gt;Git commands&lt;/li&gt;
&lt;li&gt;Logs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then iTerm2 is more than enough.&lt;/p&gt;

&lt;p&gt;But if you’re heavily using AI tools like Cursor, Claude Code, Codex CLI, or Gemini CLI, iTerm2 itself doesn’t give you much AI leverage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;iTerm2 is perfect for conservative users—stable, reliable, no surprises.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2. Oh My Zsh: Powerful, but Easy to Overdo
&lt;/h2&gt;

&lt;p&gt;Oh My Zsh is often the first upgrade people install.&lt;/p&gt;

&lt;p&gt;It’s a community-driven framework with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;300+ plugins&lt;/li&gt;
&lt;li&gt;140+ themes&lt;/li&gt;
&lt;li&gt;Built-in Git integrations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It makes Zsh much easier to use out of the box.&lt;/p&gt;

&lt;p&gt;But there’s a trap:&lt;/p&gt;

&lt;p&gt;You can easily over-configure it.&lt;/p&gt;

&lt;p&gt;Too many plugins → slower startup → harder debugging.&lt;/p&gt;

&lt;p&gt;The command line should increase efficiency—not become a hobby.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Use it, but keep it minimal.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Only keep what actually helps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Git plugin&lt;/li&gt;
&lt;li&gt;Autosuggestions&lt;/li&gt;
&lt;li&gt;Syntax highlighting&lt;/li&gt;
&lt;/ul&gt;
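
&lt;p&gt;A minimal &lt;code&gt;.zshrc&lt;/code&gt; along those lines might look like this (a sketch: it assumes the standard Oh My Zsh layout, with the community &lt;code&gt;zsh-autosuggestions&lt;/code&gt; and &lt;code&gt;zsh-syntax-highlighting&lt;/code&gt; plugins cloned into &lt;code&gt;$ZSH_CUSTOM/plugins&lt;/code&gt;):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# ~/.zshrc -- keep the plugin list short
export ZSH="$HOME/.oh-my-zsh"
ZSH_THEME="robbyrussell"   # the lightweight default theme
plugins=(git zsh-autosuggestions zsh-syntax-highlighting)
source "$ZSH/oh-my-zsh.sh"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If startup ever feels slow, &lt;code&gt;time zsh -i -c exit&lt;/code&gt; tells you exactly how much your config costs.&lt;/p&gt;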




&lt;h2&gt;
  
  
  3. Fish: A More Modern Shell Experience
&lt;/h2&gt;

&lt;p&gt;Fish is a very underrated option.&lt;/p&gt;

&lt;p&gt;Its philosophy is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Make the shell good by default.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Out of the box, Fish gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Syntax highlighting&lt;/li&gt;
&lt;li&gt;Smart autocomplete&lt;/li&gt;
&lt;li&gt;Inline suggestions&lt;/li&gt;
&lt;li&gt;Better tab completion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don’t need heavy configuration.&lt;/p&gt;

&lt;p&gt;This is a big deal.&lt;/p&gt;

&lt;p&gt;Zsh is powerful—but often requires setup.&lt;/p&gt;

&lt;p&gt;Fish feels like a &lt;strong&gt;modern shell from day one&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The downside:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not fully POSIX-compatible&lt;/li&gt;
&lt;li&gt;Some scripts may not run directly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best use case:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Use Fish for daily interaction, and Bash/Zsh for scripting.&lt;/p&gt;
&lt;/blockquote&gt;
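
&lt;p&gt;On macOS with Homebrew, that split takes a few commands (a sketch; scripts keep their own interpreter via their shebang lines, so nothing breaks):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Install Fish and make it the interactive login shell
brew install fish
# Register it as an allowed shell, then switch to it
echo "$(brew --prefix)/bin/fish" | sudo tee -a /etc/shells
chsh -s "$(brew --prefix)/bin/fish"
# Bash/Zsh scripts still run unchanged: their #!/bin/bash or
# #!/bin/zsh shebang picks the interpreter, not your login shell
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;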




&lt;h2&gt;
  
  
  4. Ghostty: Fast, Clean, and Native
&lt;/h2&gt;

&lt;p&gt;Ghostty is one of my current favorites.&lt;/p&gt;

&lt;p&gt;It’s not an AI tool—it’s a &lt;strong&gt;modern terminal emulator&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Its core strengths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extremely fast&lt;/li&gt;
&lt;li&gt;Lightweight&lt;/li&gt;
&lt;li&gt;Native UI (not Electron-heavy)&lt;/li&gt;
&lt;li&gt;Great pane management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ghostty doesn’t try to define your workflow.&lt;/p&gt;

&lt;p&gt;It just gives you a &lt;strong&gt;high-performance container&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You can plug in whatever you want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fish or Zsh&lt;/li&gt;
&lt;li&gt;tmux&lt;/li&gt;
&lt;li&gt;Neovim&lt;/li&gt;
&lt;li&gt;Claude Code / Codex CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The downside:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No built-in AI layer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need to bring your own tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Ghostty is ideal if you want speed, control, and flexibility.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  5. Warp: The AI-Native Terminal
&lt;/h2&gt;

&lt;p&gt;Warp is not just a terminal—it’s an &lt;strong&gt;AI development environment&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It’s designed for what people now call &lt;em&gt;agentic development&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Warp integrates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Built-in AI assistance&lt;/li&gt;
&lt;li&gt;Command explanations&lt;/li&gt;
&lt;li&gt;Error summarization&lt;/li&gt;
&lt;li&gt;Smart command generation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And it supports external agents like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code&lt;/li&gt;
&lt;li&gt;Codex CLI&lt;/li&gt;
&lt;li&gt;Gemini CLI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Recently, Warp also open-sourced its client, signaling a strong push toward becoming an &lt;strong&gt;AI-first terminal platform&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Warp stands out:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;AI deeply integrated&lt;/li&gt;
&lt;li&gt;Modern UI (block-based instead of raw text stream)&lt;/li&gt;
&lt;li&gt;Lower barrier for beginners&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Downsides:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Heavier than traditional terminals&lt;/li&gt;
&lt;li&gt;Opinionated workflow&lt;/li&gt;
&lt;li&gt;Can make you overly reliant on AI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Warp is for people who want their terminal to be an AI workspace.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  6. tmux: Old-School, Still Powerful
&lt;/h2&gt;

&lt;p&gt;tmux is a classic.&lt;/p&gt;

&lt;p&gt;It lets you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run multiple sessions&lt;/li&gt;
&lt;li&gt;Detach and reattach&lt;/li&gt;
&lt;li&gt;Keep processes alive remotely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s especially useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SSH workflows&lt;/li&gt;
&lt;li&gt;Long-running tasks&lt;/li&gt;
&lt;li&gt;Server environments&lt;/li&gt;
&lt;/ul&gt;
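
&lt;p&gt;The detach/reattach cycle is the whole trick. A typical remote session looks like this (the session name &lt;code&gt;api&lt;/code&gt; is just an example):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tmux new -s api        # start a named session and launch your service in it
# ...press Ctrl-b d to detach; the process keeps running on the server...
tmux ls                # after reconnecting over SSH, list live sessions
tmux attach -t api     # reattach and pick up exactly where you left off
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;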

&lt;p&gt;The downside:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Steep learning curve&lt;/li&gt;
&lt;li&gt;Keyboard-heavy mental model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommendation:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Skip tmux for local dev—but it’s still essential for remote work.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  7. LazyGit: The Git Productivity Booster
&lt;/h2&gt;

&lt;p&gt;LazyGit is a terminal UI for Git.&lt;/p&gt;

&lt;p&gt;Instead of memorizing commands, you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Visual diff&lt;/li&gt;
&lt;li&gt;Easy staging&lt;/li&gt;
&lt;li&gt;Commit management&lt;/li&gt;
&lt;li&gt;Branch control&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It sits between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Raw CLI (too manual)&lt;/li&gt;
&lt;li&gt;GUI tools (too detached)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s perfect for developers who:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use Git frequently&lt;/li&gt;
&lt;li&gt;Want speed without losing control&lt;/li&gt;
&lt;/ul&gt;
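
&lt;p&gt;Getting started takes two lines (Homebrew shown here; lazygit is also distributed through most other package managers), and a short alias keeps it within reach:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;brew install lazygit   # or: go install github.com/jesseduffield/lazygit@latest
alias lg=lazygit       # drop this in your shell config
lg                     # opens the TUI inside the current repository
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;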

&lt;p&gt;But remember:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;LazyGit doesn’t replace Git knowledge—it amplifies it.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  8. So What Should You Use?
&lt;/h2&gt;

&lt;p&gt;Here are three practical setups:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Conservative Setup
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;iTerm2 + Oh My Zsh + LazyGit&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stable&lt;/li&gt;
&lt;li&gt;Familiar&lt;/li&gt;
&lt;li&gt;Low friction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Perfect for most developers.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Lightweight Setup (My Preference)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Ghostty + Fish + LazyGit&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fast and clean&lt;/li&gt;
&lt;li&gt;Minimal configuration&lt;/li&gt;
&lt;li&gt;Flexible AI integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Core idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Keep the terminal light, plug in AI tools when needed.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  3. AI-Native Setup
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Warp + AI Agents + LazyGit&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deep AI integration&lt;/li&gt;
&lt;li&gt;Command intelligence&lt;/li&gt;
&lt;li&gt;Modern workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Core idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Turn your terminal into an AI workspace.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  9. My Personal Choice
&lt;/h2&gt;

&lt;p&gt;If I had to summarize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stick with &lt;strong&gt;iTerm2&lt;/strong&gt; → if you want stability&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Ghostty&lt;/strong&gt; → if you want speed and simplicity&lt;/li&gt;
&lt;li&gt;Try &lt;strong&gt;Warp&lt;/strong&gt; → if you want AI-first workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Shell-wise:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;Fish&lt;/strong&gt; → for better UX&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Zsh&lt;/strong&gt; → if you’re already invested&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And always:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Install LazyGit—it’s worth it.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;The terminal isn’t going away.&lt;/p&gt;

&lt;p&gt;If anything, it’s becoming more important.&lt;/p&gt;

&lt;p&gt;Because AI agents ultimately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run commands&lt;/li&gt;
&lt;li&gt;Read files&lt;/li&gt;
&lt;li&gt;Modify code&lt;/li&gt;
&lt;li&gt;Execute workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The terminal used to be:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A bridge between humans and machines&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now it’s becoming:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A shared workspace between humans, AI, and machines&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So choosing your terminal is no longer about picking a “black window.”&lt;/p&gt;

&lt;p&gt;It’s about choosing &lt;strong&gt;how you work&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And in the AI era, that choice matters more than ever.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>AI Agent Beginner's Guide: From Concepts to Protocols</title>
      <dc:creator>Xunxing Mao</dc:creator>
      <pubDate>Mon, 27 Apr 2026 01:08:42 +0000</pubDate>
      <link>https://dev.to/xunxing_mao_fac71e331fd4b/ai-agent-beginners-guide-from-concepts-to-protocols-i1j</link>
      <guid>https://dev.to/xunxing_mao_fac71e331fd4b/ai-agent-beginners-guide-from-concepts-to-protocols-i1j</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;strong&gt;&lt;a href="https://maoxunxing.com/ai-agent-guide/" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt;&lt;/strong&gt;. Follow me there for more deep dives on AI agents and engineering practices.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Agent vs Copilot
&lt;/h2&gt;

&lt;p&gt;This is the key distinction for understanding the current forms of AI tools:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Agent (Driver)&lt;/th&gt;
&lt;th&gt;Copilot (Co-driver)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Role&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Autonomous decision-making and execution&lt;/td&gt;
&lt;td&gt;Assists, follows instructions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Behavior&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Perceive environment -&amp;gt; Plan -&amp;gt; Execute -&amp;gt; Achieve goal&lt;/td&gt;
&lt;td&gt;You give instructions -&amp;gt; It helps complete the task&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;User Requirements&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Define the goal and boundaries&lt;/td&gt;
&lt;td&gt;Needs prompting skills, domain knowledge, ongoing exploration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Process&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Non-deterministic, dynamic decisions&lt;/td&gt;
&lt;td&gt;Relatively fixed, human-driven&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Why We Need Agents
&lt;/h2&gt;

&lt;p&gt;Models cannot answer every question. For instance, events after the training knowledge cutoff are simply unknown to the model.&lt;/p&gt;

&lt;p&gt;The evolution path:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Pure LLM
  |
Combining APIs for answers (Compound AI System / Agentic System) -&amp;gt; Fixed Pipeline
  |
Agents -&amp;gt; LLM makes dynamic decisions based on task goals and environment; the process is non-deterministic
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key distinction: a Pipeline is a fixed workflow; an Agent is a dynamic workflow. An Agent decides what to do next in real time based on the current environment and task goals.&lt;/p&gt;




&lt;h2&gt;
  
  
  Agent Architecture Patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Single Agent
&lt;/h3&gt;

&lt;p&gt;A single LLM instance that autonomously perceives the environment, plans, executes, provides feedback, and completes end-to-end tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Master-Worker pattern&lt;/strong&gt;: One master Agent handles planning and task dispatch, while multiple worker Agents execute specific subtasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Peer collaboration pattern&lt;/strong&gt;: All Agents can make decisions, determining who acts based on capability and context&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Computer Use / Web Agent
&lt;/h3&gt;

&lt;p&gt;The difference from traditional automated testing: traditional test instructions are written by humans, while Web Agent instructions are &lt;strong&gt;dynamically generated&lt;/strong&gt; by the Agent. An Agent can operate browsers and desktop applications like a human, autonomously deciding the operation path based on objectives.&lt;/p&gt;




&lt;h2&gt;
  
  
  Protocol Layer: MCP and SGP
&lt;/h2&gt;

&lt;p&gt;For AI Agents to work in practice, they need to interface with external resources and services. There are two key protocol directions:&lt;/p&gt;

&lt;h3&gt;
  
  
  MCP (Model Context Protocol)
&lt;/h3&gt;

&lt;p&gt;The problem it solves: &lt;strong&gt;How to integrate local computer resources?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RPA (Robotic Process Automation)&lt;/li&gt;
&lt;li&gt;Local documents (Local Doc)&lt;/li&gt;
&lt;li&gt;Local software (Local Software)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MCP enables Agents to call local capabilities without routing everything through the cloud.&lt;/p&gt;
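
&lt;p&gt;Concretely, an Agent host discovers these local capabilities through a small config file. One well-known shape is Claude Desktop's &lt;code&gt;claude_desktop_config.json&lt;/code&gt; (a sketch; the directory path is a placeholder):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/your/docs"]
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The host launches each server as a local process and the model calls its tools over the protocol, so documents never have to leave the machine.&lt;/p&gt;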

&lt;h3&gt;
  
  
  SGP (Standard Gateway Protocol)
&lt;/h3&gt;

&lt;p&gt;The problem it solves: &lt;strong&gt;How to connect remote services?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API services&lt;/li&gt;
&lt;li&gt;Remote models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SGP unifies the interaction protocol between Agents and remote services.&lt;/p&gt;




&lt;h2&gt;
  
  
  Implementation Strategy and Business Opportunities
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Architecture-Level Opportunities Belong to Small Businesses
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Data processing architecture + systems engineering + application paradigms are &lt;strong&gt;uncertain&lt;/strong&gt; -- this is where small entrepreneurs have opportunities, filling gaps that models alone cannot handle. The bottom layer consists of numerous models. But model and algorithm paradigms are &lt;strong&gt;certain&lt;/strong&gt; -- that is what big companies do.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A notable finding: large models distilled into small models perform worse than small models trained from scratch.&lt;/p&gt;

&lt;h3&gt;
  
  
  A New Paradigm for Personalized Recommendations
&lt;/h3&gt;

&lt;p&gt;Implementing personalized recommendations through Agents rather than traditional search/recommendation algorithms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Models handle knowledge and general awareness&lt;/li&gt;
&lt;li&gt;On-device processing for privacy without cloud dependency (~1B parameters, roughly 300-500 MB of memory)&lt;/li&gt;
&lt;li&gt;Search demands ~25 ms response times, while model latency better suits asynchronous IM scenarios&lt;/li&gt;
&lt;li&gt;Hyper-personalization: every user sees a different UI&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Pick Vertical Scenarios, Don't Spray and Pray
&lt;/h3&gt;

&lt;p&gt;Don't try to do everything at once. Choose vertical scenarios and go deep.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pipeline Thinking
&lt;/h3&gt;

&lt;p&gt;Don't be constrained by paradigms; instead consider what it can do and experience it firsthand:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For example: use a model to identify intent -&amp;gt; use a cutout tool for image segmentation -&amp;gt; use other tools for subsequent processing. This is a Pipeline where data + engineering + algorithms can work as a full stack. Capability doesn't need to be 100%; 50% is OK.&lt;/p&gt;
&lt;/blockquote&gt;
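
&lt;p&gt;As a shell sketch, that pipeline is just three steps chained together (all three command names here are hypothetical placeholders, not real tools):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# intent -&gt; segmentation -&gt; post-processing (hypothetical commands)
intent=$(classify-intent "remove the background from photo.jpg")
if [ "$intent" = "cutout" ]; then
  segment-image photo.jpg --out mask.png      # dedicated cutout model
  compose-result photo.jpg mask.png --out final.png
fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each stage only has to be good enough; the pipeline as a whole is the product.&lt;/p&gt;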

&lt;h3&gt;
  
  
  Text as a Service
&lt;/h3&gt;

&lt;p&gt;FaaS is a service, Serverless is a service, and &lt;strong&gt;text can also be a service&lt;/strong&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Push me five AI news items every day, English only and meeting certain criteria, then execute a task to publish them to a WeChat Official Account ("public account") -- this can become a service. Implement the code yourself and it becomes a service. Code is productivity.&lt;/p&gt;
&lt;/blockquote&gt;
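
&lt;p&gt;The simplest version of that service is a prompt plus a scheduler (a sketch; the script name and flags are hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# crontab entry: every day at 09:00, fetch five English AI news items
# and publish them (ai-news-digest is a hypothetical script you write)
0 9 * * * /usr/local/bin/ai-news-digest --count 5 --lang en --publish
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;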

&lt;h3&gt;
  
  
  Business Scenario Examples
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Public sentiment and comments&lt;/strong&gt;: Extracting key signals buried in massive amounts of information&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure tools&lt;/strong&gt;: All of them are usable; what matters is the overall solution and how future-ready it is&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Practical References
&lt;/h2&gt;

&lt;h3&gt;
  
  
  fastrtc
&lt;/h3&gt;

&lt;p&gt;A real-time communication framework suitable for building voice/video Agents: &lt;a href="https://fastrtc.org/cookbook/" rel="noopener noreferrer"&gt;fastrtc.org/cookbook&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/engineering/building-effective-agents" rel="noopener noreferrer"&gt;Building Effective Agents — Anthropic&lt;/a&gt; — Anthropic's guide to building effective AI agents with practical patterns&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;Model Context Protocol (MCP) Specification&lt;/a&gt; — Official specification for MCP, the open protocol for AI-tool integration&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://python.langchain.com/docs/concepts/agents/" rel="noopener noreferrer"&gt;LangChain Agent Documentation&lt;/a&gt; — Comprehensive documentation on building AI agents with LangChain framework&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;p&gt;If you want a practical AI coding workflow, check out my&lt;br&gt;
&lt;a href="https://maoxunxing.com/ai-coding-practice/" rel="noopener noreferrer"&gt;AI Coding Playbook: Tool Selection, Workflows, and Prompt Templates&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you're exploring team-level impact after personal productivity gains, read&lt;br&gt;
&lt;a href="https://maoxunxing.com/ai-rewriting-workflow/" rel="noopener noreferrer"&gt;AI Is Rewriting the Playbook&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Felix Mao (毛毛星) | &lt;a href="https://maoxunxing.com" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt; | &lt;a href="https://twitter.com/maoxunxing" rel="noopener noreferrer"&gt;@maoxunxing&lt;/a&gt; | &lt;a href="https://github.com/XingMXTeam/" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Which AI Coding Tool Should You Choose? 2026 Comprehensive Comparison Guide</title>
      <dc:creator>Xunxing Mao</dc:creator>
      <pubDate>Fri, 24 Apr 2026 01:32:32 +0000</pubDate>
      <link>https://dev.to/xunxing_mao_fac71e331fd4b/which-ai-coding-tool-should-you-choose-2026-comprehensive-comparison-guide-2i3b</link>
      <guid>https://dev.to/xunxing_mao_fac71e331fd4b/which-ai-coding-tool-should-you-choose-2026-comprehensive-comparison-guide-2i3b</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;strong&gt;&lt;a href="https://maoxunxing.com/ai-coding-tools-comparison-2026/" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt;&lt;/strong&gt;. Follow me there for more deep dives on AI-assisted engineering.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Introduction: Why I Recommend ChatGPT Pro / Gemini Advanced Pro
&lt;/h2&gt;

&lt;p&gt;The AI coding tool market in 2026 has become highly mature, but also incredibly confusing. Cursor, Claude Code, GitHub Copilot, Windsurf, OpenCode, Zenmux... every tool is pushing its own subscription. But in reality, &lt;strong&gt;for most developers, the smartest choice is ChatGPT Pro ($200/month) or Gemini Advanced Pro ($20-30/month)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The reason is simple: &lt;strong&gt;unlimited usage, predictable costs, and versatile functionality&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This article will help you understand why Pro memberships are the superior solution from the perspectives of real costs, use cases, hidden fees, and ban risks, and how to combine them with various programming tools for optimal results.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Major International AI Coding Tools Breakdown
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Cursor - The AI-Native IDE Representative
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hobby (Free): $0, limited agent requests and tab completions&lt;/li&gt;
&lt;li&gt;Pro: $20/month ($16/month annual), $20 credit pool&lt;/li&gt;
&lt;li&gt;Pro+: $60/month, $60 credit pool (3x)&lt;/li&gt;
&lt;li&gt;Ultra: $200/month, 20x credits&lt;/li&gt;
&lt;li&gt;Teams: $40/user/month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VS Code fork with deep AI integration&lt;/li&gt;
&lt;li&gt;Switched from request-based to credit-based pricing in mid-2025&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto mode is unlimited&lt;/strong&gt; on all paid plans (key selling point)&lt;/li&gt;
&lt;li&gt;Supports 15+ models, but through Cursor's proxy; most models don't support BYOK (Bring Your Own Key)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt; Developers who want AI deeply integrated into their editor, daily inline completions and quick edits&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real Cost Warning:&lt;/strong&gt; Credit-based pricing means different models consume credits at different rates. Frontier models (like GPT-4o, Claude Opus) consume 3-5x faster. Heavy users spend $60-200/month&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Claude Code - Terminal AI Agent King (But With Ban Risks⚠️)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pro (requires Claude subscription): $20/month, ~44,000 tokens per 5-hour window&lt;/li&gt;
&lt;li&gt;Max 5x: $100/month, ~88,000 tokens per 5-hour window&lt;/li&gt;
&lt;li&gt;Max 20x: $200/month, ~220,000 tokens per 5-hour window&lt;/li&gt;
&lt;li&gt;API pay-as-you-go: Sonnet 4.6 ($3/$15 per MTok), Opus 4.6 ($5/$25 per MTok)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Terminal CLI tool, no GUI&lt;/li&gt;
&lt;li&gt;Claude Opus 4.6 scored 80.9% on SWE-bench Verified (highest ever)&lt;/li&gt;
&lt;li&gt;Best-in-class autonomous agent loop (plan → edit → test → fix)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude models only&lt;/strong&gt;, no third-party models&lt;/li&gt;
&lt;li&gt;No local model option&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Ban Risk (Critical):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2025-2026 Mass Ban Events:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic banned &lt;strong&gt;1.45 million accounts&lt;/strong&gt; in H2 2025&lt;/li&gt;
&lt;li&gt;Appeal success rate: only &lt;strong&gt;3.3%&lt;/strong&gt; (1,700 successful out of 52,000 appeals)&lt;/li&gt;
&lt;li&gt;Multiple users reported &lt;strong&gt;immediate bans&lt;/strong&gt; after upgrading to Claude Code Max&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Top 5 Ban Reasons:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Geographic and IP Mismatch&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using commercial VPNs or shared IPs&lt;/li&gt;
&lt;li&gt;Account registration location differs from usage location&lt;/li&gt;
&lt;li&gt;Abnormal IP address during payment&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Payment Method Risks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Virtual credit cards and prepaid cards flagged as high-risk&lt;/li&gt;
&lt;li&gt;Cards from crypto-to-fiat services&lt;/li&gt;
&lt;li&gt;Card name doesn't match account name&lt;/li&gt;
&lt;li&gt;Payment cards previously associated with banned accounts&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Third-Party Tool Detection&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Starting January 2026, Anthropic began banning accounts using third-party tools with subscription credentials&lt;/li&gt;
&lt;li&gt;Tools like OpenCode, Cline that proxy requests through Claude Pro/Max accounts are targeted&lt;/li&gt;
&lt;li&gt;Many users who had been using these tools for months were &lt;strong&gt;banned immediately after payment&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Automated Fraud Detection False Positives&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Upgrading plans (Pro → Max) triggers security review&lt;/li&gt;
&lt;li&gt;Changing credit cards or paying from new devices&lt;/li&gt;
&lt;li&gt;Multiple failed payment attempts followed by success&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Billing System Synchronization Errors&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code and Claude.ai use separate billing systems&lt;/li&gt;
&lt;li&gt;Brief sync issues during upgrades or renewals&lt;/li&gt;
&lt;li&gt;Accounts incorrectly flagged as "past due" and banned&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Reality After Being Banned:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accounts disabled within minutes of payment&lt;/li&gt;
&lt;li&gt;Error messages: "This organization has been disabled" or "Your account has been disabled"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No refunds&lt;/strong&gt; (if deemed a policy violation)&lt;/li&gt;
&lt;li&gt;Appeals take days to weeks to process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommendations to Reduce Ban Risk:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Use real credit cards (from supported countries)&lt;/li&gt;
&lt;li&gt;✅ Maintain consistent IP address (avoid frequent VPN switching)&lt;/li&gt;
&lt;li&gt;✅ Avoid using third-party tools that proxy subscription credentials&lt;/li&gt;
&lt;li&gt;✅ Consider API access (no geographic restrictions, lower ban risk)&lt;/li&gt;
&lt;li&gt;❌ Don't use virtual credit cards or prepaid cards&lt;/li&gt;
&lt;li&gt;❌ Don't start heavy usage immediately after registering a new account&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt; Complex refactoring tasks requiring strongest reasoning, &lt;strong&gt;but you must accept ban risk&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real Cost Warning:&lt;/strong&gt; The $20 Pro plan is very limited. Heavy users need $100-200/month Max plans, plus ban risk&lt;/p&gt;




&lt;h3&gt;
  
  
  3. GitHub Copilot - The Value King
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Free: $0, 50 premium requests/month, 2,000 completions/month&lt;/li&gt;
&lt;li&gt;Pro: $10/month ($100/year), 300 premium requests/month, unlimited completions&lt;/li&gt;
&lt;li&gt;Pro+: $39/month ($390/year), 1,500 premium requests/month&lt;/li&gt;
&lt;li&gt;Business: $19/user/month, 300 premium requests/user/month&lt;/li&gt;
&lt;li&gt;Enterprise: $39/user/month, 1,000 premium requests/user/month&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Overage: $0.04 per premium request&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;$10/month is the best value across the entire market&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Multi-model support, including Claude Opus 4.6&lt;/li&gt;
&lt;li&gt;Integrated voice plugin (local model, extremely accurate)&lt;/li&gt;
&lt;li&gt;Auto-reads Claude Desktop's MCP configuration&lt;/li&gt;
&lt;li&gt;Ships as VS Code and JetBrains plugins, not a standalone IDE&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt; Budget-conscious individual developers, teams already using VS Code&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important Availability Note:&lt;/strong&gt; Copilot used to be one of the easiest low-cost ways to access high-quality models (including Opus-level workflows). However, some users now report subscription onboarding restrictions (often tied to region, payment method, or account risk controls). If you can't subscribe directly, use API-based alternatives as a fallback.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real Cost &amp;amp; Availability Warning:&lt;/strong&gt; Overage at $0.04/request can add up quickly with heavy usage, and in some regions account availability can become a bigger risk than pricing itself.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. Windsurf - Cursor's Direct Competitor
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pricing (after March 19, 2026 update):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Free: $0, limited quotas&lt;/li&gt;
&lt;li&gt;Pro: $20/month (was $15), daily/weekly quotas&lt;/li&gt;
&lt;li&gt;Teams: $40/seat/month (was $30)&lt;/li&gt;
&lt;li&gt;Max: $200/month (NEW), highest quotas&lt;/li&gt;
&lt;li&gt;Enterprise: Custom&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Switched from credit system to quota system in March 2026 (controversial)&lt;/li&gt;
&lt;li&gt;All models available&lt;/li&gt;
&lt;li&gt;"Cascade" mode (similar to Cursor's composer)&lt;/li&gt;
&lt;li&gt;Tends to over-explain, burning through requests faster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt; Developers who prefer Windsurf's workflow&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real Cost Warning:&lt;/strong&gt; Quota system means you can hit daily limits even on paid plans. Existing users grandfathered at old prices&lt;/p&gt;




&lt;h3&gt;
  
  
  5. Zenmux - Unified AI API Gateway
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Free: $0, limited quota&lt;/li&gt;
&lt;li&gt;Builder: $20/month, fixed monthly fee + Flows (floating USD value)&lt;/li&gt;
&lt;li&gt;Pro: $100/month, higher limits&lt;/li&gt;
&lt;li&gt;Max: $200/month, highest limits&lt;/li&gt;
&lt;li&gt;Pay-as-you-go: pay per actual usage, 1 Credit = $1 USD&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unified AI API Gateway&lt;/strong&gt;: One interface to access all major models (OpenAI, Anthropic, Google, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flows billing model&lt;/strong&gt;: Unlike fixed credits, Flows have floating USD-equivalent values&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No rate limits&lt;/strong&gt; (unlimited scaling)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Enterprise-grade stability + AI insurance&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token-level billing&lt;/strong&gt;: Pay only for what you use&lt;/li&gt;
&lt;li&gt;Suitable for: Individual developers, learning, Vibe Coding, production environments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers needing flexible access to multiple model providers&lt;/li&gt;
&lt;li&gt;Teams wanting to simplify API management&lt;/li&gt;
&lt;li&gt;Projects requiring production-grade stability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Real Cost Warning:&lt;/strong&gt; Flows' floating value means actual costs can vary with model pricing changes. Monitor your bills closely&lt;/p&gt;




&lt;h3&gt;
  
  
  6. OpenCode - The Open-Source Freedom Champion
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool itself: &lt;strong&gt;Completely free (MIT open-source)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;API usage: Typically $5-50/month (depending on usage and model choice)&lt;/li&gt;
&lt;li&gt;Local models (Ollama): &lt;strong&gt;Zero cost&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Terminal TUI/CLI, also available as desktop app and VS Code extension&lt;/li&gt;
&lt;li&gt;Supports &lt;strong&gt;75+ model providers&lt;/strong&gt; (Anthropic, OpenAI, Google, Groq, Ollama, etc.)&lt;/li&gt;
&lt;li&gt;Switch models mid-conversation (&lt;code&gt;/model&lt;/code&gt; command)&lt;/li&gt;
&lt;li&gt;MCP support + plugin system with 25+ lifecycle hooks&lt;/li&gt;
&lt;li&gt;No telemetry, no data storage; with Ollama, code never leaves your machine&lt;/li&gt;
&lt;li&gt;Multi-session parallel support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Warning:&lt;/strong&gt; Since February 2026, Anthropic has been banning accounts using third-party tools (including OpenCode) with Claude subscription credentials. Use API access instead of subscription proxy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt; Budget-conscious teams, those needing model flexibility, compliance/privacy-focused scenarios&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real Cost Warning:&lt;/strong&gt; Tool is free, but API costs are separate. Heavy usage of premium models like Claude Opus can exceed subscription-based tool costs&lt;/p&gt;




&lt;h3&gt;
  
  
  7. OpenAI Codex CLI - OpenAI's Terminal Solution
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool itself: Free (Apache 2.0 open-source)&lt;/li&gt;
&lt;li&gt;ChatGPT Plus: $20/month, 33-168 local messages&lt;/li&gt;
&lt;li&gt;ChatGPT Pro: $200/month, 300-1,500 messages&lt;/li&gt;
&lt;li&gt;API pay-as-you-go: codex-mini ($1.50/$6.00 per MTok), GPT-5 ($1.25/$10.00 per MTok)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Features:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;~4x more token-efficient than Claude Code&lt;/strong&gt; (official claim)&lt;/li&gt;
&lt;li&gt;Terminal-Bench 2.0 score: 77.3%&lt;/li&gt;
&lt;li&gt;Open-source, no vendor lock-in&lt;/li&gt;
&lt;li&gt;Accessible via ChatGPT subscription or API key&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best For:&lt;/strong&gt; Developers already subscribed to ChatGPT, scenarios requiring token efficiency&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Why ChatGPT Pro / Gemini Advanced Pro Is the Best Solution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Core Advantage: Unlimited Usage, Predictable Costs
&lt;/h3&gt;

&lt;p&gt;For many developers, &lt;strong&gt;Pro memberships are the true value kings&lt;/strong&gt;:&lt;/p&gt;

&lt;h4&gt;
  
  
  ChatGPT Pro ($200/month)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;✅ Core Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No explicit request limits&lt;/strong&gt;: Unlike Cursor or Claude Code, which impose strict credit/quota limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No worry about overage fees&lt;/strong&gt;: Won't charge you extra for using more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extremely versatile&lt;/strong&gt;: Not just coding, but also writing, translation, data analysis, learning new knowledge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Works with any tool&lt;/strong&gt;: Can be paired with any IDE, terminal tools, online services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;o1 model with strong reasoning&lt;/strong&gt;: Perfect for complex problem-solving and code review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No ban risk&lt;/strong&gt;: Unlike Claude Code, which readily bans accounts over VPN use or third-party tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;💰 Cost Analysis:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fixed $200/month, no hidden fees&lt;/li&gt;
&lt;li&gt;If you also need ChatGPT Plus features (GPT-4o, Advanced Voice, etc.), the Pro tier is more cost-effective&lt;/li&gt;
&lt;li&gt;Equivalent to a &lt;strong&gt;universal AI assistant&lt;/strong&gt; covering all aspects of work and life&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Gemini Advanced Pro ($20-30/month)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;✅ Core Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Super cheap&lt;/strong&gt;: Only 1/10 the price of ChatGPT Pro&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1M token context window&lt;/strong&gt; (soon 2M): Can paste entire codebases&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini API free tier extremely generous&lt;/strong&gt;: 10-30 requests/minute, can be used with Cursor, OpenCode, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No overage worry&lt;/strong&gt;: Like ChatGPT Pro, fixed monthly fee with no extra charges&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google ecosystem integration&lt;/strong&gt;: Seamless connection with YouTube, Drive, Gmail, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No ban risk&lt;/strong&gt;: VPN and international user friendly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;💰 Cost Analysis:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$20-30/month, extremely cost-effective&lt;/li&gt;
&lt;li&gt;Free API can save costs on other tools&lt;/li&gt;
&lt;li&gt;Perfect for developers on a budget who need powerful AI capabilities&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why More Practical Than Professional Coding Tools?
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. &lt;strong&gt;Cost Certainty vs Uncertainty&lt;/strong&gt;
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;th&gt;Monthly Fee&lt;/th&gt;
&lt;th&gt;Hidden Fees&lt;/th&gt;
&lt;th&gt;Total Cost Certainty&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT Pro&lt;/td&gt;
&lt;td&gt;$200&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;✅ Completely certain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini Advanced&lt;/td&gt;
&lt;td&gt;$20-30&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;✅ Completely certain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cursor Pro&lt;/td&gt;
&lt;td&gt;$20&lt;/td&gt;
&lt;td&gt;Extra charges after credit exhaustion&lt;/td&gt;
&lt;td&gt;❌ Uncertain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code Max&lt;/td&gt;
&lt;td&gt;$100-200&lt;/td&gt;
&lt;td&gt;Potential ban losses&lt;/td&gt;
&lt;td&gt;❌ High risk&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Copilot Pro&lt;/td&gt;
&lt;td&gt;$10&lt;/td&gt;
&lt;td&gt;$0.04/request overage&lt;/td&gt;
&lt;td&gt;⚠️ May exceed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key Insight:&lt;/strong&gt; Cursor Pro seems cheap at $20, but heavy usage can reach $60-200; Claude Code Max $100-200 has ban risks. In comparison, ChatGPT Pro $200 has higher certainty.&lt;/p&gt;

&lt;h4&gt;
  
  
  2. &lt;strong&gt;Versatility vs Specialization&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;What ChatGPT Pro / Gemini Pro Can Do:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;💻 Code generation and debugging&lt;/li&gt;
&lt;li&gt;📝 Technical documentation writing&lt;/li&gt;
&lt;li&gt;🌐 Translation and localization&lt;/li&gt;
&lt;li&gt;📊 Data analysis and visualization&lt;/li&gt;
&lt;li&gt;🎓 Learning new technologies&lt;/li&gt;
&lt;li&gt;🗣️ Voice conversations (ChatGPT Advanced Voice)&lt;/li&gt;
&lt;li&gt;🖼️ Image generation and understanding&lt;/li&gt;
&lt;li&gt;📧 Email and daily communication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What Only Professional Coding Tools Can Do:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In-IDE completions&lt;/li&gt;
&lt;li&gt;Codebase-aware editing&lt;/li&gt;
&lt;li&gt;Terminal agent operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt; Pro memberships cover 90% of AI usage scenarios, while professional coding tools only cover 10%.&lt;/p&gt;

&lt;h4&gt;
  
  
  3. &lt;strong&gt;Flexibility vs Lock-in&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Pro Membership Flexibility:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can be used in any editor (VS Code, JetBrains, Vim, terminal...)&lt;/li&gt;
&lt;li&gt;Can be paired with any AI coding tool (Cursor, OpenCode, Cline, Aider...)&lt;/li&gt;
&lt;li&gt;Can switch usage scenarios anytime&lt;/li&gt;
&lt;li&gt;Won't be locked into a specific IDE or workflow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Professional Coding Tool Lock-in:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cursor requires using their IDE (VS Code fork)&lt;/li&gt;
&lt;li&gt;Claude Code only works with Claude models&lt;/li&gt;
&lt;li&gt;Windsurf requires using their editor&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best Practice: Pro Membership + Free/Low-Cost Tool Combinations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Option A: Strongest Combination ($200/month)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ChatGPT Pro ($200): Primary AI assistant&lt;/li&gt;
&lt;li&gt;GitHub Copilot Free ($0): IDE completions (2,000/month)&lt;/li&gt;
&lt;li&gt;Total cost: $200/month&lt;/li&gt;
&lt;li&gt;Coverage: 95% of usage scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option B: Value King ($20-30/month)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gemini Advanced ($20-30): Primary AI assistant&lt;/li&gt;
&lt;li&gt;Gemini API (Free): Use with Cursor/OpenCode&lt;/li&gt;
&lt;li&gt;GitHub Copilot Free ($0): IDE completions&lt;/li&gt;
&lt;li&gt;Total cost: $20-30/month&lt;/li&gt;
&lt;li&gt;Coverage: 90% of usage scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option C: All-Round Developer ($230-240/month)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ChatGPT Pro ($200): Daily AI assistant&lt;/li&gt;
&lt;li&gt;Gemini Advanced ($20-30): Long context scenarios&lt;/li&gt;
&lt;li&gt;GitHub Copilot Pro ($10): Deep IDE integration&lt;/li&gt;
&lt;li&gt;Total cost: $230-240/month&lt;/li&gt;
&lt;li&gt;Coverage: 99% of usage scenarios, invincible combination&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. Hidden Fees Revealed: What Seems Cheap Might Be Expensive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Overage Charges
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;GitHub Copilot:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Beyond quota: $0.04 per premium request&lt;/li&gt;
&lt;li&gt;Sounds small, but 50 extra requests per day comes to $60/month&lt;/li&gt;
&lt;/ul&gt;
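&lt;p&gt;That overage arithmetic is easy to verify. A quick sketch ($0.04/request is the documented overage rate; 50 requests/day is just the illustrative figure above, not a quota from GitHub's docs):&lt;/p&gt;

```python
# Back-of-the-envelope overage cost for GitHub Copilot premium requests.
OVERAGE_RATE = 0.04          # USD per premium request beyond quota
extra_per_day = 50           # illustrative, not a published quota
days_per_month = 30

monthly_overage = OVERAGE_RATE * extra_per_day * days_per_month
print(f"${monthly_overage:.2f}/month")  # $60.00/month
```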

&lt;p&gt;&lt;strong&gt;Cursor:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After $20 credit pool exhausted, pay per model rate&lt;/li&gt;
&lt;li&gt;GPT-4o or Claude Opus consume credits 3-5x faster than lighter models&lt;/li&gt;
&lt;/ul&gt;
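&lt;p&gt;The practical effect of the 3-5x figure is that the $20 credit pool simply runs out sooner. A rough sketch (the baseline burn rate is an illustrative assumption, not a Cursor-published number):&lt;/p&gt;

```python
# If premium models consume credits 3-5x faster, the included pool's
# coverage shrinks proportionally.
POOL = 20.0                 # USD of included monthly credits
baseline_burn = 0.50        # assumed USD/day on a lighter model

for multiplier in (1, 3, 5):
    days = POOL / (baseline_burn * multiplier)
    print(f"{multiplier}x burn: pool lasts {days:.0f} days")
```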

&lt;p&gt;&lt;strong&gt;Windsurf:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Old version: $10 for 250 extra requests&lt;/li&gt;
&lt;li&gt;New version: Quota system, hard limits after reaching quota&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Model Selection Traps
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cursor&lt;/strong&gt;: Supports 15+ models, but through Cursor's proxy; you can't control which model is actually called&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt;: Claude models only; need to switch tools for GPT-4o or Gemini&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenCode&lt;/strong&gt;: Only tool supporting true BYOK, flexible to choose cheapest models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zenmux&lt;/strong&gt;: Unified gateway, but Flows' floating value needs careful monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Ban Risk Costs (Claude Code Specific)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Real Cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users paying $100-200 for Claude Code Max &lt;strong&gt;banned immediately&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Appeal success rate: only 3.3%&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No refunds provided&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Account recovery takes days to weeks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Indirect Costs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Development interruption time cost&lt;/li&gt;
&lt;li&gt;Cost of reconfiguring alternative tools&lt;/li&gt;
&lt;li&gt;Potentially lost project context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cost to Reduce Risk:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using real credit cards (may require enabling international payments)&lt;/li&gt;
&lt;li&gt;Stable network environment (may need better VPN service)&lt;/li&gt;
&lt;li&gt;Switching to API access (potentially higher cost, but lower risk)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Team Cost Amplification
&lt;/h3&gt;

&lt;p&gt;For a 500-developer team (annual costs):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub Copilot Business: $114,000&lt;/li&gt;
&lt;li&gt;Cursor Business: $192,000&lt;/li&gt;
&lt;li&gt;Tabnine Enterprise: $234,000+&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;OpenCode solution:&lt;/strong&gt; Tool $0 + API costs of $20-30/dev/month = $120,000-180,000/year, with far more flexible model choices&lt;/p&gt;
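&lt;p&gt;The team figures above follow directly from per-seat pricing. A minimal sketch reproducing them (Copilot and Tabnine per-seat rates appear elsewhere in this article; the Cursor total implies roughly $32/seat/month, which is an inference here, not a published price):&lt;/p&gt;

```python
# Annual cost for a 500-developer team at per-seat monthly prices.
SEATS = 500

def annual_cost(per_seat_monthly, seats=SEATS):
    return per_seat_monthly * seats * 12

print(annual_cost(19))   # GitHub Copilot Business -> 114000
print(annual_cost(32))   # Cursor Business (inferred seat price) -> 192000
print(annual_cost(39))   # Tabnine Enterprise -> 234000
# OpenCode: free tool + $20-30/dev/month of API spend
print(annual_cost(20), "-", annual_cost(30))   # 120000 - 180000
```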

&lt;h3&gt;
  
  
  5. Implementation and Governance Costs
&lt;/h3&gt;

&lt;p&gt;DX research shows &lt;strong&gt;real costs&lt;/strong&gt; of AI coding tools also include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring and governance tools: $50,000-250,000/year&lt;/li&gt;
&lt;li&gt;Internal training and enablement: Affects 40-50% adoption rate&lt;/li&gt;
&lt;li&gt;Change management: Integrating new tools into existing workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  6. Data Privacy Costs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code / Cursor&lt;/strong&gt;: Code sent to vendor servers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenCode + Ollama&lt;/strong&gt;: Code fully localized, suitable for regulated industries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tabnine&lt;/strong&gt;: Supports on-premises deployment, but most expensive ($39-59/user/month)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  4. Scenario-Based Recommendations: Pro Membership + Tool Combination Strategy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Scenario 1: Limited Budget, Pursuing Highest Value ⭐ Recommended
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Recommended: Gemini Advanced ($20-30/month) + GitHub Copilot Free ($0)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Total cost: $20-30/month&lt;/li&gt;
&lt;li&gt;Coverage: 90% of daily usage scenarios&lt;/li&gt;
&lt;li&gt;Advantages:

&lt;ul&gt;
&lt;li&gt;Extremely low cost, equivalent to a cup of coffee&lt;/li&gt;
&lt;li&gt;Gemini 1M token context can paste entire codebases&lt;/li&gt;
&lt;li&gt;Free API can be used with Cursor, OpenCode, etc.&lt;/li&gt;
&lt;li&gt;No ban risk, VPN friendly&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario 2: Professional Developer, Need Strongest Capabilities ⭐ Recommended
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Recommended: ChatGPT Pro ($200/month) + GitHub Copilot Free ($0)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Total cost: $200/month&lt;/li&gt;
&lt;li&gt;Coverage: 95% of usage scenarios&lt;/li&gt;
&lt;li&gt;Advantages:

&lt;ul&gt;
&lt;li&gt;Unlimited o1 model usage, strongest reasoning&lt;/li&gt;
&lt;li&gt;No worry about overage fees&lt;/li&gt;
&lt;li&gt;Can be paired with any IDE and programming tools&lt;/li&gt;
&lt;li&gt;Extremely versatile, covers all work and life scenarios&lt;/li&gt;
&lt;li&gt;No ban risk&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario 3: All-Round Developer, Pursuing Invincible Combination
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Recommended: ChatGPT Pro ($200) + Gemini Advanced ($20-30) + GitHub Copilot Pro ($10)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Total cost: $230-240/month&lt;/li&gt;
&lt;li&gt;Coverage: 99% of usage scenarios&lt;/li&gt;
&lt;li&gt;Advantages:

&lt;ul&gt;
&lt;li&gt;ChatGPT Pro: Daily reasoning, code review, complex problems&lt;/li&gt;
&lt;li&gt;Gemini Advanced: Long context, codebase analysis&lt;/li&gt;
&lt;li&gt;Copilot Pro: Deep IDE integration, voice features&lt;/li&gt;
&lt;li&gt;Invincible combination, meets all needs&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario 4: Team/Enterprise, Compliance and Privacy Needed
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Recommended: OpenCode (Free) + Ollama (Local Models) + Gemini Advanced ($20-30)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool cost: $0 + $20-30/month&lt;/li&gt;
&lt;li&gt;API cost: On-demand, can be fully localized&lt;/li&gt;
&lt;li&gt;Suitable for: Finance, healthcare, government, and other regulated industries&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario 5: Just Want "Unlimited and Worry-Free" General AI
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Recommended: ChatGPT Pro ($200/month) or Gemini Advanced ($20-30/month)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Advantages:

&lt;ul&gt;
&lt;li&gt;✅ No overage worry&lt;/li&gt;
&lt;li&gt;✅ No extra fees&lt;/li&gt;
&lt;li&gt;✅ No ban risk&lt;/li&gt;
&lt;li&gt;✅ Extremely versatile&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Disadvantages: Lacks deep IDE integration (but this can be offset with free tools)&lt;/li&gt;

&lt;li&gt;Suitable for: Developers needing comprehensive AI assistance&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  5. 2026 Market Trends
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Pricing Model Fragmentation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Credits (Cursor)&lt;/li&gt;
&lt;li&gt;Tokens (Claude Code API)&lt;/li&gt;
&lt;li&gt;Quotas (Windsurf)&lt;/li&gt;
&lt;li&gt;Premium Requests (GitHub Copilot)&lt;/li&gt;
&lt;li&gt;Flows (Zenmux, floating value)&lt;/li&gt;
&lt;li&gt;Daily caps vs monthly caps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Advice:&lt;/strong&gt; Don't just look at headline prices; read the fine print&lt;/p&gt;

&lt;h3&gt;
  
  
  2. $20/Month Becomes Standard Tier
&lt;/h3&gt;

&lt;p&gt;Cursor Pro, Windsurf Pro, Claude Code Pro, Zenmux Builder all converge at $20/month&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Heavy User Costs Converge
&lt;/h3&gt;

&lt;p&gt;Regardless of tool choice, heavy usage eventually reaches $60-200/month&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Free Tiers Are Genuinely Usable
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Bolt.new: 1M tokens/month&lt;/li&gt;
&lt;li&gt;GitHub Copilot Free: 2,000 completions&lt;/li&gt;
&lt;li&gt;Codex CLI: Open-source free&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Tool Combination Becomes Standard
&lt;/h3&gt;

&lt;p&gt;DX research shows developers use 2-3 AI tools on average; chat-based assistants (ChatGPT, Claude, Gemini) complement IDE-native tools&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Ban Risk Becomes Important Consideration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code's mass ban events changed the market landscape&lt;/li&gt;
&lt;li&gt;More developers switching to API access or alternative tools&lt;/li&gt;
&lt;li&gt;Geographic location and payment methods becoming important factors in tool selection&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  6. Final Recommendations: Pro Membership Is Core, Tools Are Supplementary
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Lowest Cost Option ($20-30/month) ⭐ Recommended:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gemini Advanced ($20-30): Primary AI assistant&lt;/li&gt;
&lt;li&gt;GitHub Copilot Free ($0): IDE completions&lt;/li&gt;
&lt;li&gt;Total cost: $20-30/month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Strongest Option ($200/month) ⭐ Recommended:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ChatGPT Pro ($200): Universal AI assistant&lt;/li&gt;
&lt;li&gt;GitHub Copilot Free ($0): IDE completions&lt;/li&gt;
&lt;li&gt;Total cost: $200/month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;All-Round Developer Option ($230-240/month):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ChatGPT Pro ($200) + Gemini Advanced ($20-30) + GitHub Copilot Pro ($10)&lt;/li&gt;
&lt;li&gt;Covers 99% scenarios, invincible combination&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Zero Cost Option:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gemini AI Studio (Free, 1M token context)&lt;/li&gt;
&lt;li&gt;GitHub Copilot Free (2,000 completions)&lt;/li&gt;
&lt;li&gt;OpenCode (Free) + Ollama (local models)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Safest Option (Avoid Bans):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ChatGPT Pro or Gemini Advanced&lt;/li&gt;
&lt;li&gt;Use Claude models via API (pay-as-you-go)&lt;/li&gt;
&lt;li&gt;Avoid Claude subscription products&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion: Pro Membership Is Core, Tools Are Supplementary
&lt;/h2&gt;

&lt;p&gt;When choosing AI coding tools, ask yourself four questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;How much time do I spend coding daily?&lt;/strong&gt; (Determines how powerful a tool you need)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What's my budget ceiling?&lt;/strong&gt; (Avoid surprise overage fees)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What's my workflow?&lt;/strong&gt; (Terminal, IDE, or hybrid?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Is there ban risk in my region?&lt;/strong&gt; (Especially for Claude Code users)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Core Recommendation:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For most developers, ChatGPT Pro or Gemini Advanced Pro membership is the smartest choice.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The reasons are simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Unlimited usage&lt;/strong&gt;: No worry about overage fees&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Predictable costs&lt;/strong&gt;: Fixed monthly fee, clear budget&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Extremely versatile&lt;/strong&gt;: Can be paired with various programming tools and online conversation services&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;No ban risk&lt;/strong&gt;: Unlike Claude Code, which is prone to banning accounts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Professional coding tools (Cursor, Claude Code, Windsurf, etc.) can serve as supplements, but shouldn't be the core.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Remember:&lt;/strong&gt; Pro membership is the core infrastructure, various coding tools are just the icing on the cake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Most important point:&lt;/strong&gt; If you're in China or using a VPN, strongly recommend choosing ChatGPT Pro or Gemini Advanced, and avoid Claude Code subscription products to minimize ban risk.&lt;/p&gt;




&lt;h2&gt;
  
  
  Appendix: Quick Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;Core Advantage&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Hidden Costs&lt;/th&gt;
&lt;th&gt;Ban Risk&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Copilot Pro&lt;/td&gt;
&lt;td&gt;$10/mo&lt;/td&gt;
&lt;td&gt;Best value&lt;/td&gt;
&lt;td&gt;Budget-conscious developers&lt;/td&gt;
&lt;td&gt;Overage $0.04/request&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cursor Pro&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;Deep IDE integration&lt;/td&gt;
&lt;td&gt;Daily AI-assisted development&lt;/td&gt;
&lt;td&gt;Variable credit consumption&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code Pro&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;Strongest reasoning&lt;/td&gt;
&lt;td&gt;Complex refactoring&lt;/td&gt;
&lt;td&gt;Requires upgrade to $100-200&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;High⚠️&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Windsurf Pro&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;All models available&lt;/td&gt;
&lt;td&gt;Windsurf workflow fans&lt;/td&gt;
&lt;td&gt;Quota system daily limits&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zenmux Builder&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;Unified API gateway&lt;/td&gt;
&lt;td&gt;Multi-model strategy devs&lt;/td&gt;
&lt;td&gt;Floating Flow values&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenCode&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;75+ models, total flexibility&lt;/td&gt;
&lt;td&gt;Power users, compliance needs&lt;/td&gt;
&lt;td&gt;API costs separate&lt;/td&gt;
&lt;td&gt;Low (with API)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT Pro&lt;/td&gt;
&lt;td&gt;$200/mo&lt;/td&gt;
&lt;td&gt;Unlimited, no overage worry&lt;/td&gt;
&lt;td&gt;Heavy general AI users&lt;/td&gt;
&lt;td&gt;Lacks IDE integration&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini Advanced&lt;/td&gt;
&lt;td&gt;$20-30/mo&lt;/td&gt;
&lt;td&gt;1M token context&lt;/td&gt;
&lt;td&gt;Long context scenarios&lt;/td&gt;
&lt;td&gt;Chat-only, no agent capability&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;Data sourced from official pricing pages and third-party research (as of April 2026). Pricing may change; please verify with official sources. Ban risk information based on public reports and user feedback; actual situations may vary.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://maoxunxing.com/ai-coding-practice/" rel="noopener noreferrer"&gt;AI Coding Playbook: Tool Selection, Workflows, and Prompt Templates&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://maoxunxing.com/ai-agent-guide/" rel="noopener noreferrer"&gt;AI Agent Beginner's Guide: Agent vs Copilot, MCP and architecture patterns&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://maoxunxing.com/karpathy-knowledge-base-practice/" rel="noopener noreferrer"&gt;How I build a Git-based personal knowledge base workflow&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Felix Mao | &lt;a href="https://maoxunxing.com" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt; | &lt;a href="https://twitter.com/maoxunxing" rel="noopener noreferrer"&gt;@maoxunxing&lt;/a&gt; | &lt;a href="https://github.com/XingMXTeam/" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Practicing Karpathy's Personal Knowledge Base Method with a Git Repository</title>
      <dc:creator>Xunxing Mao</dc:creator>
      <pubDate>Wed, 22 Apr 2026 01:07:07 +0000</pubDate>
      <link>https://dev.to/xunxing_mao_fac71e331fd4b/practicing-karpathys-personal-knowledge-base-method-with-a-git-repository-1o8f</link>
      <guid>https://dev.to/xunxing_mao_fac71e331fd4b/practicing-karpathys-personal-knowledge-base-method-with-a-git-repository-1o8f</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;strong&gt;&lt;a href="https://maoxunxing.com/karpathy-knowledge-base-practice/" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt;&lt;/strong&gt;. Follow me there for more on AI-assisted workflows, Hugo, and knowledge systems.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Karpathy Shared
&lt;/h2&gt;

&lt;p&gt;&lt;iframe src="https://www.youtube.com/embed/VRub1w-APTc"&gt;&lt;/iframe&gt;&lt;/p&gt;

&lt;p&gt;Andrej Karpathy recently shared a practical approach on &lt;a href="https://x.com/karpathy/status/2039805659525644595" rel="noopener noreferrer"&gt;X/Twitter&lt;/a&gt; and published a complete &lt;a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f" rel="noopener noreferrer"&gt;LLM Wiki Gist&lt;/a&gt;: using LLMs to build personal knowledge bases for research topics. The core workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Dump source files (articles, papers, screenshots) into a &lt;code&gt;raw/&lt;/code&gt; directory&lt;/li&gt;
&lt;li&gt;Use an LLM to "compile" them into structured Markdown knowledge entries&lt;/li&gt;
&lt;li&gt;Browse everything in Obsidian&lt;/li&gt;
&lt;li&gt;Query the knowledge base — the LLM searches and answers autonomously&lt;/li&gt;
&lt;li&gt;Periodically run LLM "health checks" to fix contradictions and fill gaps&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;His knowledge base has grown to ~100 entries and 400K words. No RAG needed — the LLM maintains indexes and summaries to handle all queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In one sentence&lt;/strong&gt;: raw materials in, structured knowledge out, LLM does the heavy lifting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Obsidian?
&lt;/h2&gt;

&lt;p&gt;Karpathy uses Obsidian as his viewer. But if you already have a Hugo blog repository, you don't need any extra software:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Need&lt;/th&gt;
&lt;th&gt;Obsidian Approach&lt;/th&gt;
&lt;th&gt;Hugo Repo Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;View Markdown&lt;/td&gt;
&lt;td&gt;Obsidian editor&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;hugo server -D&lt;/code&gt; local preview&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Link knowledge&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;[[]]&lt;/code&gt; backlinks + graph&lt;/td&gt;
&lt;td&gt;Hugo tags + Algolia search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Publish output&lt;/td&gt;
&lt;td&gt;Requires extra export&lt;/td&gt;
&lt;td&gt;Remove &lt;code&gt;draft: true&lt;/code&gt;, push&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Version control&lt;/td&gt;
&lt;td&gt;Needs Obsidian Git plugin&lt;/td&gt;
&lt;td&gt;It's already a Git repo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-device sync&lt;/td&gt;
&lt;td&gt;Obsidian Sync or iCloud&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git pull&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;Built-in Obsidian search&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;grep&lt;/code&gt; / Algolia / LLM&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key advantage: &lt;strong&gt;knowledge refined into articles publishes directly — zero migration cost&lt;/strong&gt;. One repo, full pipeline from collection to publication.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three-Layer Knowledge Pipeline
&lt;/h2&gt;

&lt;p&gt;Build three content tiers inside your repository:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;content/
  raw/        &amp;lt;- Inbox: see something good, dump it here
  notes/      &amp;lt;- Knowledge base: LLM-compiled structured entries
  posts/      &amp;lt;- Blog: polished, published articles
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  raw/ — Zero-Friction Inbox
&lt;/h3&gt;

&lt;p&gt;This is the system's entry point. Key principle: &lt;strong&gt;don't fuss over formatting or classification — just capture it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Each raw entry is a Markdown file with frontmatter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Some&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;article&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;about&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;RAG&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pipelines"&lt;/span&gt;
&lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2026-04-09&lt;/span&gt;
&lt;span class="na"&gt;draft&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;AI&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;RAG&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://original-url"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

Paste the original text / summary / screenshot / notes here. Whatever is fastest.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;draft: true&lt;/code&gt; ensures these materials never appear on your live blog — only visible locally with &lt;code&gt;hugo server -D&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  notes/ — Compiled Knowledge Entries
&lt;/h3&gt;

&lt;p&gt;When &lt;code&gt;raw/&lt;/code&gt; accumulates enough material on a topic, let the LLM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Merge and synthesize related materials&lt;/li&gt;
&lt;li&gt;Extract core insights&lt;/li&gt;
&lt;li&gt;Add structured summaries&lt;/li&gt;
&lt;li&gt;Tag with cross-references&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This turns &lt;code&gt;raw/&lt;/code&gt; fragments into complete knowledge entries in &lt;code&gt;notes/&lt;/code&gt;.&lt;/p&gt;
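&lt;p&gt;As a rough illustration (this helper is my own sketch, not part of Karpathy's gist), the "gather materials on one topic" step can be scripted before handing files to the LLM. It assumes the frontmatter layout shown earlier, with a bracketed &lt;code&gt;tags&lt;/code&gt; list:&lt;/p&gt;

```python
# Sketch only: collect raw/ entries tagged with a topic so they can be
# fed to an LLM for compilation into a notes/ entry. Assumes the
# frontmatter layout from this post (tags: [AI, RAG]).
import re
from pathlib import Path

def collect_raw_by_tag(content_dir: str, tag: str) -> list[Path]:
    """Return every Markdown file under content/raw/ whose frontmatter
    tags list contains `tag` (case-insensitive)."""
    matches = []
    for md in sorted(Path(content_dir, "raw").rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        # Match a bracketed inline tags list in the frontmatter.
        m = re.search(r"^tags:\s*\[(.*?)\]", text, re.MULTILINE)
        if not m:
            continue
        tags = [t.strip().strip("\"'").lower() for t in m.group(1).split(",")]
        if tag.lower() in tags:
            matches.append(md)
    return matches
```

&lt;p&gt;The returned file list (or the concatenated contents) then goes to the LLM together with the compile prompt.&lt;/p&gt;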

&lt;h3&gt;
  
  
  posts/ — Published Blog Articles
&lt;/h3&gt;

&lt;p&gt;When a &lt;code&gt;notes/&lt;/code&gt; entry reaches sufficient depth and you're ready to write a full article, polish it, remove &lt;code&gt;draft: true&lt;/code&gt;, and publish.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flow is always one-directional&lt;/strong&gt;: raw -&amp;gt; notes -&amp;gt; posts. Materials only get more refined, never regress.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step-by-Step Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Create the raw directory
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; content/raw

&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt;' &amp;gt; content/raw/_index.md
---
title: "Raw"
description: "Knowledge inbox"
draft: true
---
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Add a Hugo archetype template
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;archetypes/raw.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;replace&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;.Name&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"-" " " | title }}"&lt;/span&gt;
&lt;span class="na"&gt;date&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{{&lt;/span&gt; &lt;span class="nv"&gt;.Date&lt;/span&gt; &lt;span class="pi"&gt;}}&lt;/span&gt;
&lt;span class="na"&gt;draft&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[]&lt;/span&gt;
&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now &lt;code&gt;hugo new raw/topic-name/index.md&lt;/code&gt; auto-generates entries with the template.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Configure Hugo permalinks
&lt;/h3&gt;

&lt;p&gt;Add &lt;code&gt;raw&lt;/code&gt; to the permalinks section in &lt;code&gt;config.toml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[permalinks]&lt;/span&gt;
&lt;span class="py"&gt;raw&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"/:slugorcontentbasename/"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Start collecting
&lt;/h3&gt;

&lt;p&gt;See a good article or have an idea? Create a raw entry immediately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hugo new raw/interesting-topic/index.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paste in the content. No formatting needed, no perfection required — raw state is fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compiling with LLM
&lt;/h2&gt;

&lt;p&gt;This is the heart of Karpathy's method and the highest-value step.&lt;/p&gt;

&lt;h3&gt;
  
  
  Materials -&amp;gt; Knowledge Entries
&lt;/h3&gt;

&lt;p&gt;Have the LLM read multiple related materials from &lt;code&gt;raw/&lt;/code&gt; and synthesize a &lt;code&gt;notes/&lt;/code&gt; entry:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Read all raw entries tagged with AI, synthesize them into a structured knowledge entry under content/notes/ai-fundamentals/. Requirements: extract core concepts, add cross-references, cite sources."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Knowledge Entries -&amp;gt; Blog Posts
&lt;/h3&gt;

&lt;p&gt;When a notes entry has accumulated enough depth:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Based on the knowledge entry in content/notes/ai-fundamentals/, write a developer-facing blog post for content/posts/. Requirements: include opinions, real examples, and actionable advice."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Health Checks
&lt;/h3&gt;

&lt;p&gt;Periodically audit the knowledge base:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Scan all entries in content/raw/ and content/notes/. Find: 1) duplicate topics that should merge 2) entries missing tags 3) raw materials ready to compile into notes"&lt;/p&gt;
&lt;/blockquote&gt;
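&lt;p&gt;Part of that audit is deterministic and doesn't need an LLM at all. As a hypothetical sketch (my own addition, using this post's directory layout), a pre-pass can flag entries with a missing or empty tags list so the health-check prompt has concrete targets:&lt;/p&gt;

```python
# Sketch only: flag raw/ and notes/ entries whose frontmatter has an
# empty or missing tags list, before running the LLM health check.
import re
from pathlib import Path

def entries_missing_tags(content_dir: str) -> list[str]:
    """Return repo-relative paths of entries with no tags."""
    flagged = []
    for sub in ("raw", "notes"):
        for md in sorted(Path(content_dir, sub).rglob("*.md")):
            text = md.read_text(encoding="utf-8")
            m = re.search(r"^tags:\s*\[(.*?)\]", text, re.MULTILINE)
            # Flag when the tags line is absent or the list is empty.
            if m is None or not m.group(1).strip():
                flagged.append(str(md.relative_to(content_dir)))
    return flagged
```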

&lt;h3&gt;
  
  
  Automate with a Qoder Skill
&lt;/h3&gt;

&lt;p&gt;Take it further with a Qoder Skill — one sentence does it all:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/kb collect https://example.com/article&lt;/code&gt; — fetch and create a raw entry&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/kb collect I learned today that LoRA fine-tuning's key is...&lt;/code&gt; — quick-capture a thought&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/kb compile AI&lt;/code&gt; — compile AI-related raw materials into a notes entry&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/kb preview&lt;/code&gt; — start local preview with all materials visible&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/kb check&lt;/code&gt; — LLM health check&lt;/li&gt;
&lt;/ul&gt;
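&lt;p&gt;Under the hood, the collect command mostly boils down to "slugify a title, write a draft entry." The function below is a hypothetical sketch of that core (&lt;code&gt;kb_collect&lt;/code&gt; is my own name, not part of Qoder); a real Skill would add URL fetching on top:&lt;/p&gt;

```python
# Sketch only: the core of a "/kb collect" command — create a draft
# raw/ entry with frontmatter, similar to `hugo new raw/<slug>/index.md`.
import datetime
import re
from pathlib import Path

def kb_collect(content_dir: str, title: str, body: str, source: str = "") -> Path:
    """Write content/raw/<slug>/index.md with draft frontmatter."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    entry = Path(content_dir, "raw", slug, "index.md")
    entry.parent.mkdir(parents=True, exist_ok=True)
    today = datetime.date.today().isoformat()
    frontmatter = (
        f'---\ntitle: "{title}"\ndate: {today}\n'
        f'draft: true\ntags: []\nsource: "{source}"\n---\n\n'
    )
    entry.write_text(frontmatter + body + "\n", encoding="utf-8")
    return entry
```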

&lt;h2&gt;
  
  
  Daily Workflow
&lt;/h2&gt;

&lt;p&gt;The visual flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;See a great article / Have an insight
       |
       v
  /kb collect "content"     &amp;lt;-- One sentence, zero friction
       |
       v
  content/raw/xxx/          &amp;lt;-- Auto-created, draft:true
       |
       v (accumulate enough)
  /kb compile "topic"       &amp;lt;-- LLM synthesizes
       |
       v
  content/notes/xxx/        &amp;lt;-- Structured knowledge entry
       |
       v (polish &amp;amp; refine)
  content/posts/xxx/        &amp;lt;-- Published blog post, draft removed
       |
       v
  git push -&amp;gt; live on the web
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Collection&lt;/strong&gt;: Zero friction, one sentence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compilation&lt;/strong&gt;: LLM handles the grunt work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Publishing&lt;/strong&gt;: Remove &lt;code&gt;draft: true&lt;/code&gt;, push to deploy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No extra software&lt;/strong&gt;: Git + Hugo + LLM, that's it&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Comparison with Karpathy's Original
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Karpathy's Version&lt;/th&gt;
&lt;th&gt;This Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;Standalone knowledge repo&lt;/td&gt;
&lt;td&gt;Embedded in blog repo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Viewer&lt;/td&gt;
&lt;td&gt;Obsidian&lt;/td&gt;
&lt;td&gt;&lt;code&gt;hugo server -D&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raw materials&lt;/td&gt;
&lt;td&gt;raw/ directory&lt;/td&gt;
&lt;td&gt;content/raw/ (draft)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compilation&lt;/td&gt;
&lt;td&gt;LLM generates .md&lt;/td&gt;
&lt;td&gt;LLM generates notes/&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;Markdown/Marp/charts&lt;/td&gt;
&lt;td&gt;Directly published as blog posts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;Custom search engine&lt;/td&gt;
&lt;td&gt;grep + Algolia + LLM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Health checks&lt;/td&gt;
&lt;td&gt;LLM audit&lt;/td&gt;
&lt;td&gt;Same LLM audit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The biggest difference: Karpathy's knowledge base is standalone — output requires manual migration. In this approach, the knowledge base and blog are unified. Collection to publication happens in one repository, with zero migration cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The core of Karpathy's method isn't about which tools you use — it's about establishing a &lt;strong&gt;"collect -&amp;gt; compile -&amp;gt; output"&lt;/strong&gt; knowledge pipeline and letting the LLM handle compilation and maintenance.&lt;/p&gt;

&lt;p&gt;If you already have a blog repository, you can implement this method right inside it: add &lt;code&gt;content/raw/&lt;/code&gt; as an inbox, use &lt;code&gt;draft: true&lt;/code&gt; to control visibility, and let the LLM drive the flow from raw materials to knowledge to published articles.&lt;/p&gt;

&lt;p&gt;No Obsidian. No Notion. No new software. One Git repo is your knowledge base.&lt;/p&gt;




&lt;p&gt;If you're interested in AI-assisted development workflows, check out my &lt;a href="https://maoxunxing.com/ai-coding-practice/" rel="noopener noreferrer"&gt;AI Coding Playbook&lt;/a&gt; for tool selection and prompt templates.&lt;/p&gt;

&lt;p&gt;I also wrote &lt;a href="https://maoxunxing.com/ai-rewriting-workflow/" rel="noopener noreferrer"&gt;AI Rewriting Workflow&lt;/a&gt; on how knowledge workers can adapt when AI multiplies leverage.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Primary Sources
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Andrej Karpathy's original post&lt;/strong&gt; — &lt;a href="https://x.com/karpathy/status/2039805659525644595" rel="noopener noreferrer"&gt;X/Twitter thread on LLM Knowledge Bases&lt;/a&gt; — The original announcement describing the raw/ -&amp;gt; wiki compilation workflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM Wiki Gist&lt;/strong&gt; — &lt;a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f" rel="noopener noreferrer"&gt;github.com/karpathy/442a6bf...&lt;/a&gt; — Karpathy's complete LLM Wiki pattern specification, defining the three-layer architecture (source materials, AI-generated wiki, configuration).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Video Explainers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=VRub1w-APTc" rel="noopener noreferrer"&gt;How to Build a Personal LLM Knowledge Base (Karpathy's Method)&lt;/a&gt; — Step-by-step walkthrough of implementing Karpathy's method.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=FR9USL0yj3I" rel="noopener noreferrer"&gt;How To Do PHD-Level Research with AI (Karpathy's LLM Wiki)&lt;/a&gt; — Deep dive into using the LLM Wiki pattern for academic-level research.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=RQsLXmenr48" rel="noopener noreferrer"&gt;Karpathy's LLM Wiki: The End of Forgotten Knowledge&lt;/a&gt; — Analysis of the LLM Wiki pattern as an alternative to traditional RAG retrieval.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Analysis &amp;amp; Community
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VentureBeat&lt;/strong&gt; — &lt;a href="https://venturebeat.com/data/karpathy-shares-llm-knowledge-base-architecture-that-bypasses-rag-with-an" rel="noopener noreferrer"&gt;Karpathy shares 'LLM Knowledge Base' architecture that bypasses RAG&lt;/a&gt; — Industry analysis of why this approach works without complex RAG pipelines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MindStudio&lt;/strong&gt; — &lt;a href="https://www.mindstudio.ai/blog/andrej-karpathy-llm-wiki-knowledge-base-claude-code/" rel="noopener noreferrer"&gt;What Is Andrej Karpathy's LLM Wiki?&lt;/a&gt; — Practical guide to building an LLM Wiki with Claude Code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Antigravity Codes&lt;/strong&gt; — &lt;a href="https://antigravity.codes/blog/karpathy-llm-knowledge-bases" rel="noopener noreferrer"&gt;Karpathy's LLM Knowledge Bases: The Post-Code AI Workflow&lt;/a&gt; — Technical breakdown of the workflow as a "post-code" paradigm.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reddit r/ObsidianMD&lt;/strong&gt; — &lt;a href="https://www.reddit.com/r/ObsidianMD/comments/1sdbq01/implemented_karpathys_llm_knowledge_base_workflow/" rel="noopener noreferrer"&gt;Implemented Karpathy's LLM knowledge base workflow in Obsidian&lt;/a&gt; — Community discussion on Obsidian-based implementations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Related Concepts
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DEV Community&lt;/strong&gt; — &lt;a href="https://dev.to/adam_b/a-personal-git-repo-as-a-knowledge-base-wiki-j51"&gt;A Personal Git Repo as a Knowledge Base Wiki&lt;/a&gt; — Using plain Git + Markdown as a personal wiki, the foundational approach this article builds upon.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hacker News&lt;/strong&gt; — &lt;a href="https://news.ycombinator.com/item?id=38795735" rel="noopener noreferrer"&gt;Repurposing Hugo as a wiki&lt;/a&gt; — Discussion on using Hugo for wiki-style knowledge management.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Felix Mao | &lt;a href="https://maoxunxing.com" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt; | &lt;a href="https://twitter.com/maoxunxing" rel="noopener noreferrer"&gt;@maoxunxing&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>hugo</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>AI Coding Playbook: Tool Selection, Workflows, and Prompt Templates</title>
      <dc:creator>Xunxing Mao</dc:creator>
      <pubDate>Thu, 16 Apr 2026 01:22:57 +0000</pubDate>
      <link>https://dev.to/xunxing_mao_fac71e331fd4b/ai-coding-playbook-tool-selection-workflows-and-prompt-templates-caj</link>
      <guid>https://dev.to/xunxing_mao_fac71e331fd4b/ai-coding-playbook-tool-selection-workflows-and-prompt-templates-caj</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;strong&gt;&lt;a href="https://maoxunxing.com/ai-coding-practice/" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt;&lt;/strong&gt;. Follow me there for more deep dives on AI-assisted development workflows.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Choosing AI Tools by Scenario
&lt;/h2&gt;

&lt;p&gt;Different tasks call for different model and tool combinations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommended Approach&lt;/th&gt;
&lt;th&gt;Reasoning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reading comprehension&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Qwen CLI + Qwen Coder&lt;/td&gt;
&lt;td&gt;Fast, fewer hallucinations, low cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Analysis scripts&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Claude Code&lt;/td&gt;
&lt;td&gt;Deep reasoning; surfaces statistical angles you wouldn't have considered&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Report generation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cherry Studio + Claude Sonnet + specific template&lt;/td&gt;
&lt;td&gt;Consistent design style, avoids the "AI flavor"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Common AI Tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ChatWise&lt;/strong&gt; -- Multi-model chat client&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DeepSeek R1&lt;/strong&gt; -- Reasoning model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini 2 Flash Thinking&lt;/strong&gt; -- Fast reasoning model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repomix&lt;/strong&gt; -- Packs an entire codebase into a single file for feeding to AI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spark Desktop&lt;/strong&gt; -- Desktop AI assistant&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Cherry Studio Design Workflow
&lt;/h2&gt;

&lt;p&gt;Cherry Studio + HTML enables rapid page design by "card drawing": generate many candidates, then keep the best. The core approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Like image generation, try to &lt;strong&gt;generate multiple outputs at once&lt;/strong&gt;, then pick the best&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;HTML or SVG&lt;/strong&gt; for rendering&lt;/li&gt;
&lt;li&gt;To reduce response size, require AI to use &lt;strong&gt;TailwindCSS&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Reference established design systems like Ant Design or Shadcn UI as background knowledge&lt;/li&gt;
&lt;/ul&gt;
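&lt;p&gt;The "generate many, pick the best" loop above can be sketched generically. Both &lt;code&gt;generate&lt;/code&gt; and &lt;code&gt;score&lt;/code&gt; here are hypothetical stand-ins (your model call and your selection criterion), not any real Cherry Studio API:&lt;/p&gt;

```python
# Sketch only: request n candidate outputs for one prompt and keep the
# top-scoring one. `generate` and `score` are placeholders you supply.
from typing import Callable

def best_of_n(generate: Callable[[str], str],
              score: Callable[[str], float],
              prompt: str, n: int = 4) -> str:
    """Draw n candidates for the same prompt and return the best."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)
```

&lt;p&gt;In practice &lt;code&gt;score&lt;/code&gt; is usually a human eyeball pass over the rendered HTML, but a heuristic (validity check, length, lint pass) can pre-filter.&lt;/p&gt;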

&lt;h3&gt;
  
  
  UI/UX Designer System Prompt Template
&lt;/h3&gt;

&lt;p&gt;Here is a field-tested UI/UX Design System Prompt suitable for Cherry Studio or similar tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Role
UI/UX Designer Expert

## Notes
1. Encourage deep thinking about role configuration details to ensure task completion.
2. Expert design should consider the user's needs and concerns.
3. Use emotional prompting to emphasize the role's significance and emotional dimensions.

## Personality Type
INTJ (Introverted, Intuitive, Thinking, Judging)

## Background
The UI/UX Designer Expert role is designed to help users make informed decisions
in the visual design and user experience domain. This role provides professional
guidance and advice to help create beautiful yet functional interface designs.

## Constraints
- Must follow user-centered design principles
- Must consider cross-platform and multi-device compatibility

## Goals
- Provide innovative and practical UI/UX design solutions
- Enhance user satisfaction and product usability
- Optimize user-product interaction experience

## Skills
1. Visual design capability
2. User research and analysis
3. Interaction design
4. Technical implementation

## Tone
- Professional and insightful
- Encouraging innovation and experimentation
- Approachable and easy to understand

## Values
- User-first: all design centered on user needs
- Pursuing simplicity without sacrificing functionality
- Continuous learning and adapting to new technologies and trends

## Workflow
1. Understand user requirements and goals
2. Conduct market research and competitive analysis
3. Determine design direction and style
4. Create prototypes and interaction flows
5. Conduct user testing and collect feedback
6. Iterate based on feedback
7. Deliver high-quality design output

# Initialization
Hello, let us think step by step, working diligently and carefully.
Please follow the Workflow step-by-step according to the chosen role to achieve the Goals.
This is very important to me -- please help. Thank you! Let's begin.

# Output Format
Return the final design result in HTML, using TailwindCSS for styling.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Core Questions for AI-Assisted Development
&lt;/h2&gt;

&lt;p&gt;When promoting AI programming within a team, consider these questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tool comparison&lt;/strong&gt;: What are the respective use cases for Cursor's built-in browser Agent vs. chrome-devtools-mcp?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product form&lt;/strong&gt;: What distinguishes Claude Code and Codex from an IDE?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team generalization&lt;/strong&gt;: Can current AI programming practices scale to other team members?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engineering standards&lt;/strong&gt;: What are the engineering standards? How are they established? Are they high-cohesion, low-coupling?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI as challenger&lt;/strong&gt;: Have AI raise the questions I missed and fill in blind spots -- AI as Code Reviewer and proposal challenger&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tech Lead perspective&lt;/strong&gt;: What questions does a Tech Lead care about? How do I answer them?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory management&lt;/strong&gt;: How to effectively maintain context with Cursor Memory Bank?&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Prompt Template Library
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Code Reading
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;What does this code implement? Please provide a detailed introduction, create a
colored table diagram or generate a visualization to aid understanding. Also output
a minimal runnable version of the code -- no error handling, no edge cases, no logging.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Article / Note Organization
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Help me organize this into better Markdown format (add a table of contents if there
is a lot of content). Please ensure no content is lost; minor additions are fine.
Organize everything in Markdown so I can copy and use it directly.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Book Report
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Help me organize this into better Markdown format. Please ensure no content is lost;
minor additions are fine. Organize everything in Markdown so I can copy and use it
directly. Required format:

A 50-word summary

---

## What I Liked

## What I Disliked

## Key Takeaways
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Paper Reading
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;List the distinctive methods used in this paper. Compare them with previous techniques.
Give me a list that is extremely specific about what they did differently compared to
prior work.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Codebase Improvement
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hello AI, here is my entire codebase. Tell me 10 ideas for how I can improve it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  High-Quality Answer Pre-check
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Don't rush to answer my question yet. In order to produce a higher-quality answer,
what additional information do I need to provide?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Writing Style Switches
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;"Please rewrite this article in a &lt;strong&gt;Hemingway&lt;/strong&gt; style." -- Short sentences, direct, powerful&lt;/li&gt;
&lt;li&gt;"Rewrite it in the style of &lt;strong&gt;Stephen King&lt;/strong&gt;'s &lt;em&gt;On Writing&lt;/em&gt;." -- Narrative drive, rhythm, vivid imagery&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://docs.cursor.com/" rel="noopener noreferrer"&gt;Cursor Documentation&lt;/a&gt; — Official documentation for Cursor AI-powered code editor&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.github.com/en/copilot" rel="noopener noreferrer"&gt;GitHub Copilot Documentation&lt;/a&gt; — Official GitHub Copilot documentation covering setup and best practices&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/" rel="noopener noreferrer"&gt;Prompt Engineering for Developers — DeepLearning.AI&lt;/a&gt; — Free course on prompt engineering techniques for software development&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;p&gt;If you're wondering how AI agents differ from the coding assistants you use daily, my &lt;a href="https://maoxunxing.com/ai-agent-guide/" rel="noopener noreferrer"&gt;AI Agent Beginner's Guide&lt;/a&gt; breaks down Agent vs Copilot architectures, MCP protocols, and when to use each pattern.&lt;/p&gt;

&lt;p&gt;When AI boosts individual productivity 10x, the real question becomes how your career adapts. &lt;a href="https://maoxunxing.com/ai-rewriting-workflow/" rel="noopener noreferrer"&gt;AI Is Rewriting the Playbook&lt;/a&gt; covers the structural shift from executor to leverage designer with a 30-day action checklist.&lt;/p&gt;

&lt;p&gt;If you want to systematize your learning the way Karpathy does, my &lt;a href="https://maoxunxing.com/karpathy-knowledge-base-practice/" rel="noopener noreferrer"&gt;Git-based Knowledge Base pipeline&lt;/a&gt; shows how to build a three-layer raw→notes→posts workflow with LLM compilation.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Felix Mao | &lt;a href="https://maoxunxing.com" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt; | &lt;a href="https://twitter.com/maoxunxing" rel="noopener noreferrer"&gt;@maoxunxing&lt;/a&gt; | &lt;a href="https://github.com/XingMXTeam/" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>coding</category>
      <category>productivity</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Node.js CPU Spike Analysis: When Requests Hang and Event Loop Starves</title>
      <dc:creator>Xunxing Mao</dc:creator>
      <pubDate>Tue, 14 Apr 2026 01:27:23 +0000</pubDate>
      <link>https://dev.to/xunxing_mao_fac71e331fd4b/nodejs-cpu-spike-analysis-when-requests-hang-and-event-loop-starves-5dno</link>
      <guid>https://dev.to/xunxing_mao_fac71e331fd4b/nodejs-cpu-spike-analysis-when-requests-hang-and-event-loop-starves-5dno</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article was originally published on &lt;a href="https://maoxunxing.com/node-event-loop-cpu-spike/" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt;. Follow me there for more deep dives on Node.js, AI, and frontend engineering.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;In production environments, we often encounter a peculiar phenomenon: &lt;strong&gt;CPU usage suddenly spikes to 100% while the application appears to be "doing nothing."&lt;/strong&gt; No active computations, no heavy processing—just hanging requests and a frozen event loop.&lt;/p&gt;

&lt;p&gt;This article analyzes two real-world cases from a large-scale operation platform:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;RPC batch processing timeout causing CPU spikes&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Message queue subscriber CPU anomalies without rate limiting&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Case 1: RPC Batch Processing - The Silent Killer
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Scenario
&lt;/h3&gt;

&lt;p&gt;We have a &lt;code&gt;getBatchCompleteModuleDiff&lt;/code&gt; method that processes 100 components in batches. Each component triggers 4-5 RPC calls to backend services.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before optimization - Serial processing, NO timeout&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;getBatchCompleteModuleDiff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;componentIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;componentIds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// 20 batches for 100 components&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunkItem&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;  &lt;span class="c1"&gt;// Serial execution!&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nx"&gt;chunkItem&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getCompleteModuleDiffInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
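
&lt;p&gt;To see why the serial loop hurts, here is a minimal timing sketch in plain JS with a simulated 20ms RPC latency (the helper names and numbers are illustrative, not from the real platform):&lt;/p&gt;

```javascript
// Illustrative timing sketch: serial batches pay the RPC latency once per batch,
// parallel batches pay it roughly once overall. fakeRpc is a stand-in, not the real call.
const chunk = (arr, size) =>
  Array.from({ length: Math.ceil(arr.length / size) }, (_, i) => arr.slice(i * size, i * size + size))

const fakeRpc = () => new Promise(resolve => setTimeout(resolve, 20))  // simulated 20ms RPC

async function serialBatches(ids) {
  for (const batch of chunk(ids, 5)) {
    await Promise.all(batch.map(fakeRpc))  // 20 sequential awaits for 100 ids
  }
}

async function parallelBatches(ids) {
  await Promise.all(chunk(ids, 10).map(batch => Promise.all(batch.map(fakeRpc))))  // all batches in flight
}

async function main() {
  const ids = Array.from({ length: 100 }, (_, i) => i)
  let t = Date.now()
  await serialBatches(ids)
  console.log(`serial:   ${Date.now() - t}ms`)   // roughly 20 batches × 20ms
  t = Date.now()
  await parallelBatches(ids)
  console.log(`parallel: ${Date.now() - t}ms`)   // roughly one RPC round-trip
}

main()
```

&lt;p&gt;With real RPCs and no timeout, one hung batch stalls the serial loop indefinitely—while new incoming requests keep starting loops of their own.&lt;/p&gt;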



&lt;h3&gt;
  
  
  Why CPU Spikes When Requests Hang
&lt;/h3&gt;

&lt;p&gt;The root cause is &lt;strong&gt;Event Loop Starvation&lt;/strong&gt; caused by the combination of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No timeout control&lt;/strong&gt; - RPC calls could hang indefinitely&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Promise accumulation&lt;/strong&gt; - 20 batches × 5 components × 4 RPC calls = 400+ concurrent Promises&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event loop blocking&lt;/strong&gt; - All Promises compete for event loop cycles&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A single call works through its batches serially, but without a timeout a hung batch never completes—and as new requests keep arriving, each starts its own loop, so pending Promises pile up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Timeline:
├── Batch 1 starts (5 components × 4 RPC calls = 20 Promises)
├── Batch 1 hangs (backend database exception, no timeout)
├── Batch 2 starts (another 20 Promises)
├── Batch 2 hangs
├── ...
├── Batch 20 starts (another 20 Promises)
├── Event Loop: 400+ pending Promises waiting
└── CPU: 100% (event loop constantly checking Promise states)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CPU isn't doing useful work—it's &lt;strong&gt;burning cycles on the scheduling overhead of hundreds of pending operations&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// After optimization - Parallel batches with timeout&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;getBatchCompleteModuleDiff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;componentIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;componentIdsChunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;componentIds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// Larger chunks&lt;/span&gt;

  &lt;span class="c1"&gt;// Parallel processing with timeout protection&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;batchPromises&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;componentIdsChunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;promises&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; 
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;withTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getCompleteModuleDiffInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// 5s timeout prevents indefinite hanging&lt;/span&gt;
        &lt;span class="s2"&gt;`timeout for id: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;allSettled&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;promises&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// Isolate failures&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;batchPromises&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;withTimeout&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;promise&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;timeoutMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;errorMsg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timeoutId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;errorMsg&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="nx"&gt;timeoutMs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;promise&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;timeoutId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;timeoutId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
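
&lt;p&gt;Stripped of the class context, the wrapper behaves like this (plain JS sketch with illustrative timings):&lt;/p&gt;

```javascript
// Same timeout wrapper as above, as a standalone function
function withTimeout(promise, timeoutMs, errorMsg) {
  return new Promise((resolve, reject) => {
    const timeoutId = setTimeout(() => reject(new Error(errorMsg)), timeoutMs)
    promise
      .then(result => { clearTimeout(timeoutId); resolve(result) })
      .catch(error => { clearTimeout(timeoutId); reject(error) })
  })
}

const fastRpc = new Promise(resolve => setTimeout(() => resolve('diff ready'), 10))  // finishes in time
const hungRpc = new Promise(resolve => setTimeout(() => resolve('too late'), 200))   // simulates a hang

withTimeout(fastRpc, 100, 'timeout for fast call').then(v => console.log(v))          // prints "diff ready"
withTimeout(hungRpc, 50, 'timeout for hung call').catch(e => console.log(e.message))  // prints "timeout for hung call"
```

&lt;p&gt;Wrapping each call this way turns an indefinite hang into a bounded failure that &lt;code&gt;Promise.allSettled&lt;/code&gt; can record without sinking the rest of the batch.&lt;/p&gt;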



&lt;h3&gt;
  
  
  Results
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Response Time&lt;/td&gt;
&lt;td&gt;60s+&lt;/td&gt;
&lt;td&gt;5-8s&lt;/td&gt;
&lt;td&gt;87% ↓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU Usage&lt;/td&gt;
&lt;td&gt;90-100%&lt;/td&gt;
&lt;td&gt;40-50%&lt;/td&gt;
&lt;td&gt;50% ↓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Timeout Control&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;5s&lt;/td&gt;
&lt;td&gt;Prevents hanging&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Case 2: Message Queue Subscriber - The Rate Limiting Problem
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Scenario
&lt;/h3&gt;

&lt;p&gt;Our &lt;code&gt;CheckTaskResultSubscriber&lt;/code&gt; processes message queue messages for inspection task results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;MessageSubscriber&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TASK_RESULT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CheckTaskResultSubscriber&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;IMessageSubscriber&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;Inject&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="nx"&gt;checkReportService&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CheckReportService&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SubscribeMessage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;messageBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;INSPECTOR_TAGS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messageTag&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handleInspectorResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messageBody&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// DB operations&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DETECTION_TAGS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messageTag&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handleDetectionResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messageBody&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// DB operations&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why CPU Spikes Without Rate Limiting
&lt;/h3&gt;

&lt;p&gt;By default, message queue consumers &lt;strong&gt;pull messages as fast as possible&lt;/strong&gt;. The failure chain looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Message burst occurs&lt;/strong&gt; (e.g., 1000 tasks complete simultaneously)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Each message triggers DB operations&lt;/strong&gt; (queries, updates)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No concurrency control&lt;/strong&gt; - all messages processed concurrently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection pool exhaustion&lt;/strong&gt; - DB connections max out&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event loop saturated&lt;/strong&gt; - waiting on I/O operations
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Message Burst (1000 messages)
    ↓
No Rate Limiting
    ↓
1000 concurrent async operations
    ↓
DB Connection Pool (max 50) exhausted
    ↓
950 operations waiting for connections
    ↓
Event Loop: Constantly polling/waiting
    ↓
CPU: 100% (context switching + polling overhead)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why This Happens in Node.js
&lt;/h3&gt;

&lt;p&gt;Unlike Java's thread pool model, Node.js uses a &lt;strong&gt;single-threaded event loop&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Java&lt;/strong&gt;: Thread pool limits concurrent execution naturally (e.g., 50 threads = max 50 concurrent operations)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node.js&lt;/strong&gt;: No natural limit—can create unlimited Promises that all compete for the event loop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When thousands of Promises wait for I/O (DB connections), the waiting isn't free: the event loop must keep servicing the pool-queue callbacks, timers, and retry logic behind every queued operation, so CPU climbs even though no request makes progress.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Missing Rate Limiting
&lt;/h3&gt;

&lt;p&gt;Looking at our message queue configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// config.prod.ts&lt;/span&gt;
&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messageQueue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;enableDefaultProducer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pub&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="c1"&gt;// NO sub configuration! Consumer uses default settings&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The subscriber has &lt;strong&gt;no concurrency control&lt;/strong&gt; at either level:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Message queue level (no prefetch limit)&lt;/li&gt;
&lt;li&gt;Application level (no semaphore/bottleneck)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Recommended Solutions
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Add Concurrency Control in Subscriber
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Semaphore&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;async-mutex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="p"&gt;@&lt;/span&gt;&lt;span class="nd"&gt;MessageSubscriber&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TASK_RESULT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CheckTaskResultSubscriber&lt;/span&gt; &lt;span class="k"&gt;implements&lt;/span&gt; &lt;span class="nx"&gt;IMessageSubscriber&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Limit concurrent processing to 10&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;semaphore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Semaphore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SubscribeMessage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;semaphore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;runExclusive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Process message&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;processMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  2. Use p-limit for Simpler Control
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;pLimit&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;p-limit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pLimit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// Max 5 concurrent&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SubscribeMessage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;processMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
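
&lt;p&gt;If adding a dependency isn't an option, the same guarantee fits in a few lines—a simplified sketch of what &lt;code&gt;p-limit&lt;/code&gt; does internally:&lt;/p&gt;

```javascript
// Minimal concurrency limiter: at most maxConcurrent tasks run at once,
// the rest wait in a FIFO queue (simplified version of what p-limit provides)
function createLimiter(maxConcurrent) {
  let active = 0
  const queue = []

  const runNext = () => {
    if (active >= maxConcurrent || queue.length === 0) return
    active += 1
    const { task, resolve, reject } = queue.shift()
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => { active -= 1; runNext() })
  }

  return task => new Promise((resolve, reject) => {
    queue.push({ task, resolve, reject })
    runNext()
  })
}

// usage: cap message handlers at 5 concurrent DB-bound operations
const limit = createLimiter(5)
// await limit(() => this.processMessage(msg))
```

&lt;p&gt;The queue makes backpressure explicit: a burst of 1000 messages becomes 5 in flight and 995 waiting cheaply, instead of 1000 fighting for the connection pool.&lt;/p&gt;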



&lt;h4&gt;
  
  
  3. Configure Message Queue Consumer Thread Pool
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messageQueue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;enableDefaultProducer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;pub&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;taskResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;consumerGroup&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CID_TASK_RESULT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;topics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TASK_RESULT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;consumeThreadCount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Limit consumer threads&lt;/span&gt;
      &lt;span class="na"&gt;maxReconsumeTimes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why Requests Hanging Cause CPU Spikes
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Factor&lt;/th&gt;
&lt;th&gt;Explanation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Promise overhead&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Each pending Promise consumes event loop cycles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;No timeout&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hanging requests accumulate indefinitely&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Resource competition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DB connections, memory, file descriptors exhaust&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Polling cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Event loop constantly checks I/O readiness&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context switching&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;V8 engine overhead from Promise state transitions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Prevention Strategies
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Always set timeouts&lt;/strong&gt; for external calls (RPC, HTTP, DB)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implement rate limiting&lt;/strong&gt; for message consumers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;Promise.allSettled&lt;/code&gt;&lt;/strong&gt; instead of &lt;code&gt;Promise.all&lt;/code&gt; to isolate failures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor event loop lag&lt;/strong&gt; as an early warning indicator&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add circuit breakers&lt;/strong&gt; for cascading failure prevention&lt;/li&gt;
&lt;/ol&gt;
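
&lt;p&gt;Point 3 is worth seeing concretely—&lt;code&gt;Promise.all&lt;/code&gt; throws away the survivors, while &lt;code&gt;Promise.allSettled&lt;/code&gt; keeps them (minimal sketch):&lt;/p&gt;

```javascript
// Failure isolation: one bad component should not sink the other results
const tasks = [
  Promise.resolve('component-1 diff'),
  Promise.reject(new Error('backend exception')),
  Promise.resolve('component-3 diff'),
]

// Promise.all(tasks) would reject on the first failure, discarding both successes.
// Promise.allSettled always resolves, with one status record per task:
Promise.allSettled(tasks).then(results => {
  const succeeded = results.filter(r => r.status === 'fulfilled').map(r => r.value)
  const failed = results.filter(r => r.status === 'rejected').length
  console.log(succeeded.length, failed)  // prints: 2 1
})
```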

&lt;h3&gt;
  
  
  Monitoring Checklist
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Event loop lag monitoring&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;eventLoopLag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;event-loop-lag&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;eventLoopLag&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lag&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;  &lt;span class="c1"&gt;// &amp;gt; 100ms indicates problem&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Event loop lag: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lag&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;ms`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Node.js CPU spikes during request hanging are counterintuitive but explainable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It's not the hanging itself&lt;/strong&gt; that consumes CPU&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's the accumulation of waiting Promises&lt;/strong&gt; competing for event loop attention&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout and concurrency control&lt;/strong&gt; are essential defenses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key insight: Node.js requires explicit resource management that other languages handle implicitly through thread pools. Without it, "doing nothing" can consume everything.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article is based on real production incidents from a large-scale operation platform. The optimization reduced P99 latency from 60s to 8s and stabilized CPU usage at 40-50%.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://nodejs.org/en/learn/asynchronous-work/event-loop-timers-and-nexttick" rel="noopener noreferrer"&gt;The Node.js Event Loop — Node.js Documentation&lt;/a&gt; — Official documentation of the Node.js event loop&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://nodejs.org/en/learn/asynchronous-work/dont-block-the-event-loop" rel="noopener noreferrer"&gt;Don't Block the Event Loop — Node.js Guide&lt;/a&gt; — Official guide on avoiding event loop blocking&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.youtube.com/watch?v=PNa9OMajw9w" rel="noopener noreferrer"&gt;Understanding the Node.js Event Loop — YouTube (Bert Belder)&lt;/a&gt; — Deep dive into event loop internals from a Node.js core contributor&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;If you're interested in AI-assisted development workflows, check out my &lt;a href="https://maoxunxing.com/ai-coding-practice/" rel="noopener noreferrer"&gt;AI Coding Playbook&lt;/a&gt;, where I cover tool selection and prompt templates.&lt;/p&gt;

&lt;p&gt;I also wrote about &lt;a href="https://maoxunxing.com/karpathy-knowledge-base-practice/" rel="noopener noreferrer"&gt;building a personal knowledge base using Karpathy's method&lt;/a&gt;, which covers how to organize technical learning into a publishing pipeline.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Felix Mao (毛毛星) | &lt;a href="https://maoxunxing.com" rel="noopener noreferrer"&gt;maoxunxing.com&lt;/a&gt; | &lt;a href="https://twitter.com/maoxunxing" rel="noopener noreferrer"&gt;@maoxunxing&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>node</category>
      <category>performance</category>
      <category>eventloop</category>
      <category>cpu</category>
    </item>
  </channel>
</rss>
