DmitryGanin

Posted on May 23

Zero-Cost AI in VS Code

#programming #ai #javascript

Zero-Cost AI: Accessing Premium Models in VS Code Without API Keys

How I built a VS Code extension that gives you free access to Qwen-Max and DeepSeek models using only your existing web account: no billing and no API tokens, within the same free-tier limits as the provider's website.

🎯 The Problem with Modern AI Tools

The state of AI development has become expensive:

API keys needed for every provider
Per-token pricing that adds up quickly
Rate limits blocking your workflow
Multiple subscriptions to different services

Premium AI services charge significant monthly fees:

ChatGPT Plus: $20/month
Claude Pro: $20/month
Gemini Advanced: $20/month

That's $720/year for basic access to all three! And you still need separate accounts for each service.

What if there was another way?

💡 The Solution: Browser-Based Authentication

I developed AI Free VSCode — an open-source extension that leverages your existing free tier accounts from AI providers through browser automation.

Key Innovation

Instead of requiring API keys (which often have strict rate limits), the extension:

Uses Playwright to automate a real Chromium browser session
Stores authentication cookies locally
Makes requests through the same web endpoints the provider's own site uses
Gives you full access to the same free tier available on their websites

Result?

✅ Zero cost - uses existing free accounts

✅ No API keys - just sign in once

✅ Higher limits than many free API tiers - same as browsing the website

✅ Native integration - works directly in Copilot Chat

✅ Agent mode - full tool calling support

🏗 Architecture Overview

Extension Structure

ai-free-vscode/
├── src/
│   ├── extension.mjs          # Entry point & commands
│   ├── lmProvider.mjs         # Unified LM provider interface
│   ├── deepseek/
│   │   ├── auth.mjs           # Browser login with Playwright
│   │   ├── client.mjs         # API client implementation
│   │   ├── provider.mjs       # Model logic & session management
│   │   └── config.mjs         # Configuration constants
│   ├── qwen/
│   │   ├── auth.mjs           # Qwen authentication
│   │   ├── client.mjs         # Qwen API client
│   │   └── provider.mjs       # Qwen model implementation
│   ├── utils/
│   │   ├── logger.mjs         # Debug logging
│   │   ├── rateLimiter.mjs    # Rate limiting protection
│   │   ├── responseValidator.mjs
│   │   └── tokenValidator.mjs
│   └── promptUtils.mjs        # Message formatting
├── package.json               # Extension manifest
└── README.md                  # Documentation
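The tree lists a `rateLimiter.mjs` utility but the post does not show it. As a rough idea of what client-side rate limiting protection can look like, here is a minimal sliding-window sketch. The function name `createRateLimiter`, its options, and the injectable `now` clock are all assumptions for illustration, not the project's actual code.

```javascript
// Hypothetical sliding-window limiter; NOT the project's rateLimiter.mjs.
function createRateLimiter({ maxRequests, windowMs, now = Date.now }) {
  const timestamps = []; // start times of requests inside the current window

  return {
    // Returns 0 if a request may start now, otherwise the ms to wait.
    reserve() {
      const t = now();
      // Drop timestamps that have fallen out of the window.
      while (timestamps.length && t - timestamps[0] >= windowMs) {
        timestamps.shift();
      }
      if (timestamps.length < maxRequests) {
        timestamps.push(t);
        return 0;
      }
      // The window frees up when the oldest request leaves it.
      return windowMs - (t - timestamps[0]);
    },
  };
}
```

A caller would check `reserve()` before each request and sleep for the returned number of milliseconds when it is non-zero, which keeps traffic closer to what a person typing into the website would produce.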

Core Components

1. Authentication Flow

The extension registers commands for users to authenticate:

context.subscriptions.push(
  vscode.commands.registerCommand("deepseek.login", async () => {
    await clearProfileSession(); // Clear old session
    const result = await loginAndSaveAuth(); // New login via Playwright
    auth.cookieHeader = result.cookieHeader;
    auth.token = result.token;
  }),
);

Process:

Opens Chromium browser via Playwright
User signs into provider normally
Session cookies captured and stored locally
Cookies used for subsequent API requests
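The last step, turning captured cookies into something a plain HTTP request can send, might look roughly like the sketch below. It assumes cookie objects shaped like the ones Playwright returns from `browserContext.cookies()` (`name`, `value`, `domain`, `expires` in Unix seconds, `-1` for session cookies). The helper name `buildCookieHeader` and the host names in the usage note are hypothetical; the extension's `loginAndSaveAuth` may build its `cookieHeader` differently.

```javascript
// Hypothetical helper; input mirrors Playwright's cookie objects.
function buildCookieHeader(cookies, host, nowSeconds = Date.now() / 1000) {
  return cookies
    .filter((c) => {
      // A leading dot means "this domain and its subdomains".
      const domain = c.domain.replace(/^\./, "");
      const matchesHost = host === domain || host.endsWith("." + domain);
      // Playwright reports session cookies with expires === -1.
      const notExpired = c.expires === -1 || c.expires > nowSeconds;
      return matchesHost && notExpired;
    })
    .map((c) => `${c.name}=${c.value}`)
    .join("; ");
}
```

For example, `buildCookieHeader(cookies, "chat.example.com")` would yield a string suitable for the `Cookie` request header, skipping cookies that belong to other hosts or have already expired.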

2. Unified Provider Interface

All models are unified under a single vendor namespace:

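class AiFreeVscodeChatModelProvider {
  async provideLanguageModelChatResponse(
    model,
    messages,
    options,
    progress,
    token,
  ) {
    // Convert VS Code messages to API format
    const convertedMessages = convertMessages(messages);
    const tools = convertToolSchemas(options?.tools);
    const prompt = messagesToPrompt(convertedMessages, tools);

    // Bridge VS Code's CancellationToken to an AbortSignal
    const controller = new AbortController();
    token.onCancellationRequested(() => controller.abort());
    const signal = controller.signal;

    // Stream each text chunk back to the chat UI
    const onText = (text) =>
      progress.report(new vscode.LanguageModelTextPart(text));

    // `auth` and `onThinking` are assumed to be defined elsewhere in the provider
    const modelId = model.id;

    // Route to appropriate provider
    switch (model.family) {
      case "deepseek":
        await deepseekComplete({ modelId, prompt, auth, onText, signal });
        break;
      case "qwen":
        await qwenComplete({
          modelId,
          prompt,
          auth,
          onText,
          onThinking,
          signal,
        });
        break;
      default:
        throw new Error(`Unsupported model family: ${model.family}`);
    }
  }
}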

Process:

Routes VS Code chat requests to appropriate provider
Handles both DeepSeek and Qwen models
Converts messages to API format
Manages streaming responses
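The provider code calls `messagesToPrompt(convertedMessages, tools)`, which suggests the whole conversation and the tool list are flattened into a single text prompt for the web endpoint. The post does not show that function, so the sketch below is only one plausible layout. The name `flattenToPrompt`, the section labels, and the `name`/`description` tool fields are assumptions.

```javascript
// Hypothetical prompt layout; the project's promptUtils.mjs may differ.
function flattenToPrompt(messages, tools = []) {
  const sections = [];

  // List available tools first so the model sees them before the dialogue.
  if (tools.length > 0) {
    const toolList = tools
      .map((t) => `- ${t.name}: ${t.description}`)
      .join("\n");
    sections.push(`Available tools:\n${toolList}`);
  }

  const labels = { assistant: "Assistant", tool: "Tool result" };
  for (const msg of messages) {
    // Label each turn so the model can tell the speakers apart.
    const label = labels[msg.role] ?? "User";
    sections.push(`${label}: ${msg.content}`);
  }

  return sections.join("\n\n");
}
```

Flattening like this trades structure for compatibility: a chat web endpoint that accepts one text box can still receive multi-turn history and tool descriptions.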

3. Smart Session Management

Maintains conversation continuity with session caching:

const sessionIdCache = new Map();

// `client` and `attempt(sessionId)` are defined elsewhere in the provider;
// `attempt` sends the prompt and resolves to true on success.
async function runComplete({
  modelId,
  prompt,
  auth,
  threadKey,
  messagesCount,
  signal,
}) {
  // Start fresh session for first message in thread
  if (messagesCount === 1) {
    sessionIdCache.delete(threadKey);
  }

  // Try cached session first (for conversation continuity)
  const cachedSessionId = sessionIdCache.get(threadKey);
  if (cachedSessionId) {
    const ok = await attempt(cachedSessionId);
    if (ok) return; // Success!
  }

  // No cached session, or it failed: retry with a new session
  const sessionId = await client.createSession({ signal });
  sessionIdCache.set(threadKey, sessionId);
  await attempt(sessionId);
}
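One thing to watch with a plain `Map` is that it gains an entry for every chat thread and never shrinks. If that becomes a concern, a small least-recently-used cache is one option. The sketch below is an assumption about how that could be done, not something the extension is known to do; `createSessionCache` and `maxEntries` are hypothetical names.

```javascript
// Hypothetical bounded replacement for `new Map()`; evicts the least
// recently used thread once maxEntries is exceeded.
function createSessionCache(maxEntries = 100) {
  const entries = new Map(); // insertion order doubles as recency order

  return {
    get(threadKey) {
      if (!entries.has(threadKey)) return undefined;
      const sessionId = entries.get(threadKey);
      // Re-insert to mark this thread as most recently used.
      entries.delete(threadKey);
      entries.set(threadKey, sessionId);
      return sessionId;
    },
    set(threadKey, sessionId) {
      entries.delete(threadKey);
      entries.set(threadKey, sessionId);
      if (entries.size > maxEntries) {
        // The first key in iteration order is the least recently used.
        entries.delete(entries.keys().next().value);
      }
    },
    delete(threadKey) {
      return entries.delete(threadKey);
    },
    get size() {
      return entries.size;
    },
  };
}
```

Because it exposes the same `get`, `set`, and `delete` calls used in `runComplete`, it could stand in for `sessionIdCache` without changing the calling code.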

🔧 Installation & Setup

Step 1: Install the Extension

Download the latest .vsix file from Releases and install it from the VS Code Extensions panel ("..." menu → "Install from VSIX...").

Or develop locally:

git clone https://github.com/AppsGanin/ai-free-vscode
cd ai-free-vscode
npm install  # installs dependencies + Playwright Chromium

Press F5 to launch in Extension Development Host.

Step 2: Sign In to Provider

Open Command Palette (Cmd+Shift+P / Ctrl+Shift+P)
Run "AI Free VSCode: DeepSeek: Sign In (Playwright)"
A browser window opens automatically
Log in to your account normally
Window closes when session is saved

Repeat for Qwen or other supported providers.

Step 3: Start Chatting

Open the Copilot Chat panel (Ctrl+Cmd+I on macOS, Ctrl+Alt+I on Windows/Linux)
Select your preferred model from dropdown
Start asking questions!

🚀 Supported Models

| Model | ID | Use Case |
| --- | --- | --- |
| DeepSeek V4 | `deepseek-default` | General purpose |
| DeepSeek V4 Expert | `deepseek-expert` | Complex reasoning |
| Qwen2.5-Max | `qwen-max` | Powerful tasks |
| Qwen3.6-Plus | `qwen-plus` | Long documents |
| Qwen3-Max | `qwen3-max` | Flagship quality |
| Qwen3-Coder | `qwen-coder` | Code generation |
| Qwen3.5-Flash | `qwen-flash` | Fastest responses |

All models support tool calling for Agent mode operations like:

File reading/writing
Terminal execution
Multi-step debugging
Code refactoring

🛠 Technical Deep Dive

How Messages Are Processed

1. Message Conversion

VS Code messages are converted to API-compatible format:

import * as vscode from "vscode";

const {
  LanguageModelChatMessageRole,
  LanguageModelTextPart,
  LanguageModelToolCallPart,
  LanguageModelToolResultPart,
} = vscode;

// Join the text parts of a content array, ignoring non-text parts
const textOf = (parts) =>
  parts
    .filter((p) => p instanceof LanguageModelTextPart)
    .map((p) => p.value)
    .join("");

function convertMessages(messages) {
  return messages.flatMap((msg) => {
    // msg.role is an enum value, not a string
    const role =
      msg.role === LanguageModelChatMessageRole.Assistant
        ? "assistant"
        : "user";

    // Handle text content
    const content = textOf(msg.content);

    // Extract tool calls from assistant
    const toolCalls = msg.content
      .filter((p) => p instanceof LanguageModelToolCallPart)
      .map((p) => ({
        id: p.callId,
        type: "function",
        function: { name: p.name, arguments: JSON.stringify(p.input) },
      }));

    // Generate separate "tool" messages for results;
    // a tool result's content is itself an array of parts
    const toolResults = msg.content
      .filter((p) => p instanceof LanguageModelToolResultPart)
      .map((p) => ({
        role: "tool",
        tool_call_id: p.callId,
        content: textOf(p.content),
      }));

    return [
      ...toolResults,
      { role, content, tool_calls: toolCalls.length ? toolCalls : undefined },
    ];
  });
}

2. Tool Call Detection

The system detects markdown fences indicating tool calls:

const TOOL_FENCES = ["```tool_call", "\ntool_call\n{", "tool_call\n{"];

function findFence(str) {
  let best = -1;
  for (const fence of TOOL_FENCES) {
    const idx = str.indexOf(fence);
    if (idx !== -1 && (best === -1 || idx < best)) best = idx;
  }
  return best;
}

// Stream processing
streamBuf += text;
const idx = findFence(streamBuf);
if (idx !== -1) {
  // Emit text before fence, suppress tool call block
  flushStream(streamBuf.slice(0, idx));
  streamBuf = "";
  inToolCall = true;
}

This prevents raw tool call blocks from appearing in the chat UI while still executing them properly.
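A detail worth handling in this kind of stream filter is a fence that arrives split across two chunks, for example two backticks at the end of one chunk and the rest in the next. One way to cope, sketched below under my own assumptions, is to hold back any buffer suffix that could still grow into a fence. `partialFenceLength` and `splitSafeText` are hypothetical helpers, not functions from the extension, and the fence list simply mirrors `TOOL_FENCES` above.

```javascript
// Same markers as TOOL_FENCES; the backticks are built with repeat()
// only to keep this snippet easy to embed in markdown.
const FENCES = ["`".repeat(3) + "tool_call", "\ntool_call\n{", "tool_call\n{"];

// Length of the longest buffer suffix that is a proper prefix of some fence.
function partialFenceLength(buf) {
  let longest = 0;
  for (const fence of FENCES) {
    const max = Math.min(fence.length - 1, buf.length);
    for (let len = max; len > longest; len--) {
      if (buf.endsWith(fence.slice(0, len))) {
        longest = len;
        break;
      }
    }
  }
  return longest;
}

// Splits the buffer into text that is safe to show now and text to hold
// until the next chunk reveals whether a fence is starting.
function splitSafeText(buf) {
  const keep = partialFenceLength(buf);
  return {
    emit: buf.slice(0, buf.length - keep),
    hold: buf.slice(buf.length - keep),
  };
}
```

The check is deliberately conservative: a buffer that merely ends in "t" is held back for one chunk because "t" could be the start of `tool_call`. Any held text still needs to be flushed when the stream ends without a fence appearing.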

3. Thinking Mode Support

For models with explicit reasoning phases:

let thinkingStarted = false;
let thinkingText = "";

const onThinking = async (text) => {
  thinkingText += text;
  thinkingStarted = true;
};

// When content starts, emit thinking as collapsible block
if (thinkingStarted) {
  progress.report(new LanguageModelThinkingPart(thinkingText, "thinking-0"));
}

VS Code displays this as a native collapsible "💭 Thinking" section above responses.

🔐 Security & Privacy

What Happens to Your Data?

✅ Cookies stored locally - Only your machine, encrypted by OS
✅ No cloud storage - We never transmit your credentials
✅ Session isolation - Each provider maintains separate sessions
✅ No telemetry - No usage statistics sent anywhere

Error Handling

try {
  await client.complete({ /* ... */ });
} catch (e) {
  // Graceful handling of various errors
  if (e.isNotSignedIn) {
    vscode.window.showErrorMessage("Please sign in first");
  } else if (e.isAuthError) {
    // Cookie expired - force re-login
    await clearProfileSession();
    throw e;
  } else if (isBizError(e)) {
    // Business logic error with formatted message
    progress.report(
      new vscode.LanguageModelTextPart(formatBizError(e.bizCode, e.bizMsg)),
    );
  } else {
    // Unknown error: rethrow instead of silently swallowing it
    throw e;
  }
}

⚠️ Limitations & Caveats

Important Considerations

Terms of Service - Automating browser sessions may violate provider ToS
Account Risk - Your account could be restricted (use at your own risk)
Stability - Providers can change APIs without notice
Single Session - Only one active user session at a time
No Enterprise Support - Not suitable for corporate compliance requirements

Mitigation Strategies

Use a dedicated account, separate from your main email
Don't abuse the service (reasonable usage only)
Keep extension updated for API changes
Maintain backups of important code/settings

🚀 Real-World Use Cases

Scenario 1: Code Review Assistant

# Ask about potential bugs
User: "Review this Python function for memory leaks:"
[User pastes code]

Assistant: Analyzes code structure, identifies resource leaks,
suggests fixes with explanations

Scenario 2: Database Query Optimization

-- Paste slow query
EXPLAIN SELECT * FROM users WHERE created_at > NOW() - INTERVAL '7 days';

Assistant: Suggests indexing strategies, query rewriting,
and alternative approaches

Scenario 3: Full Stack Debugging

Identify error in terminal
Ask assistant to analyze stack trace
Get root cause explanation
Receive fix suggestion with code example
Apply fix directly in editor

🤝 Contributing

This is an open-source hobby project built by enthusiasts, for enthusiasts.

Ways to contribute:

Fix bugs - See open issues
Add new models - Implement additional AI providers
Improve docs - Clarify setup instructions
Enhance UX - Better error messages, UI improvements
Write tests - Increase coverage for edge cases

Getting started:

git clone https://github.com/AppsGanin/ai-free-vscode
cd ai-free-vscode
npm install
# Edit code, press F5 to test

Contributions welcome! PRs are always appreciated.

📝 Legal Disclaimer

This extension is unofficial and not affiliated with any AI provider.

Use at your own risk - Automating web sessions may violate ToS

No guarantees - May stop working if providers change APIs

No liability - Authors not responsible for consequences

Always review Terms of Service before use.

🎯 Conclusion

AI Free VSCode demonstrates that you don't need expensive API keys or multiple subscriptions to access premium AI capabilities. By leveraging browser automation and existing free tiers, we've created a solution that:

💰 Costs nothing - literally $0 monthly subscription
🚀 Works instantly - sign in once, then re-authenticate only when the session cookie expires
🔒 Respects privacy - all data stays local
🛠️ Integrates seamlessly - native VS Code experience

Whether you're a student learning to code, an indie developer building your startup, or just someone who wants powerful AI tools without breaking the bank - this extension removes financial barriers and puts cutting-edge technology in your hands.

Ready to try it?

👉 Download the extension

⭐ Star the repo if it helps your workflow

📣 Share with fellow developers

Let's democratize AI access together! 🚀

DEV Community