Harsh Bavaskar

Posted on Apr 12

I Built a Programming Language That Thinks in English

#programming #css #vscode #ai

Every developer I know thinks in pseudocode first. You sketch the logic in your head — "for each user, if their subscription is expired, send a renewal email" — and then you translate that mental model into whatever language the project demands. Python, Java, TypeScript. Pick your poison.

The translation step is where friction lives. It is where junior developers get stuck, where domain experts give up, and where senior developers lose flow state. We have accepted this as an immutable law of computing: thought and execution speak different languages.

NatLang is my answer to that friction. Not a code completion tool. Not a chat assistant you paste code into. NatLang is a genuine transpilation engine that treats your pseudocode as the primary artifact — your source-of-truth — and produces idiomatic, production-ready code from it.

"Logic should have no syntax barrier between thought and execution."

What NatLang Actually Does

Let me be precise about this, because "AI code generator" is a crowded and often misleading category. Here is the fundamental contract NatLang offers:

Input: Your Pseudocode (.nl)

# You write this in your .nl file:

define a function called processPayment that takes userId, amount
  if amount is less than zero
    return error "invalid amount"
  call validateUser with userId and store in user
  if user is empty
    return error "user not found"
  call chargeCard with user, amount
  return success

Output: Generated Python

def processPayment(userId, amount):
    if amount < 0:
        return "invalid amount"
    user = validateUser(userId)
    if not user:
        return "user not found"
    chargeCard(user, amount)
    return "success"

Press Ctrl+Shift+G. That is the entire workflow. The code streams live into your editor, token by token, while the sidebar shows you what is happening under the hood.

But the generation step is just the entry point. The real architecture is what happens next.

The Architecture: More Than a Wrapper

Most AI coding tools are thin API wrappers. NatLang is a layered system with six distinct subsystems — each solving a real problem that a naive implementation would ignore.

The extension handles the editor side: it captures the selection, streams generated code into the editor, and manages the sidebar dashboard. The Java backend runs separately, housing the agentic pipeline for deep analysis. The AI provider stack is intentionally abstract: the same streaming interface works for Ollama running locally or Claude 3.5 in the cloud.

The Multi-Provider AI Stack

One of NatLang's design decisions I am most proud of: every AI provider implements the same narrow interface, generate(system, user, onToken). That is the entire contract. As a result, the extension can stream tokens from any backend with the same callback pattern.

Ollama: Local, JSON streaming.
Anthropic: Cloud, SSE.
Google Gemini: Cloud, SSE.
Groq: Cloud, SSE.
OpenAI: Cloud, SSE.
Heuristic: Fallback, No network.

The heuristic provider is the quiet workhorse. It requires no API key and makes no network call, so it adds no network latency. When your preferred cloud model goes down mid-session, the Java backend automatically falls back to it and marks the response with a Generated-Fallback step marker so you know exactly what happened.
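To make the contract concrete, here is a minimal sketch of the generate(system, user, onToken) pattern with a fallback, written in Python for brevity (the real fallback lives in the Java backend). The class names, the placeholder tokens, and the plain "Generated" marker are assumptions for illustration; only the three-argument shape and the Generated-Fallback marker come from the description above.

```python
from typing import Callable, List, Protocol

OnToken = Callable[[str], None]


class Provider(Protocol):
    """The narrow contract: stream tokens for a (system, user) prompt pair."""

    def generate(self, system: str, user: str, on_token: OnToken) -> None: ...


class FlakyCloudProvider:
    """Hypothetical stand-in for a cloud provider that is currently down."""

    def generate(self, system: str, user: str, on_token: OnToken) -> None:
        raise ConnectionError("provider unreachable")


class HeuristicProvider:
    """Hypothetical stand-in for the offline fallback: no key, no network."""

    def generate(self, system: str, user: str, on_token: OnToken) -> None:
        # Placeholder tokens; the real heuristic rules are not shown in the post.
        for token in ["pass", "\n"]:
            on_token(token)


def generate_with_fallback(
    primary: Provider,
    fallback: Provider,
    system: str,
    user: str,
    on_token: OnToken,
) -> List[str]:
    """Try the primary provider; on failure, stream from the fallback instead.

    Returns the step markers so the caller can tell which path was taken.
    """
    try:
        primary.generate(system, user, on_token)
        return ["Generated"]  # assumed marker name for the normal path
    except ConnectionError:
        fallback.generate(system, user, on_token)
        return ["Generated-Fallback"]


tokens: List[str] = []
steps = generate_with_fallback(
    FlakyCloudProvider(), HeuristicProvider(), "code only", "return success", tokens.append
)
print(steps)  # ['Generated-Fallback']
```

This sketch does not handle a provider that fails after it has already streamed some tokens; a real implementation would need to decide whether to discard or keep that partial output.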

The Agentic Pipeline: A Full Reasoning System

The POST /api/process endpoint is deceptively simple from the outside. Inside the Java backend, it runs a full orchestration pipeline through CodeAgent, and the behavior changes depending on the action you request.

Provider Resolution: The agent selects the requested provider from a collection-driven in-memory registry. No hardcoded branches.
Code Generation: The prompt builder enforces code-only output: no markdown fences, no comments, real operators, complete Java class structure when required.
Complexity Analysis: The pipeline evaluates time and space complexity of the generated code — not as a gimmick, but as a driver for the next step.
Optimization Pass: If the action is optimize and the first pass still emits quadratic complexity, the agent runs a second optimization cycle automatically.
Explanation and Suggestions: Project-context suggestions are merged with analysis results. The decision log records the full route.
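The steps above can be sketched roughly as follows. This is a simplified Python illustration, not the actual CodeAgent code: the registry is a plain dict, the complexity check is a toy that only counts nested loops, and every name here (process, estimate_complexity, Result, the log strings, the prompt wording) is hypothetical. The explanation and suggestion step is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical provider signature for this sketch: prompt in, code out.
Generate = Callable[[str], str]


@dataclass
class Result:
    code: str
    complexity: str
    decision_log: List[str] = field(default_factory=list)


def estimate_complexity(code: str) -> str:
    """Toy analysis: classify Python code by its deepest loop nesting."""
    loop_indents: List[int] = []
    deepest = 0
    for line in code.splitlines():
        stripped = line.lstrip()
        if not stripped:
            continue
        indent = len(line) - len(stripped)
        # Pop loops whose bodies have ended at this indentation level.
        while loop_indents and indent <= loop_indents[-1]:
            loop_indents.pop()
        if stripped.startswith(("for ", "while ")):
            loop_indents.append(indent)
            deepest = max(deepest, len(loop_indents))
    # Two or more nested loops are reported as quadratic (or worse).
    return {0: "O(1)", 1: "O(n)"}.get(deepest, "O(n^2)")


def process(
    action: str, pseudocode: str, provider_name: str, registry: Dict[str, Generate]
) -> Result:
    log: List[str] = []
    # 1. Provider resolution: a lookup in the registry, no if/else branches.
    generate = registry[provider_name]
    log.append(f"provider:{provider_name}")
    # 2. Code generation with a code-only prompt.
    code = generate(f"Output code only, no markdown fences:\n{pseudocode}")
    log.append("generated")
    # 3. Complexity analysis drives the next step.
    complexity = estimate_complexity(code)
    log.append(f"complexity:{complexity}")
    # 4. Optimization: a second cycle runs only if the first is still quadratic.
    if action == "optimize":
        for attempt in (1, 2):
            code = generate(f"Optimize this code:\n{code}")
            complexity = estimate_complexity(code)
            log.append(f"optimize-pass-{attempt}:{complexity}")
            if complexity != "O(n^2)":
                break
    return Result(code, complexity, log)
```

The point of the design is that the analysis result is used as control flow: the decision log ends up recording not just what was generated, but why a second optimization cycle did or did not run.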

Enterprise-Grade Governance

AI generation is powerful but non-deterministic. For teams that need guaranteed, reproducible output, NatLang ships a deterministic compiler and a policy layer.

Deterministic Compiler: Parses a constrained pseudocode grammar into an AST and emits code for Python, JavaScript, and TypeScript — no model call, no network, no variance. Same input always produces the same output.
Policy-as-Code: A genuine policy engine that can block or warn on generated code patterns before they reach your codebase. Block dangerous patterns like eval(...) or subprocess.Popen, and mandate error handling blocks.
Transactional Edits: Every migration NatLang performs runs inside a transaction. Before any file changes, NatLang snapshots the originals. If anything goes wrong, a single rollback command restores the exact state you were in before.
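To give a feel for what a policy check on generated Python might look like, here is a small sketch built on the standard library's ast module. It is not NatLang's policy engine or its rule format: the function names, the severity labels, and the "function without try/except" rule are assumptions chosen to mirror the examples above (eval, subprocess.Popen, mandatory error handling).

```python
import ast
from typing import List, Tuple

# Calls to block outright, taken from the examples in the text above.
BLOCKED_CALLS = {"eval", "subprocess.Popen"}


def dotted_name(node: ast.AST) -> str:
    """Return 'a.b.c' for Name/Attribute chains, or '' for anything else."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else ""
    return ""


def check_policy(source: str) -> List[Tuple[str, str]]:
    """Return (severity, message) findings for a piece of generated Python."""
    findings: List[Tuple[str, str]] = []
    for node in ast.walk(ast.parse(source)):
        # Rule 1: block calls to dangerous functions.
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in BLOCKED_CALLS:
                findings.append(("block", f"line {node.lineno}: call to {name}"))
        # Rule 2 (assumed form of "mandate error handling"): warn when a
        # function body contains no try statement at all.
        if isinstance(node, ast.FunctionDef):
            has_try = any(isinstance(child, ast.Try) for child in ast.walk(node))
            if not has_try:
                findings.append(
                    ("warn", f"line {node.lineno}: {node.name} has no error handling")
                )
    return findings


generated = "import subprocess\ndef run(cmd):\n    subprocess.Popen(cmd)\n    return eval(cmd)\n"
for severity, message in check_policy(generated):
    print(severity, message)
```

A check like this matches names syntactically, so it would miss an aliased import such as from subprocess import Popen. A production policy layer would need name resolution or a broader pattern list.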

How It Stacks Up

This is not a takedown. GitHub Copilot and Cursor are brilliant tools. But they are solving a different problem — and that distinction matters.

Capability	Copilot	Cursor	NatLang
Pseudocode as Source of Truth	No	No	Core Feature
Swappable AI Providers	OpenAI Only	Partial	Ollama + 4 Cloud
Works Fully Offline	No	Limited	Ollama Default
Complexity Analysis & Optimization	No	No	Full Pipeline
Policy-as-Code Enforcement	No	No	Built-in
Deterministic Output	No	No	Compiler Path
Atomic Rollback Transactions	No	No	Built-in
Price	$10/month	$20/month	Free (Open Source)

The differentiation is not about which AI model is smarter. It is about the mental model of development. NatLang gives you a new primitive — intent-first programming — and builds an entire infrastructure around protecting that intent all the way to production.

Why I Built This

As a second-year AI/ML student navigating advanced software architecture from my setup in Thane, I kept seeing the same problem everywhere. People who understood their systems perfectly in their heads could not express them efficiently in code.

I built NatLang to prove that the translation gap is a solvable engineering problem, not a fundamental limitation of human cognition. The governance features, the deterministic compiler, the agentic pipeline — those came from asking "what would this need to be to actually be used in production?" rather than "what is the simplest demo I can ship?"

The Roadmap & NatLang Pro

NatLang is free and open source today. But a Pro tier is on the horizon, built around the features that teams and power users need most:

Domain Vocabulary Packs: Pre-built lexicons for medical, financial, and logistics domains.
Notebook Mode: A REPL-style interface where pseudocode cells produce executable output cells.
Explainability Layer: Visual trace of every agentic pipeline decision.
Test Generation: Automatically generate unit tests from pseudocode intent before the implementation even exists.

Be the first to know when Pro ships: Join the Waitlist Here
If you believe in intent-first programming, the highest compliment you can give is contributing to the project or giving it a star.

HarshBavaskar / Natlang-Extension

NatLang is a VS Code AI transpilation engine that turns plain-English pseudocode into production-ready code across 30+ languages. Powered by Ollama, Claude 3.5 Sonnet, Gemini 2.0, and GPT-4o, it delivers real-time generation with Ctrl+Shift+G so you can write logic naturally and get real code instantly.

NatLang: The Intelligence-Driven Transpilation Engine

NatLang is a professional-grade Visual Studio Code extension designed to bridge the gap between natural language logic and production-ready implementation. By treating pseudocode as a first-class citizen, NatLang enables developers to architect systems using plain English while the underlying engine handles the complexities of syntax, idioms, and multi-language transpilation.

Technical Ecosystem (Stack)

NatLang is built on a high-performance, distributed architecture ensuring stability and speed.

Frontend/Extension: TypeScript with VS Code API and a Webview-driven Modern Dashboard.
Transpilation Engine: Interface-driven logic supporting SSE (Server-Sent Events) and JSON streaming.
Backend (Agentic AI): Java-based orchestration for deep code analysis and logical validation.
AI Core: Native multi-provider support
- Local: Ollama (default model: gemma3:4b).
- Cloud: Anthropic Claude 3.5, Google Gemini 1.5 Pro/Flash, Groq-hosted Llama, OpenAI GPT-4o.
Build & CI: esbuild, ESLint, GitHub Actions (for deployment).

Core Philosophy

Traditional AI assistants often generate code…


Star NatLang on GitHub

Let me know what you think in the comments. What language would you want NatLang to compile to next?

Top comments (2)

Wes • Apr 13

Defaulting to Ollama for offline-first local generation is a practical call that most AI coding tools skip entirely. But the core philosophy, "no syntax barrier between thought and execution", runs into a problem visible in your own examples. "call validateUser with userId and store in user" has specific grammar: swap "store in" for "save to" or drop "called" from a function definition and it probably breaks. That is syntax. It is just English-shaped syntax, and it is more verbose than what it generates: that pseudocode line is nearly twice as long as user = validateUser(userId).

Someone who can consistently phrase NatLang pseudocode correctly already has the mental model to write Python directly. The real gap between thought and code is not keywords and brackets; it is knowing which abstractions to reach for and how to structure logic around edge cases, and the pseudocode layer does not help with that part. When you watch people actually use this, do they struggle with getting the phrasing right in ways that mirror the syntax problems you are trying to eliminate?

Harsh Bavaskar • Apr 15

Thank you for the detailed feedback. These are valid points and most of them are known issues I am actively working on fixing. To answer your question directly, yes, early users do struggle with phrasing in ways that mirror the exact syntax problems you described, and that is honestly one of the bigger things I am working on right now. The project is still in development and has a long way to go. I genuinely appreciate you taking the time to engage with it this deeply.