CodeSpeak: When English Isn't Precise Enough for AI
The Problem: Natural Language Is Ambiguous by Design
If you've spent any time working with large language models, you've experienced the frustration. You craft what seems like a perfectly clear prompt, send it to the model, and get back something completely different from what you wanted. You refine, rephrase, add examples—and still the model misinterprets your intent.
This isn't a bug. It's a fundamental limitation of natural language. English, like all human languages, is inherently ambiguous. Context matters. Nuance matters. The same sentence can mean different things to different people—or even to the same person at different times.
When humans communicate, we fill in gaps with shared context, cultural understanding, and real-time clarification. LLMs don't have that luxury. They're interpreting your words based on statistical patterns learned from billions of text examples—and sometimes those patterns lead them astray.
The result? Developers spend hours iterating on prompts, debugging unexpected outputs, and building elaborate workaround systems just to get consistent behavior from their AI tools.
Andrey Breslav, the lead language designer of Kotlin, recognized this problem. His answer is CodeSpeak—a formal specification language designed specifically for communicating with LLMs.
What is CodeSpeak?
CodeSpeak is a domain-specific language that replaces natural language prompts with formal specifications. Instead of describing what you want in English, you write it in a structured, unambiguous format that LLMs can parse precisely.
The core idea is simple but powerful: if we use formal languages to communicate with computers when precision matters, why wouldn't we do the same for AI?
When you write code, you don't describe algorithms in English—you use programming languages with precise semantics. CodeSpeak applies that same principle to LLM interaction. It's not about writing code for the LLM to execute; it's about writing specifications that the LLM can understand unambiguously.
The language is still in active development, but the foundational concepts are already taking shape. CodeSpeak specifications are:
- Structured: Clear hierarchies and relationships
- Typed: Explicit types for inputs and outputs
- Composable: Specifications can reference other specifications
- Validatable: Tools can check specification correctness
Code Examples: Natural Language vs. CodeSpeak
Let's see the difference in action.
Example 1: API Endpoint Specification
Natural Language Prompt:
Create a REST API endpoint for user registration. It should accept
a username, email, and password, validate them, and return a success
message or error details.
This prompt leaves many questions unanswered:
- What validation rules? How long can usernames be? What characters are allowed?
- What password requirements? Minimum length? Special characters?
- What format should the response take? JSON? Status codes?
- Should passwords be hashed? How?
CodeSpeak Specification:
spec UserRegistration {
  endpoint: POST /api/users/register
  input {
    username: string {
      length: 3..32
      pattern: [a-zA-Z0-9_-]+
      unique: true
    }
    email: string {
      format: email
      unique: true
    }
    password: string {
      length: 12..128
      contains: [uppercase, lowercase, digit]
    }
  }
  output {
    success: {
      status: 201
      body: { userId: UUID, message: string }
    }
    error: {
      status: 400 | 409
      body: { field: string, reason: string }
    }
  }
  behavior {
    hash(password) using bcrypt
    create(User) in database
    send(WelcomeEmail) to email
  }
}
The CodeSpeak version is longer, but it's complete. An LLM receiving this specification knows exactly what to build—no guessing, no back-and-forth.
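To see why this matters, consider what a code generator could emit from the input block alone. The following Python sketch transcribes just the validation rules; the field names and limits come from the spec, while everything else (the function name `validate_registration`, the simplified email regex) is illustrative, not part of CodeSpeak itself:

```python
import re

# Validation rules transcribed from the UserRegistration spec above.
USERNAME_RE = re.compile(r"^[a-zA-Z0-9_-]+$")
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately simplified

def validate_registration(username: str, email: str, password: str) -> list:
    """Return a list of {field, reason} errors; empty means the input is valid."""
    errors = []
    if not (3 <= len(username) <= 32):
        errors.append({"field": "username", "reason": "length must be 3..32"})
    elif not USERNAME_RE.match(username):
        errors.append({"field": "username", "reason": "allowed chars: a-z A-Z 0-9 _ -"})
    if not EMAIL_RE.match(email):
        errors.append({"field": "email", "reason": "invalid email format"})
    if not (12 <= len(password) <= 128):
        errors.append({"field": "password", "reason": "length must be 12..128"})
    elif not (any(c.isupper() for c in password)
              and any(c.islower() for c in password)
              and any(c.isdigit() for c in password)):
        errors.append({"field": "password", "reason": "needs uppercase, lowercase, digit"})
    return errors
```

Because each rule is stated once in the spec, two different LLMs generating this endpoint should disagree on very little.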
Example 2: Code Generation Task
Natural Language Prompt:
Write a function that finds the longest palindromic substring in a
given string. Make it efficient.
What does "efficient" mean? O(n²)? O(n)? Different LLMs might interpret this differently, or prioritize different aspects.
CodeSpeak Specification:
spec LongestPalindromicSubstring {
  function: findLongestPalindrome(s: string) -> string
  constraints {
    timeComplexity: O(n)
    spaceComplexity: O(n)
    input: ASCII | Unicode
    output: longest substring that reads same forwards and backwards
  }
  examples {
    "babad" -> "bab" | "aba"
    "cbbd" -> "bb"
    "" -> ""
  }
  behavior {
    return first occurrence if multiple exist
    case sensitive
  }
}
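A specification this tight is also directly checkable: any candidate implementation can be run against the examples block. As a reference point, here is one Python implementation that meets the O(n) constraint using Manacher's algorithm; the function name matches the spec, but this is just one valid answer, not the only one:

```python
def findLongestPalindrome(s: str) -> str:
    """Longest palindromic substring in O(n) time via Manacher's algorithm.

    Ties go to the first occurrence (strict '>' below), matching the spec's
    behavior block. Comparison is case sensitive.
    """
    if not s:
        return ""
    # Interleave sentinels so even- and odd-length palindromes are handled alike.
    t = "#" + "#".join(s) + "#"
    n = len(t)
    radius = [0] * n       # radius[i] = palindrome radius centered at t[i]
    center = right = 0     # center and right edge of the rightmost-reaching palindrome
    best_len = best_start = 0
    for i in range(n):
        if i < right:
            # Reuse the mirrored radius inside the known palindrome.
            radius[i] = min(right - i, radius[2 * center - i])
        while (i - radius[i] - 1 >= 0 and i + radius[i] + 1 < n
               and t[i - radius[i] - 1] == t[i + radius[i] + 1]):
            radius[i] += 1
        if i + radius[i] > right:
            center, right = i, i + radius[i]
        if radius[i] > best_len:            # strict '>' keeps the first occurrence
            best_len = radius[i]
            best_start = (i - radius[i]) // 2  # map back to an index in s
    return s[best_start:best_start + best_len]
```

Running the spec's own examples against this function is exactly the kind of validation tooling CodeSpeak aims to enable.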
Example 3: Data Transformation
Natural Language Prompt:
Convert this CSV to JSON format
This is dangerously underspecified. Should the output be an array of objects or an object of arrays? How should numbers, booleans, and empty cells be handled?
CodeSpeak Specification:
spec CsvToJson {
  input: CSV with headers
  output: JSON array
  schema {
    each_row -> {
      [header1]: value as string,
      [header2]: value as number if numeric,
      [header3]: value as boolean if "true"/"false"
    }
  }
  options {
    null_values: ["", "NULL", "N/A"]
    trim_whitespace: true
  }
}
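With the spec in hand, the transformation becomes mechanical. A short Python sketch using only the standard library might look like this (the names `coerce` and `csv_to_json` are illustrative, and for simplicity it applies the typing rules to every column rather than per-header):

```python
import csv
import io
import json

NULL_VALUES = {"", "NULL", "N/A"}   # from the spec's options block

def coerce(value: str):
    """Apply the spec's value typing: null set, then booleans, numbers, else string."""
    value = value.strip()           # trim_whitespace: true
    if value in NULL_VALUES:
        return None
    if value in ("true", "false"):
        return value == "true"
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value

def csv_to_json(csv_text: str) -> str:
    """Convert CSV with a header row into a JSON array of objects."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return json.dumps([{k: coerce(v) for k, v in row.items()} for row in rows])
```

Every decision the natural-language prompt left open, including the null sentinel values and whitespace handling, traces back to one line of the spec.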
Real-World Scenarios
Scenario 1: AI Code Review Assistants
When integrating AI into code review pipelines, consistency is critical. You want the same code to get similar feedback every time—not wildly different interpretations of "check for bugs."
CodeSpeak enables:
spec CodeReview {
  target: PullRequest
  checks {
    security: scan for OWASP Top 10 patterns
    performance: flag O(n²) or worse algorithms
    style: enforce team conventions
    coverage: require 80% for new code
  }
  output: {
    severity: critical | warning | suggestion
    location: file:line
    fix: optional code suggestion
  }
}
Every review follows the same rules. No deviation. No creative interpretations.
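A structured output schema also lets the pipeline reject model responses that drift from the contract. A minimal Python sketch of such a check (the constants mirror the spec's output block; the function name `is_valid_finding` is illustrative):

```python
SEVERITIES = {"critical", "warning", "suggestion"}   # from the spec's output block

def is_valid_finding(finding: dict) -> bool:
    """Check one review finding against the CodeReview output schema."""
    if finding.get("severity") not in SEVERITIES:
        return False
    # location must look like "file:line"
    file, sep, line = finding.get("location", "").rpartition(":")
    if not sep or not file or not line.isdigit():
        return False
    # "fix" is optional, but must be a string when present
    return isinstance(finding.get("fix", ""), str)
```

Findings that fail the check can be retried or dropped instead of silently polluting the review.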
Scenario 2: Automated Documentation Generation
Documentation needs consistent structure across your codebase. Natural language prompts drift over time.
CodeSpeak maintains consistency:
spec GenerateDocs {
  input: TypeScript function
  output {
    summary: one-line description
    parameters: table of { name, type, description }
    returns: { type, description }
    throws: list of possible errors
    examples: 1-2 usage examples
  }
  style {
    tone: technical but friendly
    format: JSDoc
  }
}
Scenario 3: Agent System Prompts
For autonomous AI agents, the system prompt is everything. Ambiguity leads to unpredictable behavior.
CodeSpeak can define agent behavior:
spec DataAnalysisAgent {
  role: Data Analysis Assistant
  capabilities {
    - read CSV, JSON, Excel files
    - perform statistical analysis
    - create visualizations
    - explain findings in plain English
  }
  constraints {
    - never modify source files
    - always show methodology
    - flag uncertainty in conclusions
  }
  communication {
    format: markdown with code blocks
    tone: objective, data-driven
    ask_clarification: when user intent unclear
  }
}
FAQ
Q: Do I need to learn another programming language?
A: CodeSpeak is designed to be intuitive for developers. If you know YAML, JSON Schema, or any typed language, you'll recognize the patterns. The learning curve is intentional—it trades some upfront effort for massive gains in reliability.
Q: Will this work with all LLMs?
A: CodeSpeak is designed for LLMs with strong pattern-matching capabilities. GPT-4, Claude, and similar models handle it well. Smaller models may struggle with complex specifications. The language is still evolving, and tooling for validation is in development.
Q: Is this production-ready?
A: As of early 2026, CodeSpeak is in active development. The concepts are solid, but expect changes. It's worth experimenting with now, but production use should wait for stable tooling.
Q: Can I mix CodeSpeak with natural language?
A: Yes! CodeSpeak specifications can include natural language comments and descriptions. Use formal specs where precision matters, natural language where flexibility is acceptable.
Q: What about prompt injection?
A: Formal specifications make prompt injection harder but not impossible. CodeSpeak's structured format means unexpected inputs are more likely to fail validation. But security still requires careful input handling.
Conclusion
CodeSpeak represents a philosophical shift in how we think about AI interaction. We've been treating LLMs like they're just another conversation partner—smart humans who need clear English. But they're not humans. They're pattern matchers working on statistical inference.
The more precisely we can specify what we want, the better they perform. CodeSpeak gives us the vocabulary to do that.
Will it replace all natural language prompting? Probably not. For quick questions and casual use, English works fine. But for production systems, automated pipelines, and any context where reliability matters—formal specifications are the future.
Andrey Breslav helped transform JVM development with Kotlin, bringing modern language design to a platform long dominated by Java. CodeSpeak might do the same for AI interaction—turning an art into an engineering discipline.
The project is open source and actively developed. Check it out at codespeak.dev and see if formal specifications might be the answer to your prompt engineering frustrations.
Have you tried CodeSpeak or similar formal prompting approaches? Share your experiences in the comments!