Sathya Prakash MC

Posted on Jan 20

Building an AI-Enhanced API Validator: A Weekend Developer's Journey

#openapi #api #ai #opensource

A practical exploration of creating automated validation tools for OpenAPI specifications, with lessons learned from real-world implementation

High-level validation flow for an OpenAPI linter, showing the core validation pipeline with optional AI-assisted suggestions.

Introduction: When Your Own Code Betrays You

While documenting three microservices built over two weeks, I discovered a troubling pattern: userId, user_id, and UserID appeared across my APIs—three different conventions for the same concept.

If one developer couldn't maintain consistency in a fortnight, what hope existed for distributed teams? This question led to building an open-source API linter that validates OpenAPI specifications and suggests corrections using AI assistance.

This article examines the technical decisions, implementation challenges, and lessons learned from creating this developer tool in the open-source ecosystem.

The Landscape: Existing Tools and Their Limitations

Before building anything new, I surveyed the existing ecosystem of API validation tools.

Spectral, from Stoplight, offers comprehensive rule-based validation. It's powerful and extensible, with a mature plugin system designed for teams needing fine-grained control. The default ruleset contains over 100 rules, ideal for organizations requiring comprehensive coverage.

Redocly provides excellent validation with a polished user interface. The commercial offering includes collaborative features and hosted documentation, targeting enterprise customers with full API lifecycle management needs.

SwaggerHub solves the entire API lifecycle: design, documentation, mocking, and validation. This comprehensive approach works well for organizations ready to adopt a complete platform.

These tools excel at solving API governance at enterprise scale. However, a gap remained for a different use case: individuals and small teams who want immediate, opinionated feedback without onboarding overhead. Not better or worse—just optimized for rapid feedback cycles and minimal configuration for smaller projects.

Design Philosophy: Simplicity as a Feature

The core design emerged from a simple principle: make the common case trivial.

For basic usage, a single command without configuration:

node bin/api-linter.js your-api.yaml

This simplicity masks careful architectural decisions made to support future complexity without imposing it upfront.

Scope Boundaries

This tool intentionally does not attempt to solve:

Full API governance across organizational boundaries
Policy enforcement with approval workflows and compliance tracking
Lifecycle management for API versioning, deprecation, and migration
Team collaboration features like shared workspaces or real-time editing

These are valuable capabilities for enterprises but add complexity unsuitable for the target use case: fast, focused validation for individual developers and small teams.

Modular Validation Architecture

Each validation concern became an independent module:

class DescriptionValidator {
  constructor(config = {}) {
    this.minLength = config.minLength || 10;
  }

  validate(spec) {
    const issues = [];

    for (const [name, schema] of Object.entries(spec.components?.schemas || {})) {
      if (!schema.description) {
        issues.push({
          severity: 'error',
          message: `Schema '${name}' lacks description`,
          path: `components.schemas.${name}`,
          rule: 'description-required'
        });
      }
    }

    return issues;
  }
}

This modularity enables:

Selective execution: Run only relevant validators for specific projects
Custom validators: Teams can add domain-specific rules
Testability: Each validator tests in isolation
Maintainability: Clear boundaries between different validation concerns

The Validator Set

Eleven validators emerged from analyzing common API design issues:

Description Validator: Ensures documentation exists and meets quality thresholds
Common Fields Validator: Enforces consistency for standard fields (IDs, timestamps, amounts)
Error Response Validator: Validates error format consistency
Schema Validator: Checks types, formats, and required fields
Path Validator: Enforces RESTful conventions
Parameter Validator: Validates query and path parameters
Response Validator: Checks HTTP status codes and content types
Security Validator: Ensures authentication requirements are documented
Reference Validator: Detects broken $ref links
Enum Validator: Validates enumeration definitions
Pagination Validator: Ensures consistent pagination patterns

Each validator addresses a specific category of issues observed in real-world API specifications.

Implementation: From Concept to Code

Phase One: Core Validation Engine

The foundation required three components: parsing, validation orchestration, and reporting.

Parsing leverages the battle-tested js-yaml library:

const yaml = require('js-yaml');
const fs = require('fs');

function loadSpec(filePath) {
  try {
    const content = fs.readFileSync(filePath, 'utf8');
    return yaml.load(content);
  } catch (error) {
    throw new Error(`Failed to parse ${filePath}: ${error.message}`);
  }
}

Orchestration runs validators in parallel for performance:

async function validateSpec(spec, validators) {
  const validationPromises = validators.map(validator => 
    Promise.resolve(validator.validate(spec))
  );

  const results = await Promise.all(validationPromises);
  return results.flat();
}

Reporting formats results for human consumption:

function formatIssues(issues) {
  const grouped = groupBy(issues, 'severity');

  console.log(`\nERRORS: ${grouped.error?.length || 0}`);
  console.log(`WARNINGS: ${grouped.warning?.length || 0}`);

  for (const issue of issues) {
    console.log(`\n${issue.severity.toUpperCase()}: ${issue.message}`);
    console.log(`Location: ${issue.path}`);
    if (issue.suggestion) {
      console.log(`Suggestion: ${issue.suggestion}`);
    }
  }
}

Phase Two: AI Integration

The AI component addresses a fundamental limitation of rule-based validation: context. A rule can identify that a field name violates conventions, but understanding the correct alternative requires semantic comprehension.

Google's Gemini API provides this capability:

async function enhanceWithAI(issue, spec) {
  const context = extractRelevantContext(spec, issue.path);

  const prompt = `
You are an API design expert. A validation issue was found:

Issue: ${issue.message}
Location: ${issue.path}
Current value: ${JSON.stringify(context, null, 2)}

Provide:
1. Why this violates best practices
2. The correct value with explanation
3. Code example showing the fix

Be concise and specific.
`;

  const result = await gemini.generateContent(prompt);
  return {
    ...issue,
    aiSuggestion: result.text()
  };
}

Critical design decision: AI remains optional. The tool functions completely offline, with AI as an enhancement rather than a requirement. This respects:

Privacy concerns: Not all API specifications are public. For organizations using Gemini Pro with enterprise agreements, data remains within organizational boundaries and is not used for model training, ensuring confidential specifications stay protected.
Cost sensitivity: Not all users have API budgets for external services
Reliability needs: Network failures should not break core validation functionality

AI Guardrails: Several technical constraints ensure reliability:

AI suggestions are advisory text only—never modify specifications automatically
AI enhancement is non-blocking—failures do not fail builds or validation
Prompts include context size limits to prevent API quota exhaustion
AI output is clearly labeled as suggestions requiring human review
Rate limiting prevents accidental cost escalation on large specification sets

For enhanced capabilities, the architecture supports integration with Retrieval-Augmented Generation (RAG) patterns. Organizations can provide their own documentation and API design standards as context, allowing the AI to make suggestions specifically aligned with internal conventions rather than generic best practices.

Phase Three: Pattern Discovery

Manual configuration scales poorly. Better: learn from existing code.

The discovery tool scans existing API specifications to infer patterns:

async function discoverStandards(apiDirectory) {
  const specs = await loadAllSpecs(apiDirectory);
  const patterns = {
    fieldNaming: analyzeFieldNaming(specs),
    errorFormat: analyzeErrorFormats(specs),
    commonFields: identifyCommonFields(specs),
    descriptionStyle: analyzeDescriptions(specs)
  };

  return generateConfig(patterns);
}

This auto-generated configuration captures team-specific conventions without requiring explicit documentation. The linter then enforces consistency with actual practice rather than theoretical ideals.

User Experience: Multiple Interfaces for Different Needs

Different users have different preferences for interaction.

Command-Line Interface

Developers working in terminals need fast, scriptable tools:

# Basic validation
node bin/api-linter.js api.yaml

# JSON output for CI/CD
node bin/api-linter.js api.yaml --format=json

# Fail build on errors
node bin/api-linter.js api.yaml --fail-on-error

# With AI enhancements
GEMINI_API_KEY=xxx node bin/api-linter.js api.yaml --ai

Web Interface

Non-technical stakeholders benefit from graphical interfaces:

// Simple Express.js server
app.post('/validate', upload.single('spec'), async (req, res) => {
  const spec = yaml.load(req.file.buffer.toString());
  const issues = await validateSpec(spec, validators);
  res.json({ issues });
});

The web UI provides:

Drag-and-drop file upload
Color-coded issue severity
Filterable results table
Downloadable reports

Programmatic API

Other tools can integrate validation:

const { validateSpec, validators } = require('api-linter');

const issues = await validateSpec(mySpec, validators);
if (issues.some(i => i.severity === 'error')) {
  throw new Error('API validation failed');
}

Real-World Application

The tool serves several practical use cases that emerged during development and early adoption.

Startup Teams

Early-stage companies building their first APIs benefit from automated guidance. The linter provides consistent feedback on common mistakes before code review. Teams of 2-5 developers report reduced review time and fewer design inconsistencies.

Solo Developers

Individual developers maintaining multiple services face consistency challenges. The discovery tool learns from existing APIs and ensures new additions match established patterns, even weeks or months later.

Open Source Projects

Contributors joining open-source projects need clear guidelines. The linter provides immediate feedback on whether contributions match project standards, reducing maintainer burden and contributor frustration.

Learning and Education

Junior developers learning API design receive concrete, actionable feedback. Rather than abstract principles, they see specific issues in their code with suggestions for improvement.

Performance Considerations

Early versions suffered from poor performance. Initial implementation took 4-5 seconds to validate a typical API specification, which is unacceptable for developer tools that run frequently.

Optimization efforts focused on three areas:

Lazy Loading: Validators load on demand rather than upfront:

const validators = {
  description: () => require('./validators/description'),
  naming: () => require('./validators/naming'),
  // ... etc
};

// Only load what's needed
const activeValidators = config.enabledValidators.map(
  name => validators[name]()
);

Parallel Execution: Independent validators run concurrently (I/O-bound operations benefit from async execution):

const results = await Promise.all(
  validators.map(v => v.validate(spec))
);

Caching: Parsed YAML caches for multiple validator passes (per-run cache, not persistent):

const specCache = new Map();

function getSpec(filePath) {
  if (!specCache.has(filePath)) {
    specCache.set(filePath, loadSpec(filePath));
  }
  return specCache.get(filePath);
}

These changes reduced validation time to 50-80ms for typical specifications (tested on MacBook Pro M1, Node.js 18.x, ~200-line OpenAPI specs), which is fast enough that developers don't notice the delay.

Lessons from the Field

Several months of usage revealed insights not apparent during initial development.

Error Message Quality Matters More Than Detection

Writing code to detect issues is straightforward. Writing messages that help developers fix those issues requires care and iteration. Poor messages:

Error: Invalid description

Better messages:

Error: Description too short (5 characters, minimum 10)
Location: components.schemas.User.description
Current: "User"
Suggestion: "User account information including authentication credentials and profile data"

The difference: specific location, current value, clear expectation, concrete example.

Configuration Is a Double-Edged Sword

Too little configuration forces users to accept defaults that don't fit. Too much configuration overwhelms with options nobody understands.

The solution: sensible defaults with progressive disclosure. Zero config works for common cases. Advanced users can opt into complexity when needed.

AI Augments, Doesn't Replace, Rules

AI suggestions provide value but can't substitute for deterministic validation. AI might hallucinate incorrect suggestions. Rules provide reliable, repeatable validation. The combination works better than either alone.

Speed Is a Feature

Developers won't use slow tools. A linter that takes 10 seconds loses to manual review. A linter that takes 50 milliseconds becomes part of every workflow. Performance isn't optional for developer tooling.

Open Source Considerations

Releasing this as open source under the MIT license enabled collaboration while keeping complexity manageable.

Documentation by Example: Rather than comprehensive reference documentation, working examples demonstrate usage. Developers copy, modify, and learn by doing.

Conservative Dependencies: The project depends on only two libraries: js-yaml for parsing and express for the web server. Fewer dependencies mean less maintenance burden and easier understanding.

Plugin Architecture: Custom validators extend functionality without forking:

// custom-validator.js
module.exports = {
  name: 'CustomValidator',
  validate(spec) {
    // Custom validation logic
    return issues;
  }
};

// Usage
node bin/api-linter.js api.yaml --plugins ./custom-validator.js

Contribution Guidelines: Clear guidelines for adding validators help contributors understand expectations without requiring extensive back-and-forth.

Future Directions

Several enhancements would increase utility:

IDE Integration: A VS Code extension providing real-time validation as developers edit OpenAPI specifications. Immediate feedback catches issues before saving files.

Expanded AI Capabilities: Beyond field-level suggestions, full API design review. Questions like "Does this API follow RESTful principles?" or "How could this schema be simplified?" benefit from AI analysis.

Additional Schema Formats: Support for GraphQL schemas, AsyncAPI for event-driven architectures, and gRPC proto files. The validation patterns transfer across formats; primarily the parsing logic needs adaptation.

Collaborative Features: Shared standards repositories where teams publish and consume validation configurations. Learn from the broader community's patterns.

However, the current version solves the core problem. Feature additions should enhance, not complicate. The tool remains useful even without further development—a key measure of success for developer tools.

Conclusion

Building this API linter reinforced several principles:

Solve real problems first. Academic correctness matters less than practical utility. The tool validates what developers actually struggle with, not every possible API design consideration.

Start simple, iterate based on usage. The first version had three validators and a CLI. Each addition came from observed needs, not speculation about potential use cases.

Make the right thing easy. Developers choose the path of least resistance. When the linter is easier than manual review, it gets used. When it's harder, it gets ignored.

Open source enables unexpected uses. Publishing the code led to use cases I never considered—educational settings, API governance in enterprises, integration with custom workflows.

The tool isn't revolutionary. It combines existing techniques such as rule-based validation, AI assistance, and pattern discovery in a package optimized for accessibility. Sometimes that's exactly what the ecosystem needs: not groundbreaking innovation, but solid execution that removes barriers to entry.

For teams building APIs, consistency matters. This tool provides one approach to achieving it without significant process overhead. The code is available, the license is permissive, and the architecture supports customization.

If it solves your problem, use it. If it almost solves your problem, modify it. If it doesn't solve your problem, perhaps it illustrates patterns useful for building what you actually need.

That's the beauty of open source: everyone gets to decide for themselves.

About the Author

Sathya Prakash MC is a software engineer passionate about API design, microservices architecture, and building practical developer tools. He creates open-source solutions to solve real-world development challenges, with a focus on automation and improving developer experience.

Contact: itsprakash84@gmail.com

GitHub: https://github.com/itsprakash84

LinkedIn: https://www.linkedin.com/in/sathyaprakash1260/

Project Repository: https://github.com/itsprakash84/api-linter

Resources

Project GitHub: https://github.com/itsprakash84/api-linter
Live Demo: Visit the repository for web interface setup instructions
OpenAPI Specification: https://spec.openapis.org/oas/latest.html
Google Gemini API: https://ai.google.dev/

Keywords: API Validation, OpenAPI, Swagger, Developer Tools, AI Integration, Open Source, Node.js, Microservices, Software Engineering, API Design

DEV Community