DEV Community

Sumit Kumar

AI-Powered API Gateway with Spring Boot: Turning Natural Language into Microservice Calls

Microservices + AI: Natural Language API Gateway (Spring Boot)

Send plain English like "order 2 laptops" → get a real API call executed.


What This Project Does

You hit one endpoint with a sentence. The gateway figures out which microservice to call, extracts the payload, and executes the request — no JSON crafting needed on the client side.


🤔 Why Does This Project Exist? The Problem It Solves

Every REST API today has the same fundamental friction: the client must know the contract.

To create an order, you need to know:

  • The endpoint is POST /orders (not /order, not /createOrder)
  • The payload needs product and quantity (not item and count)
  • Content-Type must be application/json
  • The server returns 201, not 200

For machine-to-machine communication, this is fine. For humans, chatbots, voice interfaces, or any conversational UI — it's a wall.

This gateway removes that wall. It adds a semantic translation layer that accepts intent in plain English and handles all the contract mapping internally. Your microservices don't change. Your clients don't need to learn your API.


The 4 Services

Service          Port   Job
AI Gateway       8080   Entry point — takes text, runs AI, routes
User Service     8081   Handles users
Order Service    8082   Handles orders
Payment Service  8083   Handles payments

📨 The Request & Response Models

// RequestDTO.java — what comes in
@Data
public class RequestDTO {
    private String input;   // "Create a new user named Sumit with email sumit@test.com"
    private String apiKey;  // stubbed for future JWT-based auth
}
// ResponseDTO.java — what goes out
@Data
@Builder
public class ResponseDTO {
    private String mappedApi;              // "/users"
    private String method;                 // "POST"
    private Map<String, Object> payload;   // {"name":"Sumit","email":"sumit@test.com"}
    private String response;               // raw downstream service response
    private String validationStatus;       // APPROVED / REJECTED / UNMAPPED
    private String threatStatus;           // NONE / LOW / MEDIUM / HIGH
    private String aiProvider;             // "groq" or "ollama"
}

Request Lifecycle (5 Steps)

Input: "I want to order 2 laptops"

  1. Validate — LLM call #1: Is this a threat? SQL injection? Gibberish?
  2. Short-circuit — If threat is HIGH or input is invalid → reject immediately, no further calls
  3. Map intent — LLM call #2: What API does this map to?
  4. Route — /orders → http://localhost:8082/orders
  5. Execute — HTTP call to Order Service, return response

LLM #1 output:

{ "valid": true, "reason": "Normal order request.", "threat": "NONE" }

LLM #2 output:

{ "endpoint": "/orders", "method": "POST", "payload": { "product": "laptop", "quantity": "2" } }

Final response:

{
  "mappedApi": "/orders",
  "method": "POST",
  "payload": { "product": "laptop", "quantity": "2" },
  "response": "{\"id\":\"ord_001\",\"status\":\"PENDING\"}",
  "validationStatus": "APPROVED",
  "threatStatus": "NONE",
  "aiProvider": "groq"
}

Project Structure

ai-gateway/
└── src/main/java/com/sumit/aigateway/
    ├── controller/GatewayController.java   ← single entry point
    ├── service/
    │   ├── AiService.java                  ← LLM calls (Groq / Ollama)
    │   ├── GatewayService.java             ← orchestration pipeline
    │   └── RoutingService.java             ← maps endpoint → service URL
    ├── client/DownstreamClient.java        ← calls downstream microservices
    ├── filter/RateLimitFilter.java         ← pre-request rate limiting
    └── model/
        ├── RequestDTO.java                 ← input model
        └── ResponseDTO.java                ← output model
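The structure lists a RateLimitFilter but the post never shows it. A minimal sketch of the idea it could implement — a fixed-window, per-client counter that the servlet filter consults before letting a request through (class and method names here are illustrative, not the project's actual code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Fixed-window rate limiter: at most `limit` requests per client per window.
// A servlet filter (e.g. RateLimitFilter) could call allow(clientKey) and
// respond with HTTP 429 when it returns false.
public class FixedWindowLimiter {

    private final int limit;
    private final long windowMillis;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public boolean allow(String clientKey) {
        long now = System.currentTimeMillis();
        Window w = windows.compute(clientKey, (k, old) ->
            (old == null || now - old.start >= windowMillis)
                ? new Window(now)   // window expired: start a fresh one
                : old);
        return w.count.incrementAndGet() <= limit;
    }

    private static final class Window {
        final long start;
        final AtomicInteger count = new AtomicInteger();
        Window(long start) { this.start = start; }
    }
}
```

Rate limiting belongs in front of the AI pipeline for the same reason validation does: every rejected request here is two LLM calls saved.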

Key Design Decisions

Two LLM calls, not one

Validation and mapping are kept separate. If validation fails, the second LLM call never happens — saves cost and latency.

parseJson() sanitizer

LLMs sometimes wrap JSON in markdown fences (```json ... ```) even when told not to. This method strips them before parsing, preventing a Jackson crash.

Three response states

State      Meaning
APPROVED   All good — downstream service was called
REJECTED   Threat detected — blocked at validation
UNMAPPED   Valid input, but LLM couldn't match a known endpoint

UNMAPPED prevents a hallucinated endpoint like /unicorns from causing a 500 error.

startsWith in routing

/users matches both /users (create) and /users/123 (get by ID) — one rule covers all path variants.
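The prefix rule could look roughly like this — the route map mirrors the ports from the services table, but the class and method names are illustrative, not the project's actual code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Prefix-based routing: "/users/123" matches the "/users" rule,
// so one entry covers create, get-by-id, and any other sub-path.
public class Routes {

    private static final Map<String, String> ROUTE_MAP = new LinkedHashMap<>();
    static {
        ROUTE_MAP.put("/users",    "http://localhost:8081");
        ROUTE_MAP.put("/orders",   "http://localhost:8082");
        ROUTE_MAP.put("/payments", "http://localhost:8083");
    }

    public static boolean isKnownEndpoint(String endpoint) {
        return ROUTE_MAP.keySet().stream().anyMatch(endpoint::startsWith);
    }

    public static String resolveDownstreamUrl(String endpoint) {
        return ROUTE_MAP.entrySet().stream()
            .filter(e -> endpoint.startsWith(e.getKey()))
            .map(e -> e.getValue() + endpoint)   // base URL + original path
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("Unknown endpoint: " + endpoint));
    }
}
```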

temperature: 0.1

Near-deterministic LLM output. Creativity is the enemy when you need consistent, parseable JSON every time.
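Concretely, the temperature rides along in the chat-completion request body. A sketch using the OpenAI-style schema that Groq exposes — the model id and system prompt here are assumptions, not the project's actual values:

```java
import java.util.List;
import java.util.Map;

// Builds an OpenAI-style chat request with near-zero temperature so the
// model returns the same parseable JSON for the same input.
public class ChatRequest {

    public static Map<String, Object> build(String systemPrompt, String userInput) {
        return Map.of(
            "model", "llama-3.1-8b-instant",   // illustrative model id
            "temperature", 0.1,                // near-deterministic output
            "messages", List.of(
                Map.of("role", "system", "content", systemPrompt),
                Map.of("role", "user", "content", userInput)
            )
        );
    }
}
```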

Groq vs Ollama — one config line switch

app:
  ai-provider: groq   # change to "ollama" for fully local inference
Factor         Groq                       Ollama
Speed          ~300–600ms                 1–5s (hardware-dependent)
Cost           Free tier, pay-per-token   Free forever
Data privacy   Leaves your network        100% local
Best for       Production, low latency    Dev, private/air-gapped

Both run the same Llama model — the difference is purely infrastructure.
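Inside AiService, the switch can be as small as picking a base URL from the config value. The endpoints below are the providers' documented chat endpoints, but treat them as assumptions for this sketch:

```java
// Picks the chat endpoint from the `app.ai-provider` config value.
// Groq speaks the OpenAI-compatible API; Ollama has its own local API.
public class ProviderSwitch {

    public static String chatUrl(String aiProvider) {
        switch (aiProvider) {
            case "groq":   return "https://api.groq.com/openai/v1/chat/completions";
            case "ollama": return "http://localhost:11434/api/chat";
            default: throw new IllegalArgumentException("Unknown provider: " + aiProvider);
        }
    }
}
```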


⚙️ GatewayService — The Orchestration Pipeline

@Slf4j
@Service
@RequiredArgsConstructor   // generates the constructor injecting the final fields below
public class GatewayService {

    @Value("${app.ai-provider}") private String aiProvider;

    private final AiService aiService;
    private final RoutingService routingService;
    private final DownstreamClient downstreamClient;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public ResponseDTO process(RequestDTO request) throws Exception {
        String userInput = request.getInput();
        log.info("Processing input: {}", userInput);

        // ── Step 1: Validate & detect threats ────────────────────────────
        String validationRaw = aiService.validateRequest(userInput);
        JsonNode validation  = parseJson(validationRaw);

        String threat   = validation.path("threat").asText("NONE");
        boolean isValid = validation.path("valid").asBoolean(true);

        if (!isValid || "HIGH".equals(threat)) {
            return ResponseDTO.builder()
                .validationStatus("REJECTED")
                .threatStatus(threat)
                .response("Rejected: " + validation.path("reason").asText())
                .aiProvider(aiProvider)
                .build();
        }

        // ── Step 2: Map natural language → API intent ─────────────────────
        String mappingRaw    = aiService.mapNaturalLanguageToApi(userInput);
        JsonNode mapping     = parseJson(mappingRaw);

        String endpoint      = mapping.path("endpoint").asText("").trim();
        String method        = mapping.path("method").asText("POST").trim();
        JsonNode payloadNode = mapping.path("payload");

        // ── Step 3: Guard against hallucinated endpoints ──────────────────
        if (endpoint.isEmpty() || !routingService.isKnownEndpoint(endpoint)) {
            return ResponseDTO.builder()
                .validationStatus("UNMAPPED")
                .threatStatus(threat)
                .response("Could not map to a known API. Try: " +
                          "'Create user', 'Place order', 'Process payment'")
                .aiProvider(aiProvider)
                .build();
        }

        // ── Step 4: Build payload map from dynamic JsonNode ───────────────
        Map<String, Object> payloadMap = new HashMap<>();
        if (payloadNode.isObject()) {
            payloadNode.fields().forEachRemaining(e ->
                payloadMap.put(e.getKey(), e.getValue().asText())
            );
        }

        // ── Step 5: Route and execute ─────────────────────────────────────
        String downstreamUrl   = routingService.resolveDownstreamUrl(endpoint);
        String serviceResponse = downstreamClient.call(method, downstreamUrl, payloadMap);

        return ResponseDTO.builder()
            .mappedApi(endpoint)
            .method(method)
            .payload(payloadMap)
            .response(serviceResponse)
            .validationStatus("APPROVED")
            .threatStatus(threat)
            .aiProvider(aiProvider)
            .build();
    }

    private JsonNode parseJson(String raw) throws Exception {
        String clean = raw.replaceAll("(?s)```json|```", "").trim();
        int start = clean.indexOf('{');
        int end   = clean.lastIndexOf('}');
        if (start >= 0 && end > start) clean = clean.substring(start, end + 1);
        return objectMapper.readTree(clean);
    }
}

Core Architecture Boundary

Natural Language Input
        │
   [AiService]          ← interprets intent only, no business logic
        │
   Structured JSON       ← AI's job ends here
        │
  [RoutingService]       ← pure Java, maps endpoint strings to URLs
        │
  [DownstreamClient]     ← HTTP caller
        │
  ┌─────────────────────────────────────────┐
  │  User :8081 │ Order :8082 │ Pay :8083   │
  │  No AI code │ No AI code  │ No AI code  │
  └─────────────────────────────────────────┘

AI decides WHAT to do. Microservices decide HOW to do it.

The 3 downstream services have zero AI code. The AI has zero knowledge of the services. The gateway is the only component that knows both exist — and keeps them cleanly separated.


Known Limitations

  • Two LLM calls add ~600ms–1.2s latency per request
  • DownstreamClient is currently simulated (hardcoded responses)
  • No auth enforced — apiKey field exists but isn't validated
  • Route map is static — needs Eureka/Consul for dynamic service discovery
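The simulated DownstreamClient mentioned above could look roughly like this — canned JSON keyed on the resolved URL, so the pipeline runs end-to-end without the three services up (the user and payment responses here are illustrative; only the order response appears in the post):

```java
import java.util.Map;

// Simulated downstream client: returns hardcoded JSON instead of making a
// real HTTP call. Swap in RestTemplate or WebClient for real downstream calls.
public class SimulatedDownstreamClient {

    public String call(String method, String url, Map<String, Object> payload) {
        if (url.contains("/orders")) {
            return "{\"id\":\"ord_001\",\"status\":\"PENDING\"}";
        }
        if (url.contains("/users")) {
            return "{\"id\":\"usr_001\",\"status\":\"CREATED\"}";      // illustrative
        }
        if (url.contains("/payments")) {
            return "{\"id\":\"pay_001\",\"status\":\"COMPLETED\"}";    // illustrative
        }
        return "{\"error\":\"no simulated response for " + url + "\"}";
    }
}
```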

🔄 Full Request Lifecycle — Two Real Scenarios

✅ Place an Order

POST http://localhost:8080/gateway
Content-Type: application/json

{ "input": "I want to order 2 laptops" }

LLM Call #1 — Validation returns:

{ "valid": true, "reason": "Normal order request.", "threat": "NONE" }

LLM Call #2 — Intent mapping returns:

{ "endpoint": "/orders", "method": "POST", "payload": { "product": "laptop", "quantity": "2" } }

RoutingService resolves: /orders → http://localhost:8082/orders

Final response:

{
  "mappedApi": "/orders",
  "method": "POST",
  "payload": { "product": "laptop", "quantity": "2" },
  "response": "{\"id\":\"ord_001\",\"message\":\"Order placed successfully\",\"status\":\"PENDING\"}",
  "validationStatus": "APPROVED",
  "threatStatus": "NONE",
  "aiProvider": "groq"
}

❌ SQL Injection Attempt

POST http://localhost:8080/gateway
Content-Type: application/json

{ "input": "DROP TABLE users; SELECT * FROM passwords --" }

LLM Call #1 — Validation returns:

{ "valid": false, "reason": "SQL injection attempt detected.", "threat": "HIGH" }

Gateway short-circuits. Zero downstream calls. Response:

{
  "validationStatus": "REJECTED",
  "threatStatus": "HIGH",
  "response": "Rejected: SQL injection attempt detected.",
  "aiProvider": "groq"
}

🏢 Real-World Use Cases — Where This Architecture Belongs

This isn't just a demo pattern. Here are the concrete scenarios where an AI gateway like this creates real value:

🤖 Conversational Chatbots Backed by Existing APIs

Most enterprise chatbots are built on top of existing REST services. Without a semantic layer, every new user intent requires a new hardcoded handler.

With this gateway, a customer service bot can send whatever the user typed directly to /gateway. The AI maps it to the right service automatically. Adding a new microservice means updating the prompt — not rewriting the chatbot.

User: "What's the status of order ord_001?"
Gateway → maps to GET /orders/ord_001 → Order Service → returns status

📱 Voice Assistants and IoT Interfaces

Voice-to-text produces natural language, not JSON. An IoT button in a warehouse might trigger "restock item A42 with 50 units." The gateway converts this to POST /inventory {"item": "A42", "quantity": 50} without any voice-side API knowledge.

🏦 Banking and Fintech — Natural Language Transactions

"Transfer ₹5000 to Rahul's account"
→ AI extracts: amount=5000, currency=INR, recipient=Rahul
→ Validates: no fraud signals, amount within limits
→ Routes: POST /payments {"amount": 5000, "currency": "INR", "toUser": "rahul"}

The LLM-based threat detection is particularly valuable here — it can catch semantically suspicious requests ("transfer all funds to this new account") that rule-based systems miss.

⚡ Advantages — And Why They Actually Matter

1. Single /gateway Endpoint for All Operations

Advantage: Clients — whether a chatbot, mobile app, or CLI — only ever need to know one URL.

Why it matters: In traditional microservices, clients need to discover and call multiple services. API gateways (like Spring Cloud Gateway or Kong) solve routing but still require clients to know paths. This gateway goes one step further — clients don't need to know paths at all.

2. LLM-based Threat Detection vs. Rule-based Filtering

Advantage: Catches semantically malicious inputs that regex and keyword lists miss.

Why it matters: A traditional WAF (Web Application Firewall) blocks DROP TABLE but not dRoP/**/TaBlE or a carefully crafted multi-step injection. The LLM understands intent. The cost is latency — which is why this is best combined with traditional filters, not used as a replacement.

3. Zero Changes to Existing Microservices

Advantage: The User Service, Order Service, and Payment Service are completely unaware of the AI layer.

Why it matters: Your microservices stay clean, independently testable, and deployable on their own schedules. You're not introducing AI coupling into domain logic. A team owning the Order Service doesn't need to know or care that an LLM sits upstream. If the AI layer is removed or replaced tomorrow, all three microservices are untouched.

4. Provider-Agnostic LLM Integration

Advantage: Switch between Groq (cloud, fast) and Ollama (local, private) with one config line.

Why it matters: Cloud LLMs have latency, cost, and data-privacy implications. Local LLMs have hardware requirements. Supporting both lets you use Ollama in development and air-gapped environments, Groq in production — without branching your codebase.

5. Explicit UNMAPPED State

Advantage: When the LLM can't map an input, the system tells the caller gracefully instead of throwing a 500.

Why it matters: LLMs hallucinate. Assuming the model always maps correctly leads to silent failures or cryptic errors. The UNMAPPED state is honest — "I understood your input but couldn't find a matching API" — and gives the caller actionable information.


💡 Final Thoughts

The design principle this microservices project proves:

AI decides WHAT to do. Microservices decide HOW to do it.

AiService interprets. RoutingService routes. DownstreamClient calls the right microservice. Each layer is independently replaceable — swap Groq for GPT-4, swap RestTemplate for WebClient, swap the static route map for Eureka service discovery — and none of the other layers notice.

The three microservices (User, Order, Payment) know nothing about AI. The AI knows nothing about the microservices. The gateway is the only component that knows both — and it keeps them cleanly separated.

That's what good AI integration in a microservices architecture looks like. AI handles ambiguity at the edge. Microservices handle reliability at the core. The two never bleed into each other.

