DEV Community

Sumit Kumar

AI-Powered API Gateway with Spring Boot: Turning Natural Language into Microservice Calls

Microservices + AI: Natural Language API Gateway (Spring Boot)

Send plain English like "order 2 laptops" → get a real API call executed.


What This Project Does

You hit one endpoint with a sentence. The gateway figures out which microservice to call, extracts the payload, and executes the request — no JSON crafting needed on the client side.


🤔 Why Does This Project Exist? The Problem It Solves

Every REST API today has the same fundamental friction: the client must know the contract.

To create an order, you need to know:

  • The endpoint is POST /orders (not /order, not /createOrder)
  • The payload needs product and quantity (not item and count)
  • Content-Type must be application/json
  • The server returns 201, not 200

For machine-to-machine communication, this is fine. For humans, chatbots, voice interfaces, or any conversational UI — it's a wall.

This gateway removes that wall. It adds a semantic translation layer that accepts intent in plain English and handles all the contract mapping internally. Your microservices don't change. Your clients don't need to learn your API.


The 4 Services

Service          Port   Job
AI Gateway       8080   Entry point — takes text, runs AI, routes
User Service     8081   Handles users
Order Service    8082   Handles orders
Payment Service  8083   Handles payments

📨 The Request & Response Models

// RequestDTO.java — what comes in
@Data
public class RequestDTO {
    private String input;   // "Create a new user named Sumit with email sumit@test.com"
    private String apiKey;  // stubbed for future JWT-based auth
}
// ResponseDTO.java — what goes out
@Data
@Builder
public class ResponseDTO {
    private String mappedApi;              // "/users"
    private String method;                 // "POST"
    private Map<String, Object> payload;   // {"name":"Sumit","email":"sumit@test.com"}
    private String response;               // raw downstream service response
    private String validationStatus;       // APPROVED / REJECTED / UNMAPPED
    private String threatStatus;           // NONE / LOW / MEDIUM / HIGH
    private String aiProvider;             // "groq" or "ollama"
}

Request Lifecycle (5 Steps)

Input: "I want to order 2 laptops"

  1. Validate — LLM call #1: Is this a threat? SQL injection? Gibberish?
  2. Short-circuit — If threat is HIGH or input is invalid → reject immediately, no further calls
  3. Map intent — LLM call #2: What API does this map to?
  4. Route — /orders → http://localhost:8082/orders
  5. Execute — HTTP call to Order Service, return response

LLM #1 output:

{ "valid": true, "reason": "Normal order request.", "threat": "NONE" }

LLM #2 output:

{ "endpoint": "/orders", "method": "POST", "payload": { "product": "laptop", "quantity": "2" } }

Final response:

{
  "mappedApi": "/orders",
  "method": "POST",
  "payload": { "product": "laptop", "quantity": "2" },
  "response": "{\"id\":\"ord_001\",\"status\":\"PENDING\"}",
  "validationStatus": "APPROVED",
  "threatStatus": "NONE",
  "aiProvider": "groq"
}

Project Structure

ai-gateway/
└── src/main/java/com/sumit/aigateway/
    ├── controller/GatewayController.java   ← single entry point
    ├── service/
    │   ├── AiService.java                  ← LLM calls (Groq / Ollama)
    │   ├── GatewayService.java             ← orchestration pipeline
    │   └── RoutingService.java             ← maps endpoint → service URL
    ├── client/DownstreamClient.java        ← calls downstream microservices
    ├── filter/RateLimitFilter.java         ← pre-request rate limiting
    └── model/
        ├── RequestDTO.java                 ← input model
        └── ResponseDTO.java                ← output model
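The structure lists a RateLimitFilter but the post never shows it. A minimal sketch of the idea it could implement — a fixed-window, per-client counter that the servlet filter consults before letting a request through (class and method names here are illustrative, not the project's actual code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Fixed-window rate limiter: at most `limit` requests per client per window.
// A servlet filter (e.g. RateLimitFilter) could call allow(clientKey) and
// respond with HTTP 429 when it returns false.
public class FixedWindowLimiter {

    private final int limit;
    private final long windowMillis;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public boolean allow(String clientKey) {
        long now = System.currentTimeMillis();
        Window w = windows.compute(clientKey, (k, old) ->
            (old == null || now - old.start >= windowMillis)
                ? new Window(now)   // window expired: start a fresh one
                : old);
        return w.count.incrementAndGet() <= limit;
    }

    private static final class Window {
        final long start;
        final AtomicInteger count = new AtomicInteger();
        Window(long start) { this.start = start; }
    }
}
```

Rate limiting belongs in front of the AI pipeline for the same reason validation does: every rejected request here is two LLM calls saved.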

Key Design Decisions

Two LLM calls, not one

Validation and mapping are kept separate. If validation fails, the second LLM call never happens — saves cost and latency.

parseJson() sanitizer

LLMs sometimes wrap JSON in markdown fences (```json ... ```) even when told not to. This method strips them before parsing, preventing a Jackson crash.

Three response states

State      Meaning
APPROVED   All good — downstream service was called
REJECTED   Threat detected — blocked at validation
UNMAPPED   Valid input, but LLM couldn't match a known endpoint

UNMAPPED prevents a hallucinated endpoint like /unicorns from causing a 500 error.

startsWith in routing

/users matches both /users (create) and /users/123 (get by ID) — one rule covers all path variants.
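The prefix rule could look roughly like this — the route map mirrors the ports from the services table, but the class and method names are illustrative, not the project's actual code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Prefix-based routing: "/users/123" matches the "/users" rule,
// so one entry covers create, get-by-id, and any other sub-path.
public class Routes {

    private static final Map<String, String> ROUTE_MAP = new LinkedHashMap<>();
    static {
        ROUTE_MAP.put("/users",    "http://localhost:8081");
        ROUTE_MAP.put("/orders",   "http://localhost:8082");
        ROUTE_MAP.put("/payments", "http://localhost:8083");
    }

    public static boolean isKnownEndpoint(String endpoint) {
        return ROUTE_MAP.keySet().stream().anyMatch(endpoint::startsWith);
    }

    public static String resolveDownstreamUrl(String endpoint) {
        return ROUTE_MAP.entrySet().stream()
            .filter(e -> endpoint.startsWith(e.getKey()))
            .map(e -> e.getValue() + endpoint)   // base URL + original path
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("Unknown endpoint: " + endpoint));
    }
}
```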

temperature: 0.1

Near-deterministic LLM output. Creativity is the enemy when you need consistent, parseable JSON every time.
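Concretely, the temperature rides along in the chat-completion request body. A sketch using the OpenAI-style schema that Groq exposes — the model id and system prompt here are assumptions, not the project's actual values:

```java
import java.util.List;
import java.util.Map;

// Builds an OpenAI-style chat request with near-zero temperature so the
// model returns the same parseable JSON for the same input.
public class ChatRequest {

    public static Map<String, Object> build(String systemPrompt, String userInput) {
        return Map.of(
            "model", "llama-3.1-8b-instant",   // illustrative model id
            "temperature", 0.1,                // near-deterministic output
            "messages", List.of(
                Map.of("role", "system", "content", systemPrompt),
                Map.of("role", "user", "content", userInput)
            )
        );
    }
}
```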

Groq vs Ollama — one config line switch

app:
  ai-provider: groq   # change to "ollama" for fully local inference
Factor         Groq                       Ollama
Speed          ~300–600ms                 1–5s (hardware-dependent)
Cost           Free tier, pay-per-token   Free forever
Data privacy   Leaves your network        100% local
Best for       Production, low latency    Dev, private/air-gapped

Both run the same Llama model — the difference is purely infrastructure.
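Inside AiService, the switch can be as small as picking a base URL from the config value. The endpoints below are the providers' documented chat endpoints, but treat them as assumptions for this sketch:

```java
// Picks the chat endpoint from the `app.ai-provider` config value.
// Groq speaks the OpenAI-compatible API; Ollama has its own local API.
public class ProviderSwitch {

    public static String chatUrl(String aiProvider) {
        switch (aiProvider) {
            case "groq":   return "https://api.groq.com/openai/v1/chat/completions";
            case "ollama": return "http://localhost:11434/api/chat";
            default: throw new IllegalArgumentException("Unknown provider: " + aiProvider);
        }
    }
}
```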


⚙️ GatewayService — The Orchestration Pipeline

@Slf4j
@Service
@RequiredArgsConstructor   // generates the constructor injecting the final fields below
public class GatewayService {

    @Value("${app.ai-provider}") private String aiProvider;

    private final AiService aiService;
    private final RoutingService routingService;
    private final DownstreamClient downstreamClient;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public ResponseDTO process(RequestDTO request) throws Exception {
        String userInput = request.getInput();
        log.info("Processing input: {}", userInput);

        // ── Step 1: Validate & detect threats ────────────────────────────
        String validationRaw = aiService.validateRequest(userInput);
        JsonNode validation  = parseJson(validationRaw);

        String threat   = validation.path("threat").asText("NONE");
        boolean isValid = validation.path("valid").asBoolean(true);

        if (!isValid || "HIGH".equals(threat)) {
            return ResponseDTO.builder()
                .validationStatus("REJECTED")
                .threatStatus(threat)
                .response("Rejected: " + validation.path("reason").asText())
                .aiProvider(aiProvider)
                .build();
        }

        // ── Step 2: Map natural language → API intent ─────────────────────
        String mappingRaw    = aiService.mapNaturalLanguageToApi(userInput);
        JsonNode mapping     = parseJson(mappingRaw);

        String endpoint      = mapping.path("endpoint").asText("").trim();
        String method        = mapping.path("method").asText("POST").trim();
        JsonNode payloadNode = mapping.path("payload");

        // ── Step 3: Guard against hallucinated endpoints ──────────────────
        if (endpoint.isEmpty() || !routingService.isKnownEndpoint(endpoint)) {
            return ResponseDTO.builder()
                .validationStatus("UNMAPPED")
                .threatStatus(threat)
                .response("Could not map to a known API. Try: " +
                          "'Create user', 'Place order', 'Process payment'")
                .aiProvider(aiProvider)
                .build();
        }

        // ── Step 4: Build payload map from dynamic JsonNode ───────────────
        Map<String, Object> payloadMap = new HashMap<>();
        if (payloadNode.isObject()) {
            payloadNode.fields().forEachRemaining(e ->
                payloadMap.put(e.getKey(), e.getValue().asText())
            );
        }

        // ── Step 5: Route and execute ─────────────────────────────────────
        String downstreamUrl   = routingService.resolveDownstreamUrl(endpoint);
        String serviceResponse = downstreamClient.call(method, downstreamUrl, payloadMap);

        return ResponseDTO.builder()
            .mappedApi(endpoint)
            .method(method)
            .payload(payloadMap)
            .response(serviceResponse)
            .validationStatus("APPROVED")
            .threatStatus(threat)
            .aiProvider(aiProvider)
            .build();
    }

    private JsonNode parseJson(String raw) throws Exception {
        String clean = raw.replaceAll("(?s)```json|```", "").trim();
        int start = clean.indexOf('{');
        int end   = clean.lastIndexOf('}');
        if (start >= 0 && end > start) clean = clean.substring(start, end + 1);
        return objectMapper.readTree(clean);
    }
}

Core Architecture Boundary

Natural Language Input
        │
   [AiService]          ← interprets intent only, no business logic
        │
   Structured JSON       ← AI's job ends here
        │
  [RoutingService]       ← pure Java, maps endpoint strings to URLs
        │
  [DownstreamClient]     ← HTTP caller
        │
  ┌─────────────────────────────────────────┐
  │  User :8081 │ Order :8082 │ Pay :8083   │
  │  No AI code │ No AI code  │ No AI code  │
  └─────────────────────────────────────────┘

AI decides WHAT to do. Microservices decide HOW to do it.

The 3 downstream services have zero AI code. The AI has zero knowledge of the services. The gateway is the only component that knows both exist — and keeps them cleanly separated.


Known Limitations

  • Two LLM calls add ~600ms–1.2s latency per request
  • DownstreamClient is currently simulated (hardcoded responses)
  • No auth enforced — apiKey field exists but isn't validated
  • Route map is static — needs Eureka/Consul for dynamic service discovery
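The simulated DownstreamClient mentioned above could look roughly like this — canned JSON keyed on the resolved URL, so the pipeline runs end-to-end without the three services up (the user and payment responses here are illustrative; only the order response appears in the post):

```java
import java.util.Map;

// Simulated downstream client: returns hardcoded JSON instead of making a
// real HTTP call. Swap in RestTemplate or WebClient for real downstream calls.
public class SimulatedDownstreamClient {

    public String call(String method, String url, Map<String, Object> payload) {
        if (url.contains("/orders")) {
            return "{\"id\":\"ord_001\",\"status\":\"PENDING\"}";
        }
        if (url.contains("/users")) {
            return "{\"id\":\"usr_001\",\"status\":\"CREATED\"}";      // illustrative
        }
        if (url.contains("/payments")) {
            return "{\"id\":\"pay_001\",\"status\":\"COMPLETED\"}";    // illustrative
        }
        return "{\"error\":\"no simulated response for " + url + "\"}";
    }
}
```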

🔄 Full Request Lifecycle — Two Real Scenarios

✅ Place an Order

POST http://localhost:8080/gateway
Content-Type: application/json

{ "input": "I want to order 2 laptops" }

LLM Call #1 — Validation returns:

{ "valid": true, "reason": "Normal order request.", "threat": "NONE" }

LLM Call #2 — Intent mapping returns:

{ "endpoint": "/orders", "method": "POST", "payload": { "product": "laptop", "quantity": "2" } }

RoutingService resolves: /orders → http://localhost:8082/orders

Final response:

{
  "mappedApi": "/orders",
  "method": "POST",
  "payload": { "product": "laptop", "quantity": "2" },
  "response": "{\"id\":\"ord_001\",\"message\":\"Order placed successfully\",\"status\":\"PENDING\"}",
  "validationStatus": "APPROVED",
  "threatStatus": "NONE",
  "aiProvider": "groq"
}

❌ SQL Injection Attempt

POST http://localhost:8080/gateway
Content-Type: application/json

{ "input": "DROP TABLE users; SELECT * FROM passwords --" }

LLM Call #1 — Validation returns:

{ "valid": false, "reason": "SQL injection attempt detected.", "threat": "HIGH" }

Gateway short-circuits. Zero downstream calls. Response:

{
  "validationStatus": "REJECTED",
  "threatStatus": "HIGH",
  "response": "Rejected: SQL injection attempt detected.",
  "aiProvider": "groq"
}

🏢 Real-World Use Cases — Where This Architecture Belongs

This isn't just a demo pattern. Here are the concrete scenarios where an AI gateway like this creates real value:

🤖 Conversational Chatbots Backed by Existing APIs

Most enterprise chatbots are built on top of existing REST services. Without a semantic layer, every new user intent requires a new hardcoded handler.

With this gateway, a customer service bot can send whatever the user typed directly to /gateway. The AI maps it to the right service automatically. Adding a new microservice means updating the prompt — not rewriting the chatbot.

User: "What's the status of order ord_001?"
Gateway → maps to GET /orders/ord_001 → Order Service → returns status

📱 Voice Assistants and IoT Interfaces

Voice-to-text produces natural language, not JSON. An IoT button in a warehouse might trigger "restock item A42 with 50 units." The gateway converts this to POST /inventory {"item": "A42", "quantity": 50} without any voice-side API knowledge.

🏦 Banking and Fintech — Natural Language Transactions

"Transfer ₹5000 to Rahul's account"
→ AI extracts: amount=5000, currency=INR, recipient=Rahul
→ Validates: no fraud signals, amount within limits
→ Routes: POST /payments {"amount": 5000, "currency": "INR", "toUser": "rahul"}

The LLM-based threat detection is particularly valuable here — it can catch semantically suspicious requests ("transfer all funds to this new account") that rule-based systems miss.

⚡ Advantages — And Why They Actually Matter

1. Single /gateway Endpoint for All Operations

Advantage: Clients — whether a chatbot, mobile app, or CLI — only ever need to know one URL.

Why it matters: In traditional microservices, clients need to discover and call multiple services. API gateways (like Spring Cloud Gateway or Kong) solve routing but still require clients to know paths. This gateway goes one step further — clients don't need to know paths at all.

2. LLM-based Threat Detection vs. Rule-based Filtering

Advantage: Catches semantically malicious inputs that regex and keyword lists miss.

Why it matters: A traditional WAF (Web Application Firewall) blocks DROP TABLE but not dRoP/**/TaBlE or a carefully crafted multi-step injection. The LLM understands intent. The cost is latency — which is why this is best combined with traditional filters, not used as a replacement.

3. Zero Changes to Existing Microservices

Advantage: The User Service, Order Service, and Payment Service are completely unaware of the AI layer.

Why it matters: Your microservices stay clean, independently testable, and deployable on their own schedules. You're not introducing AI coupling into domain logic. A team owning the Order Service doesn't need to know or care that an LLM sits upstream. If the AI layer is removed or replaced tomorrow, all three microservices are untouched.

4. Provider-Agnostic LLM Integration

Advantage: Switch between Groq (cloud, fast) and Ollama (local, private) with one config line.

Why it matters: Cloud LLMs have latency, cost, and data-privacy implications. Local LLMs have hardware requirements. Supporting both lets you use Ollama in development and air-gapped environments, Groq in production — without branching your codebase.

5. Explicit UNMAPPED State

Advantage: When the LLM can't map an input, the system tells the caller gracefully instead of throwing a 500.

Why it matters: LLMs hallucinate. Assuming the model always maps correctly leads to silent failures or cryptic errors. The UNMAPPED state is honest — "I understood your input but couldn't find a matching API" — and gives the caller actionable information.


💡 Final Thoughts

The design principle this microservices project proves:

AI decides WHAT to do. Microservices decide HOW to do it.

AiService interprets. RoutingService routes. DownstreamClient calls the right microservice. Each layer is independently replaceable — swap Groq for GPT-4, swap RestTemplate for WebClient, swap the static route map for Eureka service discovery — and none of the other layers notice.

The three microservices (User, Order, Payment) know nothing about AI. The AI knows nothing about the microservices. The gateway is the only component that knows both — and it keeps them cleanly separated.

That's what good AI integration in a microservices architecture looks like. AI handles ambiguity at the edge. Microservices handle reliability at the core. The two never bleed into each other.

