Part 5 of the "From Zero to AI Agent: My Journey into Java-based Intelligent Applications" series
We now have our MCPService connecting to real MCP tools (Post 3) and our LLM clients ready to provide intelligence (Post 4).
Today we'll build a complete query processing system that uses the LLM to analyze natural language, select the appropriate tool, and extract its parameters, all in one integrated approach.
The Three Types of Queries
Every user query falls into one of three categories:
- Direct Answer - "What is 2+2?" → Use LLM knowledge directly
- Single Tool - "What's the weather in Tokyo?" → Use one MCP tool
- Multi Tool - "Get weather and save to file" → Use multiple tools (we'll cover this conceptually)
The key breakthrough: instead of building complex parsing logic, we let the LLM do the heavy lifting of understanding intent.
Query Analysis Evolution
Let's start with QueryAnalysis record:
package com.gazapps.inference;

import java.util.Map;

record QueryAnalysis(
    ExecutionType execution,
    String details,
    Map<String, Object> parameters
) {
    enum ExecutionType { DIRECT_ANSWER, SINGLE_TOOL, MULTI_TOOL }
}
This record captures three key components: the execution type (indicating whether the query requires a direct answer, a single tool, or multiple tools), the details (specifying the tool name for SINGLE_TOOL or reasoning for other types), and the parameters (a map of key-value pairs for tool execution, if applicable). By consolidating these elements, the QueryAnalysis record provides a structured way to represent the result of query analysis.
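For instance, a weather query might yield an analysis like this (the values are hypothetical, and the record is reproduced standalone so the demo is self-contained):

```java
import java.util.Map;

public class QueryAnalysisExample {
    // Mirrors the QueryAnalysis record from the article
    record QueryAnalysis(ExecutionType execution, String details, Map<String, Object> parameters) {
        enum ExecutionType { DIRECT_ANSWER, SINGLE_TOOL, MULTI_TOOL }
    }

    public static void main(String[] args) {
        // A single-tool analysis: the LLM chose a weather tool and extracted its parameters
        QueryAnalysis analysis = new QueryAnalysis(
                QueryAnalysis.ExecutionType.SINGLE_TOOL,
                "weather-server:get_current_weather",
                Map.of("location", "Tokyo", "units", "metric"));

        System.out.println(analysis.execution());   // SINGLE_TOOL
        System.out.println(analysis.details());     // weather-server:get_current_weather
    }
}
```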
Tool Schema Enhancement
Our Tool class now includes the input schema:
package com.gazapps.mcp;

import java.util.Collections;
import java.util.Map;

public class Tool {
    private final String name;
    private final String description;
    private final String serverId;
    private final Map<String, Object> inputSchema; // New: parameter definitions from MCP

    public Tool(String name, String description, String serverId, Map<String, Object> inputSchema) {
        this.name = name;
        this.description = description;
        this.serverId = serverId;
        this.inputSchema = inputSchema != null ? inputSchema : Collections.emptyMap();
    }

    public String name() { return name; }
    public String description() { return description; }
    public String serverId() { return serverId; }
    public Map<String, Object> inputSchema() { return inputSchema; } // Schema-driven parameters
}
This schema comes directly from the MCP servers and tells us exactly what parameters each tool expects, their types, and which ones are required.
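Schema shapes vary by server, but a typical weather tool's JSON Schema, held as the Map our Tool class stores, might look like this (the field names and descriptions here are hypothetical):

```java
import java.util.List;
import java.util.Map;

public class SchemaExample {
    public static void main(String[] args) {
        // A hypothetical JSON Schema for a weather tool, in the Map form our Tool class stores.
        // Real MCP servers send this as JSON; exact shapes differ per server.
        Map<String, Object> inputSchema = Map.of(
                "type", "object",
                "properties", Map.of(
                        "location", Map.of("type", "string", "description", "City name"),
                        "units", Map.of("type", "string", "description", "metric or imperial")),
                "required", List.of("location"));

        @SuppressWarnings("unchecked")
        List<String> required = (List<String>) inputSchema.get("required");
        System.out.println("Required: " + required); // Required: [location]
    }
}
```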
Enhanced MCPService Integration
Our MCPService now loads these schemas when connecting to servers:
// In MCPService.loadServerTools()
private void loadServerTools(Server server, McpSyncClient client) {
    try {
        ListToolsResult toolsResult = client.listTools();
        for (io.modelcontextprotocol.spec.McpSchema.Tool mcpTool : toolsResult.tools()) {
            // Now we extract the input schema too
            Map<String, Object> inputSchema = mcpTool.inputSchema() != null ?
                    mcpTool.inputSchema() : Collections.emptyMap();
            Tool tool = new Tool(mcpTool.name(), mcpTool.description(),
                    server.id(), inputSchema);
            server.addTool(tool);
        }
    } catch (Exception e) {
        System.err.println("Error loading tools for " + server.id() + ": " + e.getMessage());
    }
}
This method connects to an MCP server via the McpSyncClient, retrieves the list of available tools, and extracts each tool’s name, description, and input schema. The schema is stored as a Map to support parameter extraction in the inference process.
SimpleInference Implementation
public class SimpleInference {
    private final MCPService mcpService;
    private final LLMClient llmClient;
    private final ObjectMapper objectMapper = new ObjectMapper();
    private String lastResult = "";

    public SimpleInference(MCPService mcpService, LLMClient llmClient) {
        this.mcpService = mcpService;
        this.llmClient = llmClient;
    }

    public String processQuery(String query) {
        try {
            // Single LLM call does analysis + parameter extraction
            QueryAnalysis analysis = analyzeQuery(query);
            String result = switch (analysis.execution()) {
                case DIRECT_ANSWER -> generateDirectResponse(query);
                case SINGLE_TOOL -> executeSingleTool(analysis, query);
                case MULTI_TOOL -> "Multi-tool queries coming in future posts!";
            };
            // Store context for "that"/"it" references
            if (analysis.execution() == QueryAnalysis.ExecutionType.SINGLE_TOOL) {
                lastResult = result;
            }
            return result;
        } catch (Exception e) {
            return "Error: " + e.getMessage();
        }
    }
}
This implementation orchestrates the inference process by analyzing the query, routing it to the appropriate execution path (DIRECT_ANSWER, SINGLE_TOOL, or MULTI_TOOL), and managing context for conversational continuity. The processQuery method uses a single LLM call to classify the query and extract parameters, leveraging the QueryAnalysis record to determine the execution type.
Schema-Driven Tool Formatting
private String formatToolsForPrompt(List<Tool> availableTools) {
    StringBuilder toolList = new StringBuilder();
    for (Tool tool : availableTools) {
        toolList.append("- ").append(tool.serverId()).append(":")
                .append(tool.name()).append(" - ")
                .append(tool.description())
                .append(" (Parameters: ")
                .append(formatParameters(tool.inputSchema())) // Use real schema!
                .append(")\n");
    }
    return toolList.toString();
}

private String formatParameters(Map<String, Object> inputSchema) {
    @SuppressWarnings("unchecked")
    Map<String, Object> properties = (Map<String, Object>) inputSchema.getOrDefault("properties", Collections.emptyMap());
    @SuppressWarnings("unchecked")
    List<String> required = (List<String>) inputSchema.getOrDefault("required", Collections.emptyList());
    if (properties.isEmpty()) {
        return "None";
    }
    List<String> paramDescriptions = new ArrayList<>();
    for (Map.Entry<String, Object> entry : properties.entrySet()) {
        String paramName = entry.getKey();
        @SuppressWarnings("unchecked")
        Map<String, Object> paramSchema = (Map<String, Object>) entry.getValue();
        String paramType = (String) paramSchema.getOrDefault("type", "unknown");
        String paramDesc = (String) paramSchema.getOrDefault("description", "");
        boolean isRequired = required.contains(paramName);
        StringBuilder desc = new StringBuilder(paramName + " (" + paramType);
        if (isRequired) {
            desc.append(", required");
        }
        if (!paramDesc.isEmpty()) {
            desc.append(", ").append(paramDesc);
        }
        desc.append(")");
        paramDescriptions.add(desc.toString());
    }
    return String.join(", ", paramDescriptions);
}
The formatToolsForPrompt and formatParameters methods generate a formatted string representation of the available tools and their input schemas for inclusion in the LLM prompt. formatToolsForPrompt iterates over the list of Tool objects, building one line per tool, and delegates to formatParameters to render the tool's input schema, which is expected to be a JSON-like Map containing properties and required fields.
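To see what this produces, here is the same formatParameters logic as a standalone sketch, run against a hypothetical weather schema (a LinkedHashMap keeps the demo's output order stable):

```java
import java.util.*;

public class FormatParametersDemo {
    // Same logic as the article's formatParameters, reproduced standalone
    static String formatParameters(Map<String, Object> inputSchema) {
        @SuppressWarnings("unchecked")
        Map<String, Object> properties =
                (Map<String, Object>) inputSchema.getOrDefault("properties", Collections.emptyMap());
        @SuppressWarnings("unchecked")
        List<String> required =
                (List<String>) inputSchema.getOrDefault("required", Collections.emptyList());
        if (properties.isEmpty()) return "None";

        List<String> descriptions = new ArrayList<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            @SuppressWarnings("unchecked")
            Map<String, Object> p = (Map<String, Object>) entry.getValue();
            StringBuilder d = new StringBuilder(entry.getKey() + " (" + p.getOrDefault("type", "unknown"));
            if (required.contains(entry.getKey())) d.append(", required");
            String desc = (String) p.getOrDefault("description", "");
            if (!desc.isEmpty()) d.append(", ").append(desc);
            descriptions.add(d.append(")").toString());
        }
        return String.join(", ", descriptions);
    }

    public static void main(String[] args) {
        // Hypothetical weather-tool schema; LinkedHashMap preserves insertion order
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("location", Map.of("type", "string", "description", "City name"));
        properties.put("units", Map.of("type", "string"));
        Map<String, Object> schema = Map.of("properties", properties, "required", List.of("location"));

        System.out.println(formatParameters(schema));
        // location (string, required, City name), units (string)
    }
}
```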
Prompt Engineering
private static final String ANALYSIS_PROMPT_TEMPLATE = """
Perform syntactic and semantic analysis on the following query to classify it, select
the appropriate tool (if needed), and extract all required parameters:
Query: "%s"
Available tools:
%s
Previous result: %s
Instructions:
1. Analyze the query's syntactic structure (e.g., identify key verbs, nouns, entities, and sentence patterns).
2. Determine the semantic intent (e.g., what the user wants to achieve, such as getting the time, day, weather, or performing a file operation).
3. Classify the query into one of:
- DIRECT_ANSWER: Use for informational, broad, or general knowledge queries that don’t require specific tool actions (e.g., asking about a city, concept, or fact).
- SINGLE_TOOL: if one tool is sufficient, specify the tool and extract its parameters.
- MULTI_TOOL: if multiple tools are required.
4. For SINGLE_TOOL:
- Select the most appropriate tool based on the intent and tool description.
- Extract ALL required parameters (e.g., location, date, path, filename).
- Handle special cases:
- If the query refers to "that" or "it", use the previous result as context.
- If a parameter is missing and cannot be inferred, use reasonable defaults.
5. Respond in this EXACT format:
For DIRECT_ANSWER:
DIRECT_ANSWER: [reason]
For SINGLE_TOOL:
SINGLE_TOOL: [server_id]:[tool_name]
REASONING: [why you chose this tool]
PARAMS: {"param1": "value1", "param2": "value2"}
For MULTI_TOOL:
MULTI_TOOL: multiple tools needed
6. Don't explain anything.
""";
This prompt instructs the LLM to perform syntactic and semantic analysis to classify queries, select tools, and extract parameters. It prioritizes DIRECT_ANSWER for informational queries, ensures tools are only used for explicit actions, and uses context from previous results only when relevant.
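Filling the template is a plain String.formatted call. Here is a sketch using an abbreviated version of the template and a hypothetical tool line:

```java
public class PromptDemo {
    // Abbreviated stand-in for the article's ANALYSIS_PROMPT_TEMPLATE
    private static final String TEMPLATE = """
            Query: "%s"
            Available tools:
            %s
            Previous result: %s
            """;

    public static void main(String[] args) {
        // Hypothetical tool line, as produced by formatToolsForPrompt
        String tools = "- weather-server:get_current_weather - Get current weather (Parameters: location (string, required))";
        String prompt = TEMPLATE.formatted("What's the weather in Tokyo?", tools, "none");
        System.out.println(prompt);
    }
}
```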
Response Parsing
Our parsing handles the structured LLM response:
private QueryAnalysis parseAnalysis(String response) {
    String cleaned = response.trim();
    String[] lines = cleaned.split("\n");

    if (cleaned.startsWith("DIRECT_ANSWER:")) {
        String reasoning = cleaned.substring("DIRECT_ANSWER:".length()).trim();
        return new QueryAnalysis(QueryAnalysis.ExecutionType.DIRECT_ANSWER, reasoning, null);
    }

    if (cleaned.startsWith("SINGLE_TOOL:")) {
        String toolLine = lines[0].substring("SINGLE_TOOL:".length()).trim();
        String[] toolParts = toolLine.split(":");
        if (toolParts.length != 2) {
            return new QueryAnalysis(QueryAnalysis.ExecutionType.DIRECT_ANSWER, "Invalid tool format", null);
        }
        String serverId = toolParts[0];
        String toolName = toolParts[1];
        Map<String, Object> parameters = new HashMap<>();
        // Extract parameters from PARAMS line
        for (String line : lines) {
            if (line.startsWith("PARAMS:")) {
                String paramLine = line.substring("PARAMS:".length()).trim();
                if (!paramLine.equals("{}")) {
                    try {
                        parameters = objectMapper.readValue(paramLine, Map.class);
                    } catch (Exception e) {
                        System.err.println("Failed to parse parameters: " + e.getMessage());
                    }
                }
                break;
            }
        }
        return new QueryAnalysis(QueryAnalysis.ExecutionType.SINGLE_TOOL, serverId + ":" + toolName, parameters);
    }

    // Fallback to direct answer
    return new QueryAnalysis(QueryAnalysis.ExecutionType.DIRECT_ANSWER, "Could not parse", null);
}
This method parses the LLM’s response to extract the execution type, details (tool name or reasoning), and parameters, producing a QueryAnalysis record.
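The Jackson call aside, the parsing is plain string work. Here is a dependency-free sketch of the same flow, with a naive regex standing in for objectMapper.readValue (it handles only flat, string-valued PARAMS objects, so it is a demo aid rather than a real JSON parser):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParseSketch {
    // Naive stand-in for objectMapper.readValue: flat {"key": "value"} objects only
    static Map<String, Object> parseParams(String json) {
        Map<String, Object> params = new LinkedHashMap<>();
        Matcher m = Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"([^\"]*)\"").matcher(json);
        while (m.find()) {
            params.put(m.group(1), m.group(2));
        }
        return params;
    }

    public static void main(String[] args) {
        // A hypothetical LLM response in the format the prompt requests
        String response = """
                SINGLE_TOOL: weather-server:get_current_weather
                REASONING: User wants current weather for a city
                PARAMS: {"location": "Tokyo", "units": "metric"}
                """;
        String[] lines = response.trim().split("\n");
        String[] toolParts = lines[0].substring("SINGLE_TOOL:".length()).trim().split(":", 2);
        Map<String, Object> params = Map.of();
        for (String line : lines) {
            if (line.startsWith("PARAMS:")) {
                params = parseParams(line.substring("PARAMS:".length()).trim());
            }
        }
        System.out.println(toolParts[0] + " -> " + toolParts[1]); // weather-server -> get_current_weather
        System.out.println(params); // {location=Tokyo, units=metric}
    }
}
```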
Complete Integration Example
Here's how everything works together for a real query:
Input: "What's the weather in Tokyo?"
Step 1 - LLM receives prompt with schema:
Available tools:
- weather-server:get_current_weather - Get current weather (Parameters: location (string, required), units (string, optional))
Step 2 - LLM analyzes and responds:
SINGLE_TOOL: weather-server:get_current_weather
REASONING: User wants the current weather for a specific location
PARAMS: {"location": "Tokyo", "units": "metric"}
Step 3 - Execute tool:
ToolResult result = mcpService.callTool("weather-server", "get_current_weather",
        Map.of("location", "Tokyo", "units", "metric"));
Step 4 - Generate natural response:
String response = generateToolResponse(originalQuery, "get_current_weather", result.content());
// Returns: "The current weather in Tokyo is 22°C with partly cloudy skies."
Structured Prompt Templates
We organize prompts in a clean template system:
private static final class PromptTemplates {
    // Analysis prompt for query classification and parameter extraction
    public static String getAnalysisPrompt(String query, String toolList) {
        return ANALYSIS_PROMPT_TEMPLATE.formatted(query, toolList);
    }

    // Direct answer prompt for knowledge-based queries
    public static String getDirectAnswerPrompt(String query) {
        return DIRECT_ANSWER_PROMPT_TEMPLATE.formatted(query);
    }

    // Tool response prompt for natural conversation
    public static String getToolResponsePrompt(String query, String toolName, String toolResult) {
        return TOOL_RESPONSE_PROMPT_TEMPLATE.formatted(query, toolName, toolResult);
    }

    // Fallback prompt when tools fail
    public static String getFallbackPrompt(String query) {
        return FALLBACK_PROMPT_TEMPLATE.formatted(query);
    }
}
This keeps prompts organized, reusable, and easy to modify.
Multi-Tool Execution (Conceptual Overview)
Queries that require multiple tools can be approached in various ways, depending on the nature of the task. Below are some theoretical patterns for handling such queries:
1. Sequential Dependent Tasks
Execute tools in a specific order, where the output of one tool serves as input for the next.
Example: "Get the weather in Tokyo and save it to a file" requires first querying the weather and then saving the result.
2. Parallel Tasks with Summarization
Execute multiple tools simultaneously and combine their results into a consolidated response.
Example: "Check the weather in Tokyo, Osaka, and Kyoto and summarize the conditions" involves parallel calls and a final analysis.
3. Conditionally Dependent Tasks
Select the next tool based on the result of a previous tool.
Example: "If it's raining in Tokyo, get the extended forecast; otherwise, suggest outdoor activities."
4. Chained Tasks with Transformation
Each tool transforms the data in a specific way, forming a pipeline.
Example: "Extract text from a PDF, translate it to Spanish, and send it via email."
5. Parallel Tasks with Competitive Aggregation
Execute multiple tools in parallel and select the best result based on criteria like cost or quality.
Example: "Check the price of a product on three websites and recommend the cheapest."
6. Iterative Tasks with Feedback
Execute tools repeatedly, refining results based on feedback.
Example: "Write a draft, check its grammar, and adjust until it's perfect."
These patterns demonstrate the flexibility of multi-tool usage. Implementation involves challenges such as query decomposition, state management between tools, and result consolidation, which we'll cover in future posts.
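As a taste of pattern 1, here is a minimal sketch of sequential dependent execution, with stubbed functions standing in for real MCP tool calls (the tool names and outputs are hypothetical):

```java
import java.util.function.Function;

public class SequentialSketch {
    // Stand-in for an MCP tool call: each "tool" is just a function here
    record ToolStep(String name, Function<String, String> call) {}

    public static void main(String[] args) {
        // Hypothetical two-step chain: fetch the weather, then "save" it to a file
        ToolStep getWeather = new ToolStep("get_current_weather",
                location -> "22°C, partly cloudy in " + location);
        ToolStep saveToFile = new ToolStep("write_file",
                content -> "Saved: " + content);

        // Sequential dependent execution: the output of step 1 feeds step 2
        String weather = getWeather.call().apply("Tokyo");
        String result = saveToFile.call().apply(weather);

        System.out.println(result); // Saved: 22°C, partly cloudy in Tokyo
    }
}
```

In a real implementation, each ToolStep would wrap an mcpService.callTool invocation, and the LLM would plan the step order and wire outputs to inputs.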
Complete Usage Example
public class InferenceDemo {
    public static void main(String[] args) {
        MCPService mcpService = new MCPService(); // Connects to real MCP servers
        LLMClient llmClient = LLMClientFactory.createGroqClient(System.getenv("GROQ_API_KEY"));
        SimpleInference inference = new SimpleInference(mcpService, llmClient);

        // Test schema-driven parameter extraction
        System.out.println("=== Schema-Driven Query Processing Demo ===\n");
        testQuery(inference, "What is 2+2?");
        testQuery(inference, "What's the weather in NYC?");
        testQuery(inference, "What's the day today in nyc?");
        testQuery(inference, "tell me more about nyc");

        mcpService.close();
    }

    private static void testQuery(SimpleInference inference, String query) {
        System.out.println("Query: " + query);
        String response = inference.processQuery(query);
        System.out.println("Response: " + response);
        System.out.println();
    }
}
Sample Output:
=== Schema-Driven Query Processing Demo ===
Query: What is 2+2?
Response: 2+2 equals 4. This is a basic addition operation.
Query: What's the weather in NYC?
Response: The current weather in NYC is 22°C with partly cloudy skies and light winds.
Query: What's the day today in nyc?
Response: It's Friday, September 19th in New York City.
Query: tell me more about nyc
Response: The city that never sleeps! New York City (NYC) is a global hub for culture, entertainment, finance, and innovation. Located in the state of New York, NYC is the most populous city in the United States, with over 8.4 million people calling it home.
What We've Built
The SimpleInference class processes user queries end to end. It:
- Analyzes query intent through LLM reasoning
- Extracts parameters using schema definitions from MCP servers
- Combines analysis and extraction in a single LLM call
- Employs structured prompt templates
- Handles errors with fallback mechanisms
What's Next?
In our next post, we'll build the conversational interface that ties everything together - a chat application where users can naturally interact with all our MCP tools through the intelligent query processing we built today.
The complete source code is available in our GitHub repository. Our query processing system now handles a wide range of natural-language queries while remaining simple and maintainable!
This is part 5 of "From Zero to AI Agent: My Journey into Java-based Intelligent Applications" series. Next up: building the conversational chat interface!