vishalmysore

Posted on Jan 27

Agentic AI and Human-in-the-Loop: A Practical Implementation with Java

#agents #ai #java #tutorial

Human-in-the-Loop (HITL) is a critical concept in AI systems where human oversight is integrated into automated processes to ensure accuracy, safety, and ethical decision-making. HITL here acts as a governance layer, not a UI interaction, providing a structured mechanism for human validation in AI workflows. This article explores HITL through practical Java examples, distinguishing between synchronous and asynchronous patterns for different use cases.

https://github.com/vishalmysore/agenticjava/tree/main/src/main/java/io/github/vishalmysore/humaninloop

HITL Patterns: Synchronous vs Asynchronous

HITL can be implemented in two primary patterns:

Synchronous HITL: Human intervention occurs inline with the AI processing flow, which blocks until a decision is made. This suits interactive systems and safety-critical operations where the action must not proceed until a reviewer has signed off.
Asynchronous HITL: Human review happens decoupled from the main AI flow, often using messaging systems like Kafka. This allows for scalability and non-blocking operations, ideal for batch processing or distributed environments where human response times can vary.

1️⃣ Conceptual Flow (Big Picture)

Think of the system as a governed execution pipeline:

User NLP
↓
Intent Detection
↓
Action Resolution
↓
Human-in-the-Loop Gate
↓
AI Execution
↓
Validated Response

Key idea:
👉 The human gate sits between intent resolution and AI execution, not after the fact.
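The placement of the gate can be illustrated with a minimal sketch in plain Java. The names here (`ApprovalGate`, `GovernedPipeline`, `resolveAction`) are hypothetical and not part of T4A, and the intent resolution is a hard-coded stand-in for what an LLM-backed resolver would do.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical gate: returns true to let the resolved action run.
interface ApprovalGate {
    boolean approve(String prompt, String actionName, Map<String, Object> params);
}

class GovernedPipeline {
    // Stand-in registry of actions; a real framework would discover these.
    private static final Map<String, Function<Map<String, Object>, String>> ACTIONS =
            Map.of("getFoodPreference", params -> "Paneer Butter Masala");

    // Stand-in for intent detection and action resolution (hard-coded for the demo).
    static String resolveAction(String prompt) {
        return prompt.toLowerCase().contains("eat") ? "getFoodPreference" : "unknown";
    }

    // The gate is consulted after resolution and before execution.
    static Optional<String> run(String prompt, ApprovalGate gate) {
        String actionName = resolveAction(prompt);
        Function<Map<String, Object>, String> action = ACTIONS.get(actionName);
        if (action == null) {
            return Optional.empty();            // nothing resolved, nothing to approve
        }
        Map<String, Object> params = Map.of("name", "Vishal");
        if (!gate.approve(prompt, actionName, params)) {
            return Optional.empty();            // rejected: the action never runs
        }
        return Optional.of(action.apply(params));
    }
}
```

Because the gate runs before `action.apply`, a rejection means the business logic is never executed, rather than its result being filtered afterwards.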

2️⃣ Runtime Execution Flow (What Actually Happens)

Let’s trace this using the query from the FoodChoiceExample shown later in this article:

User input:
"what does Vishal like to eat"

Step 1: NLP → Intent Understanding
Input: "what does Vishal like to eat"

The AIProcessor does:

Tokenization

Semantic intent detection

Entity extraction (Vishal)

Maps the query to a candidate action

Result:

Intent: QUERY_PREFERENCE
Target: FoodChoiceAction

🧠 At this stage, no business logic has run yet.

Step 2: Intent → Action Mapping

Using T4A annotations:

@Action(description = "Returns food preferences for a person")
public String getFoodPreference(String name) {
    return "Paneer Butter Masala";
}

The framework:

Resolves method signature

Builds prompt context

Prepares invocation metadata

Internally:

Action selected: getFoodPreference
Parameters: { name = "Vishal" }
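How an annotated method might be discovered and invoked can be sketched with plain reflection. This is not how T4A is implemented internally; `DemoAction`, `FoodActions`, and `ActionResolver` are hypothetical names used only to show the general technique of building an action catalog from annotations and invoking the selected method by name.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an action annotation; not the T4A @Action.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DemoAction {
    String description();
}

class FoodActions {
    @DemoAction(description = "Returns food preferences for a person")
    public String getFoodPreference(String name) {
        return "Paneer Butter Masala";
    }
}

class ActionResolver {
    // Collects annotated methods so a selector (an LLM in practice) can choose among them.
    static Map<String, String> describeActions(Class<?> type) {
        Map<String, String> catalog = new LinkedHashMap<>();
        for (Method method : type.getDeclaredMethods()) {
            DemoAction action = method.getAnnotation(DemoAction.class);
            if (action != null) {
                catalog.put(method.getName(), action.description());
            }
        }
        return catalog;
    }

    // Invokes the selected single-String-argument action by name.
    static Object invoke(Object target, String methodName, String argument) {
        try {
            Method method = target.getClass().getDeclaredMethod(methodName, String.class);
            method.setAccessible(true);
            return method.invoke(target, argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not invoke " + methodName, e);
        }
    }
}
```

The description text is what a model would see when choosing an action, which is why it should say what the method does rather than how.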

Step 3: Action → Human-in-the-Loop Intercept 🔥

Before invoking the action, T4A checks:

if (humanInLoop != null) {
    FeedbackLoop feedback = humanInLoop.allow(
        promptText,
        methodName,
        params
    );
}

This is the critical governance moment.

What the HumanInLoop sees
Prompt: "what does Vishal like to eat"
Method: getFoodPreference
Params: { name = "Vishal" }

Now the human can:

✅ Approve

❌ Reject

✏️ Modify

⏸ Defer (Kafka case)

Step 4: FeedbackLoop Decision
public boolean isAIResponseValid() {
    return true;
}

Possible outcomes:

Decision	Result
true	AI execution continues
false	AI response blocked
async	Wait / fallback policy

This is policy enforcement, not logging.
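The three outcomes in the table can be made explicit in code. The sketch below is an assumption layered on top of the article: the `FeedbackLoop` shown here exposes only a boolean, so `Decision`, `PendingPolicy`, and `DecisionPolicy` are hypothetical types that show one way to represent the "async" row and its fallback.

```java
// Hypothetical three-state decision; the FeedbackLoop in this article exposes only a boolean.
enum Decision { APPROVED, REJECTED, PENDING }

// What to do when the reviewer has not answered yet.
enum PendingPolicy { FAIL_CLOSED, FAIL_OPEN }

class DecisionPolicy {
    // Returns true only when the action is allowed to execute.
    static boolean mayExecute(Decision decision, PendingPolicy pendingPolicy) {
        switch (decision) {
            case APPROVED:
                return true;
            case REJECTED:
                return false;
            default: // PENDING: apply the configured fallback
                return pendingPolicy == PendingPolicy.FAIL_OPEN;
        }
    }
}
```

Making the pending case a named policy keeps the choice between failing closed and failing open visible in configuration instead of buried in a default value.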

Step 5: AI Action Execution

Only after approval:

String result = getFoodPreference("Vishal");

Result:

"Paneer Butter Masala"

Step 6: Response Returned to User
Final Output → Paneer Butter Masala

This output is now:

Human-approved

Policy-compliant

Safe to return

3️⃣ Sequence Diagram (Mental Model)

Here’s the full lifecycle in a clean sequence:

User
↓
AIProcessor
↓
NLP Engine
↓
Intent Resolver
↓
Action Selector
↓
HumanInLoop.allow()
↓
FeedbackLoop.isAIResponseValid()
↓
Action Execution
↓
Response
↓
User

Or visually:

"what does Vishal like to eat"
↓
Intent detected
↓
Action resolved
↓
🛑 HUMAN CHECKPOINT
↓
Approved? → Yes
↓
Execute action
↓
Return response

4️⃣ Kafka Version (Async HITL Flow)

Now the fancy distributed version:

User Query
↓
Intent → Action
↓
AI Suggestion
↓
Kafka Topic (ai-suggestions)
↓
Human Reviewer (async)
↓
Kafka Topic (human-decisions)
↓
Decision Cache
↓
AI Continues / Stops

Key distinction:

AI does NOT block a human

Human does NOT block AI

Kafka coordinates reality 😄

Thread Blocking Nuance: While the example uses a local cache for simplicity, in a real asynchronous implementation, the main AI thread would handle the 'Wait' state using mechanisms like CompletableFuture or a State Machine to avoid idling and consuming resources while awaiting human decisions.
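For the state machine option mentioned above, a minimal sketch might look like the following. The state names and the `ReviewStateMachine` class are hypothetical; the point is only that each pending action carries an explicit state and that illegal jumps (for example, executing a rejected action) are refused.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum ReviewState { PENDING_REVIEW, APPROVED, REJECTED, TIMED_OUT, EXECUTED }

class ReviewStateMachine {
    // Allowed transitions; terminal states map to an empty set.
    private static final Map<ReviewState, Set<ReviewState>> ALLOWED = Map.of(
            ReviewState.PENDING_REVIEW,
            EnumSet.of(ReviewState.APPROVED, ReviewState.REJECTED, ReviewState.TIMED_OUT),
            ReviewState.APPROVED, EnumSet.of(ReviewState.EXECUTED),
            ReviewState.REJECTED, EnumSet.noneOf(ReviewState.class),
            ReviewState.TIMED_OUT, EnumSet.noneOf(ReviewState.class),
            ReviewState.EXECUTED, EnumSet.noneOf(ReviewState.class));

    private ReviewState state = ReviewState.PENDING_REVIEW;

    synchronized ReviewState state() {
        return state;
    }

    // Moves to the next state or throws if the transition is not permitted.
    synchronized void transitionTo(ReviewState next) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }
}
```

Persisting this state (for example in a database row keyed by request ID) would let the wait survive a process restart, which an in-memory future cannot do.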

5️⃣ Why This Design Is Powerful

What is shown here is not “human feedback”.

It’s:

🔐 Governed AI execution

🧭 Policy checkpoints

🧑‍⚖️ Human authority over autonomous agents

📈 Scalable to multi-agent systems

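🛡️ Prompt injection mitigation (requiring human approval before execution is one of the more reliable ways to stop an AI from acting on a malicious "ignore previous instructions" command, provided the reviewer actually inspects the method and parameters)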

This exact flow generalizes to:

Tool invocation approval

Agent-to-agent trust boundaries

Regulated AI (finance, health, law)

Explainable AI pipelines

The following examples demonstrate both patterns.

The FoodChoiceExample Class

The FoodChoiceExample class showcases HITL in action. It uses the T4A (Tools for AI) framework to process AI queries while incorporating human feedback loops.

Code Overview

package io.github.vishalmysore.humaninloop;

import com.t4a.detect.FeedbackLoop;
import com.t4a.detect.HumanInLoop;
import com.t4a.predict.PredictionLoader;
import com.t4a.processor.AIProcessingException;
import com.t4a.processor.AIProcessor;
import lombok.extern.java.Log;

import java.util.Map;

/**
 * This class demonstrates the concept of "Human-in-the-Loop" in AI processing.
 * Human-in-the-Loop refers to a system where human intervention is integrated into
 * automated processes to validate, guide, or override AI decisions. In this example,
 * the HumanInLoop interface is implemented to allow human oversight during AI query
 * processing. The 'allow' method is invoked before the AI generates a response,
 * providing an opportunity for human feedback or validation. Here, it logs the
 * invocation details and automatically approves the AI response by returning a
 * FeedbackLoop that validates the response as true.
 *
 * Real-life use cases:
 * - In a personalized meal planning application, AI might suggest food choices based
 *   on user profiles and historical data. However, human-in-the-loop can be crucial
 *   for handling nuanced factors like sudden dietary changes, allergies, cultural
 *   preferences, or emotional states that AI cannot fully predict. For instance,
 *   if the AI suggests a dish containing nuts, the human-in-the-loop mechanism can
 *   prompt for confirmation or override the suggestion to ensure user safety and satisfaction.
 * - In medical diagnostics, AI could analyze patient data to suggest diagnoses or
 *   treatment plans. Human-in-the-loop ensures that qualified medical professionals
 *   review and validate these suggestions, especially in critical cases where AI
 *   might miss subtle symptoms or context-specific factors, preventing potential
 *   misdiagnoses and ensuring patient safety.
 * - In financial investment systems, AI might recommend trades or portfolio adjustments
 *   based on market data and algorithms. Human-in-the-loop allows financial advisors
 *   to review and approve these recommendations, considering broader economic factors,
 *   risk tolerance, and regulatory compliance that AI might not fully account for,
 *   thereby reducing the risk of significant financial losses.
 */
@Log
public class FoodChoiceExample {
    public static void main(String[] args) {
        System.setProperty("tools4ai.properties.path", "io/github/vishalmysore/humaninloop/tools4ai.properties");
        AIProcessor processor = PredictionLoader.getInstance().createOrGetAIProcessor();
        log.info("AIProcessor initialized: " + (processor != null));
        try {
            log.info(processor.query("Hello, world!"));
            Object obj = processor.processSingleAction("what does Vishal like to eat", new HumanInLoop() {
                @Override
                public FeedbackLoop allow(String promptText, String methodName, Map<String, Object> params) {
                    log.info("HumanInLoop invoked for method: " + methodName + " with prompt: " + promptText + " and params: " + params);
                    return new FeedbackLoop() {
                        @Override
                        public boolean isAIResponseValid() {
                            return true;
                        }
                    };
                }

                @Override
                public FeedbackLoop allow(String promptText, String methodName, String params) {
                    log.info("HumanInLoop invoked for method: " + methodName + " with prompt: " + promptText + " and params: " + params);
                    return new FeedbackLoop() {
                        @Override
                        public boolean isAIResponseValid() {
                            return true;
                        }
                    };
                }
            });
            log.info("Response: " + obj);
        } catch (AIProcessingException e) {
            throw new RuntimeException(e);
        }
    }
}

Key Components

HumanInLoop Interface: This interface defines methods for human intervention. The allow methods are called before AI responses are finalized, allowing for logging, validation, or modification.
FeedbackLoop: Returned by the allow methods, this class determines if the AI response is valid. In this example, it always returns true, but in real implementations, it could prompt for human input.
AIProcessor: Handles the AI query processing. The processSingleAction method integrates the HITL mechanism.
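To replace the automatic approval with a real prompt, the decision could come from a console question. The sketch below is standalone and deliberately does not implement the T4A interfaces; `ConsoleApproval` is a hypothetical helper whose result could be returned from `isAIResponseValid()`. The reader and output stream are passed in so the logic can be exercised without a terminal.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ConsoleApproval {
    // Asks the reviewer and returns true only for an explicit "y" or "yes".
    static boolean ask(String promptText, String methodName, BufferedReader in, PrintStream out) {
        out.println("Prompt: " + promptText);
        out.println("Method: " + methodName);
        out.print("Approve? (y/n): ");
        out.flush();
        try {
            String answer = in.readLine();
            if (answer == null) {
                return false;                   // end of input counts as a rejection
            }
            answer = answer.trim();
            return answer.equalsIgnoreCase("y") || answer.equalsIgnoreCase("yes");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience for demos and tests: feeds a canned answer instead of System.in.
    static boolean askWithCannedAnswer(String promptText, String methodName, String answer) {
        return ask(promptText, methodName, new BufferedReader(new StringReader(answer)), System.out);
    }
}
```

Anything other than an explicit yes is treated as a rejection, so an empty line or a closed input stream never approves an action by accident.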

How It Works

The program initializes an AIProcessor using the T4A framework.
It processes a query: "what does Vishal like to eat".
Before generating the response, the HumanInLoop.allow method is invoked, logging details.
The FeedbackLoop validates the response (here, automatically approving it).
The final response is logged: "Paneer Butter Masala".

Real-Life Applications

HITL is essential in domains where AI suggestions need human validation:

Meal Planning: Ensures dietary safety and personal preferences.
Medical Diagnostics: Prevents misdiagnoses by requiring professional review.
Financial Investments: Mitigates risks through advisor oversight.

Fancy Implementation with Apache Kafka

For a more scalable and asynchronous HITL system, we can use Apache Kafka to decouple AI processing from human review. Kafka acts as a message broker, allowing AI components to send suggestions to a queue, where human reviewers can consume, review, and respond asynchronously. This is ideal for distributed systems or web applications.

Prerequisites

Apache Kafka installed and running (e.g., via Docker or local setup).
Kafka Java client libraries (e.g., via Maven: org.apache.kafka:kafka-clients).

Architecture Overview

AI Producer: Sends suggestions to a Kafka topic.
Human Consumer/Reviewer: Consumes messages, reviews, and sends back decisions.
Decision Consumer: Processes the final decisions.

Below is a sample Java implementation using Kafka that integrates with the HumanInLoop interface.

import com.t4a.detect.FeedbackLoop;
import com.t4a.detect.HumanInLoop;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import java.util.Scanner;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Fancy Human-in-the-Loop implementation using Kafka for asynchronous messaging,
 * implementing the HumanInLoop interface.
 */
public class KafkaHumanInLoop implements HumanInLoop {

    private static final String BOOTSTRAP_SERVERS = "localhost:9092";
    private static final String AI_TOPIC = "ai-suggestions";
    private static final String HUMAN_TOPIC = "human-decisions";

    private KafkaProducer<String, String> producer;
    private Map<String, Boolean> decisionCache = new ConcurrentHashMap<>();

    public KafkaHumanInLoop() {
        Properties props = new Properties();
        props.put("bootstrap.servers", BOOTSTRAP_SERVERS);
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        this.producer = new KafkaProducer<>(props);

        // Start a background thread to consume human decisions
        startDecisionConsumer();
    }

    @Override
    public FeedbackLoop allow(String promptText, String methodName, Map<String, Object> params) {
        String key = methodName + "-" + System.currentTimeMillis();
        producer.send(new ProducerRecord<>(AI_TOPIC, key, promptText + " | Params: " + params));
        System.out.println("AI suggestion sent to Kafka for review: " + promptText);

        // In a real implementation, you might wait for the decision with a timeout or use callbacks.
        // For simplicity, assume approval after sending; in production, implement fallback policies.
        return new FeedbackLoop() {
            @Override
            public boolean isAIResponseValid() {
                // Check cache or poll; here we assume true for demo
                return decisionCache.getOrDefault(key, true);
            }
        };
    }

    @Override
    public FeedbackLoop allow(String promptText, String methodName, String params) {
        String key = methodName + "-" + System.currentTimeMillis();
        producer.send(new ProducerRecord<>(AI_TOPIC, key, promptText + " | Params: " + params));
        System.out.println("AI suggestion sent to Kafka for review: " + promptText);

        return new FeedbackLoop() {
            @Override
            public boolean isAIResponseValid() {
                return decisionCache.getOrDefault(key, true);
            }
        };
    }

    private void startDecisionConsumer() {
        new Thread(() -> {
            Properties props = new Properties();
            props.put("bootstrap.servers", BOOTSTRAP_SERVERS);
            props.put("group.id", "decision-consumer");
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            props.put("auto.offset.reset", "earliest");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Collections.singletonList(HUMAN_TOPIC));

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    // Assume "approved" or "rejected" as value
                    boolean approved = "approved".equalsIgnoreCase(record.value());
                    decisionCache.put(record.key(), approved);
                }
            }
        }).start();
    }

    // Separate method to simulate human reviewer (run in another process/thread)
    public static void simulateHumanReviewer() {
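        // Consumer configuration: reads AI suggestions awaiting review
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", BOOTSTRAP_SERVERS);
        consumerProps.put("group.id", "human-reviewer");
        consumerProps.put("key.deserializer", StringDeserializer.class.getName());
        consumerProps.put("value.deserializer", StringDeserializer.class.getName());
        consumerProps.put("auto.offset.reset", "earliest");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
        consumer.subscribe(Collections.singletonList(AI_TOPIC));

        // Producer configuration: a producer needs serializers, not the consumer's deserializers
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", BOOTSTRAP_SERVERS);
        producerProps.put("key.serializer", StringSerializer.class.getName());
        producerProps.put("value.serializer", StringSerializer.class.getName());

        Scanner scanner = new Scanner(System.in);
        KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps);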

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("AI Suggestion: " + record.value());
                System.out.println("Approve? (y/n): ");
                String response = scanner.nextLine();
                String decision = response.equalsIgnoreCase("y") ? "approved" : "rejected";
                producer.send(new ProducerRecord<>(HUMAN_TOPIC, record.key(), decision));
                System.out.println("Decision sent: " + decision);
            }
        }
    }
}

Evolved Asynchronous Implementation

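To achieve true non-blocking behavior and properly manage the "Wait" state, we can evolve the implementation using CompletableFuture. Instead of parking a thread, the caller receives a future and attaches the follow-up work to it, so no thread sits idle while the decision is pending. The future also gives a natural place to handle timeouts and to tie the wait to the lifecycle of an HTTP request.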

// Add imports
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Add to the class
private Map<String, CompletableFuture<Boolean>> pendingRequests = new ConcurrentHashMap<>();

// New async method
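public CompletableFuture<Boolean> allowAsync(String prompt, String method) {
    String correlationId = UUID.randomUUID().toString();

    // Register the future before sending so a fast reply cannot arrive unmatched
    CompletableFuture<Boolean> decisionFuture = new CompletableFuture<>();
    pendingRequests.put(correlationId, decisionFuture);
    producer.send(new ProducerRecord<>(AI_TOPIC, correlationId, prompt));

    return decisionFuture
            .orTimeout(5, TimeUnit.MINUTES)   // requires Java 9 or later
            .exceptionally(ex -> false)       // default to rejection on timeout or failure
            .whenComplete((approved, ex) -> pendingRequests.remove(correlationId)); // drop timed-out entries
}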

// Update startDecisionConsumer to complete futures
private void startDecisionConsumer() {
    new Thread(() -> {
        Properties props = new Properties();
        props.put("bootstrap.servers", BOOTSTRAP_SERVERS);
        props.put("group.id", "decision-consumer");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("auto.offset.reset", "earliest");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList(HUMAN_TOPIC));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
            for (ConsumerRecord<String, String> record : records) {
                boolean approved = "approved".equalsIgnoreCase(record.value());
                CompletableFuture<Boolean> future = pendingRequests.remove(record.key());
                if (future != null && !future.isDone()) {
                    future.complete(approved);
                }
            }
        }
    }).start();
}

This ensures the AI task is suspended via the future and resumed when the Kafka decision arrives, preventing indefinite thread blocking.
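The same suspend-and-resume idea can be tried without a running Kafka broker. The sketch below is a hypothetical in-memory stand-in (`InMemoryApprovalBroker` is not part of the article's repository): `request` plays the role of publishing a suggestion, and `decide` plays the role of the decision consumer completing the future.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

class InMemoryApprovalBroker {
    private final Map<String, CompletableFuture<Boolean>> pending = new ConcurrentHashMap<>();

    // Registers a pending review; resolves to false if no decision arrives in time.
    CompletableFuture<Boolean> request(String correlationId, long timeout, TimeUnit unit) {
        CompletableFuture<Boolean> decision = new CompletableFuture<>();
        pending.put(correlationId, decision);
        return decision
                .orTimeout(timeout, unit)
                .exceptionally(ex -> false)
                .whenComplete((approved, ex) -> pending.remove(correlationId));
    }

    // Called by whatever receives the reviewer's answer (a Kafka consumer in the article).
    // Returns false if the request is unknown or already settled.
    boolean decide(String correlationId, boolean approved) {
        CompletableFuture<Boolean> decision = pending.get(correlationId);
        return decision != null && decision.complete(approved);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

A caller would typically chain the follow-up work, for example `broker.request("req-1", 5, TimeUnit.MINUTES).thenApply(ok -> ok ? runAction() : "Request rejected")`, where `runAction` is a placeholder for the approved action. A late decision that arrives after the timeout is ignored, because the entry has already been removed.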

Strategic Visuals

Here's a simplified architecture diagram for the Kafka-based asynchronous HITL:

[AI Processor]
     |
     | (Send suggestion)
     v
[Kafka Producer] --> [ai-suggestions Topic] --> [Human Reviewer]
     ^                                           |
     | (Send decision)                           |
     +-------------------------------------------+
     |
     v
[Kafka Consumer] --> [CompletableFuture Completion] --> [AI Processor Resumes]

Error Handling & Edge Cases

Timeout Policies: If no human decision arrives within the timeout (e.g., 5 minutes), the system defaults to rejection (safe mode) to prevent unauthorized actions. In progressive setups, it could default to approval with logging.
Conflict Resolution: For duplicate decisions on the same correlation ID, accept the first valid decision and log conflicts for auditing.
Thread Management: Use Java's virtual threads (Project Loom) or an executor service to handle asynchronous waiting, ensuring the main thread pool isn't exhausted during long waits.
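The "first decision wins" rule for conflict resolution can be sketched with an atomic `putIfAbsent`. `FirstDecisionWins` is a hypothetical helper, and the in-memory conflict list stands in for whatever audit log a real deployment would use.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class FirstDecisionWins {
    private final Map<String, Boolean> decisions = new ConcurrentHashMap<>();
    private final List<String> conflicts = new CopyOnWriteArrayList<>();

    // Records the decision if it is the first for this ID and returns the decision in effect.
    boolean record(String correlationId, boolean approved) {
        Boolean first = decisions.putIfAbsent(correlationId, approved);
        if (first == null) {
            return approved;                    // this was the first decision
        }
        if (first != approved) {
            // A later, contradictory decision is ignored but kept for auditing.
            conflicts.add(correlationId + ": kept " + first + ", ignored " + approved);
        }
        return first;
    }

    // Snapshot of the conflicts seen so far.
    List<String> conflicts() {
        return List.copyOf(conflicts);
    }
}
```

Because `putIfAbsent` is atomic on a `ConcurrentHashMap`, two reviewers answering at the same moment cannot both become the "first" decision.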

Explanation

AI Producer: The allow methods publish each pending action (the prompt and its parameters) to the ai-suggestions topic, for example a meal suggestion such as "Salad with nuts".
Human Reviewer: Consumes from the topic, prompts for review via console, and sends the final decision to the human-decisions topic.
This setup allows for scalability: multiple reviewers can consume from the queue, and decisions can be processed asynchronously.
Timeouts and Fallbacks: In production, implement timeouts (e.g., 5 minutes) and an explicit fallback policy for cases where human reviewers are unavailable or delayed. Rejecting by default is the safer choice; auto-approval on timeout should be reserved for low-risk actions and logged.

To run this, ensure Kafka is set up and topics are created. This "fancy" implementation demonstrates HITL in a distributed, event-driven architecture, suitable for real-world applications like the use cases mentioned.

Conclusion

The FoodChoiceExample illustrates a simple yet powerful HITL implementation. By integrating human feedback into AI workflows, systems become more reliable and trustworthy. For more advanced setups, consider asynchronous mechanisms like Kafka for scalable HITL in distributed environments.

DEV Community