Building Safe AI: Human-in-the-Loop Workflows and Guardrails in TypeScript
As AI adoption accelerates, the focus has rightly shifted from merely building intelligent systems to building safe and reliable ones. Large Language Models (LLMs), while powerful, can sometimes hallucinate, generate toxic or biased outputs, or incur unexpected costs. For AI applications moving into production, especially in high-stakes environments, robust guardrails and human oversight are not just good practice—they're essential.
In this article, we'll explore how NeuroLink, the universal AI SDK for TypeScript, empowers developers to implement critical safety mechanisms like Human-in-the-Loop (HITL) workflows, comprehensive guardrails, and structured output validation.
Why AI Guardrails Matter in Production
Deploying AI without adequate safety measures exposes you to several failure modes:
- Hallucinations & Inaccuracy: LLMs can confidently generate factually incorrect information, which can be detrimental in applications requiring precision (e.g., medical, financial).
- Toxic & Biased Output: AI models can inadvertently produce harmful, offensive, or biased content, damaging brand reputation and violating ethical guidelines.
- Security Risks: Uncontrolled AI actions can lead to data breaches, unauthorized access, or unintended modifications to systems.
- Cost Overruns: Without proper controls, AI can make expensive API calls or trigger resource-intensive operations, leading to ballooning infrastructure costs.
- Compliance & Regulatory Headaches: Industries with strict regulations (finance, healthcare) require auditable and controllable AI systems to ensure compliance.
These challenges highlight the need for a multi-layered approach to AI safety, ensuring human oversight and automated checks are integrated throughout the AI workflow.
Human-in-the-Loop (HITL) Workflows: Ensuring Critical Oversight
Human-in-the-Loop (HITL) workflows are paramount for high-stakes AI operations. They introduce a deliberate pause in the AI's execution flow, requiring explicit human approval before sensitive or irreversible actions are taken. Think of it as an "Are you sure?" dialog for your AI agent.
NeuroLink provides a powerful, event-driven HITL system. You can mark any tool as requiring confirmation, and NeuroLink will handle the interruption and event emission to your application.
Implementing HITL with NeuroLink
Consider a scenario where your AI assistant needs to delete a file. This is a risky operation that should always involve human consent.
import { NeuroLink } from "@juspay/neurolink";

const neurolink = new NeuroLink();

// Define a risky tool that requires human approval
const deleteFileTool = {
  name: "deleteFile",
  description: "Deletes a file from the filesystem",
  requiresConfirmation: true, // Crucial for HITL
  parameters: {
    type: "object",
    properties: {
      filePath: { type: "string", description: "Path to the file to delete" },
    },
    required: ["filePath"],
  },
  execute: async (args: { filePath: string }) => {
    console.log(`Deleting file: ${args.filePath}`);
    // In a real application, you'd perform the actual file deletion here.
    // For now, we'll just simulate it.
    return `File '${args.filePath}' marked for deletion.`;
  },
};

neurolink.addTool(deleteFileTool);

// Your application listens for confirmation requests
neurolink.on("hitl:confirmation-request", async (event) => {
  const { confirmationId, toolName, arguments: args } = event.payload;
  console.warn(`AI wants to execute '${toolName}' with args:`, args);

  // In a real UI, you'd present a dialog to the user
  const approved = await new Promise<boolean>((resolve) => {
    // Simulate user approval after a delay
    setTimeout(() => {
      console.log(`Human approved '${toolName}'.`);
      resolve(true); // User approved
    }, 3000);
  });

  // Send the human's decision back to NeuroLink
  neurolink.emit("hitl:confirmation-response", {
    type: "hitl:confirmation-response",
    payload: {
      confirmationId,
      approved,
      metadata: {
        timestamp: new Date().toISOString(),
        responseTime: Date.now(),
      },
    },
  });
});

async function runAITask() {
  console.log("AI attempting to delete a sensitive file...");
  try {
    const result = await neurolink.generate({
      input: { text: "Please delete the file located at /data/sensitive_report.pdf" },
      model: "claude-3-opus-20240229",
      tools: { deleteFile: deleteFileTool },
    });
    console.log("AI response:", result.content);
  } catch (error: any) {
    if (error.message.includes("USER_CONFIRMATION_REQUIRED")) {
      console.log("AI paused for human confirmation (as expected).");
    } else {
      console.error("An unexpected error occurred:", error);
    }
  }
}

runAITask();
In this example, requiresConfirmation: true on the deleteFile tool tells NeuroLink to pause execution and emit a hitl:confirmation-request event. Your application then handles this event, presents it to a human, and sends back a hitl:confirmation-response indicating approval or denial. This ensures critical actions are always vetted.
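In a real application the decision comes from a UI callback rather than a timer, and it can just as easily be a denial. One way to bridge the event handler and the UI is a small registry of pending confirmations that the UI resolves later. This is an illustrative helper, not part of NeuroLink's API; the class and method names are our own:

```typescript
// Illustrative helper: tracks pending HITL confirmations so a UI callback
// (button click, Slack reply, etc.) can resolve them with approve or deny.
class ConfirmationRegistry {
  private pending = new Map<string, (approved: boolean) => void>();

  // Call this from the hitl:confirmation-request handler and await the result.
  waitForDecision(confirmationId: string): Promise<boolean> {
    return new Promise<boolean>((resolve) => {
      this.pending.set(confirmationId, resolve);
    });
  }

  // Call this from your UI when the human clicks Approve or Deny.
  // Returns false for unknown or already-handled requests.
  resolve(confirmationId: string, approved: boolean): boolean {
    const settle = this.pending.get(confirmationId);
    if (!settle) return false;
    this.pending.delete(confirmationId);
    settle(approved);
    return true;
  }
}

// Usage sketch: the handler awaits the decision; the UI resolves it.
async function demo() {
  const registry = new ConfirmationRegistry();
  const decision = registry.waitForDecision("conf-123");
  registry.resolve("conf-123", false); // human clicked "Deny"
  console.log(await decision); // false
}
demo();
```

Wiring this in, the event handler's simulated `setTimeout` becomes `const approved = await registry.waitForDecision(confirmationId);`, and the denial path flows back to NeuroLink exactly like an approval, just with `approved: false`.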
Guardrails Middleware: Automated Content Safety
Beyond explicit human approvals, automated content guardrails are crucial for preventing undesirable AI outputs. NeuroLink's middleware system provides a flexible way to filter, redact, or block content based on predefined rules or even other AI models.
Types of Guardrails
- Keyword-based Filtering: Fast, regex-based detection and redaction of profanity, PII patterns, or custom blacklisted terms.
- Model-based Filtering: Leverages a separate, lightweight AI model to evaluate the safety or compliance of generated content, offering more nuanced detection than keywords alone.
Implementing Guardrails with NeuroLink
NeuroLink's guardrails are easily configurable via its middleware system. You can enable a security preset for common filters or define custom rules.
import { NeuroLink } from "@juspay/neurolink";

const neurolinkWithGuardrails = new NeuroLink({
  middleware: {
    preset: "security", // Enables guardrails with default config
    middlewareConfig: {
      guardrails: {
        enabled: true,
        config: {
          badWords: {
            enabled: true,
            list: ["confidential", "secret", "private-data", "f***"], // Custom sensitive terms
          },
          modelFilter: {
            enabled: true,
            filterModel: "gpt-4o-mini", // Use a fast, cheap model for safety evaluation
          },
        },
      },
    },
  },
});

async function testGuardrails() {
  console.log("\nTesting guardrails...");

  // Example 1: Profane content
  const response1 = await neurolinkWithGuardrails.generate({
    input: { text: "Write about a really bad word like f*** you" },
    model: "claude-3-opus-20240229",
  });
  console.log("Response with profanity:", response1.content); // Expected: filtered

  // Example 2: Sensitive information
  const response2 = await neurolinkWithGuardrails.generate({
    input: { text: "Tell me something confidential about our company's private-data strategy." },
    model: "claude-3-opus-20240229",
  });
  console.log("Response with sensitive terms:", response2.content); // Expected: filtered or redacted

  // Example 3: Safe content
  const response3 = await neurolinkWithGuardrails.generate({
    input: { text: "What is the capital of France?" },
    model: "claude-3-opus-20240229",
  });
  console.log("Response with safe content:", response3.content); // Expected: normal output
}

testGuardrails();
With preset: "security", NeuroLink automatically applies common filtering. You can extend this with badWords.list for custom terms and modelFilter for AI-powered safety checks, ensuring that even nuanced unsafe content is caught before reaching end-users.
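For intuition, keyword-based filtering boils down to something like the following standalone sketch. This illustrates the technique itself, not NeuroLink's actual implementation:

```typescript
// Minimal keyword-based redaction: replaces blocked terms with asterisks.
// Escaping regex metacharacters ensures terms like "f***" match literally.
function redact(text: string, blocked: string[]): string {
  return blocked.reduce((acc, term) => {
    const escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const pattern = new RegExp(escaped, "gi"); // case-insensitive, all occurrences
    return acc.replace(pattern, "*".repeat(term.length));
  }, text);
}

console.log(redact("The Confidential plan is secret.", ["confidential", "secret"]));
// → "The ************ plan is ******."
```

Because it is pure string matching, this layer is fast enough to run on every request; the `modelFilter` layer then catches the nuanced cases keywords miss.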
Budget Limits and Cost Guardrails
While NeuroLink doesn't have a direct "budget limit" parameter at the SDK level, its flexible architecture allows you to implement cost guardrails through:
- Custom Middleware: You can create middleware that intercepts generate() or stream() calls, checks the estimated token usage against a budget, and either blocks the call or alerts an administrator if it exceeds a threshold.
- Tool Enforcement: If certain tools are expensive, you can either mark them with requiresConfirmation: true to involve human approval or wrap them in custom logic that checks a budget before execution.
- Intelligent Routing: NeuroLink's MCP enhancements include tool routing, which can be configured to prioritize cheaper models or tools for certain tasks, reducing overall costs.
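One way to realize the custom-middleware approach is a small budget tracker that your wrapper consults before each call. The sketch below is illustrative: the `BudgetGuard` name, the 4-chars-per-token heuristic, and the flat per-1K-token price are all assumptions you would replace with the provider's tokenizer and real pricing:

```typescript
// Illustrative cost guardrail: tracks cumulative estimated spend and
// rejects calls that would push spending past a configured budget.
class BudgetGuard {
  private spentUSD = 0;

  constructor(private limitUSD: number) {}

  // Rough estimate from prompt length; real middleware would use the
  // provider's tokenizer and per-model pricing tables.
  estimateCostUSD(promptChars: number): number {
    const estTokens = Math.ceil(promptChars / 4); // ~4 chars per token heuristic
    return (estTokens / 1000) * 0.01; // assumed $0.01 per 1K tokens
  }

  // Records the spend and returns true only if the call fits the budget.
  authorize(promptChars: number): boolean {
    const cost = this.estimateCostUSD(promptChars);
    if (this.spentUSD + cost > this.limitUSD) return false;
    this.spentUSD += cost;
    return true;
  }

  remainingUSD(): number {
    return this.limitUSD - this.spentUSD;
  }
}

// Usage sketch: gate your generate() call behind an authorization check.
const guard = new BudgetGuard(0.05); // 5-cent budget
const prompt = "Summarize this quarterly report for the leadership team.";
if (guard.authorize(prompt.length)) {
  console.log("Call allowed; remaining budget:", guard.remainingUSD());
} else {
  console.log("Call blocked: budget exceeded");
}
```

The same `authorize` check can live inside a middleware function or inside an expensive tool's `execute` body, depending on where you want enforcement to happen.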
Structured Output Validation with Zod Schemas
Ensuring AI outputs adhere to a specific structure is vital for programmatic consumption and to prevent malformed responses that could break downstream systems. NeuroLink integrates seamlessly with Zod for robust, type-safe schema validation.
import { z } from "zod";
import { NeuroLink } from "@juspay/neurolink";

const neurolink = new NeuroLink();

// Define a Zod schema for the expected AI output
const ProductSchema = z.object({
  productName: z.string().describe("The name of the product"),
  category: z.string().describe("The product category"),
  priceUSD: z.number().positive().describe("The price in USD"),
  features: z.array(z.string()).describe("A list of key features"),
  description: z.string().optional().describe("A short description of the product"),
});

async function generateStructuredProduct() {
  console.log("\nGenerating structured product data...");
  try {
    const result = await neurolink.generate({
      input: { text: "Create a detailed product listing for a new smart coffee maker." },
      schema: ProductSchema, // Apply the Zod schema
      output: { format: "json" }, // Specify JSON output format
      model: "claude-3-opus-20240229",
    });

    // NeuroLink guarantees the output matches the schema,
    // or throws an error if validation fails.
    const product = JSON.parse(result.content);
    console.log("Structured Product Output:", product);
    console.log("Product Name (validated):", product.productName);
    console.log("Features (validated):", product.features);
  } catch (error) {
    console.error("Error generating structured output:", error);
  }
}

generateStructuredProduct();

// Important Note on Gemini Providers:
// Google Gemini models (Vertex AI, Google AI Studio) currently cannot combine
// function calling (tools) with structured output (JSON schema validation)
// in the same request due to an API limitation. If using Gemini with schemas,
// you must set `disableTools: true` in your NeuroLink.generate() call.
// NeuroLink supports workarounds like a two-step approach or using other providers.
By providing a schema and setting output: { format: "json" }, NeuroLink ensures the AI's response strictly adheres to your defined structure. This is critical for data integrity and predictable behavior in automated pipelines.
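It is also worth validating defensively on the consuming side, since downstream code should never trust a parse blindly. In practice you would reuse `ProductSchema.safeParse` from Zod for this; the hand-rolled type guard below is an illustrative stand-in that shows what such a check does:

```typescript
// Shape we expect back from the model, mirroring ProductSchema.
interface Product {
  productName: string;
  category: string;
  priceUSD: number;
  features: string[];
  description?: string;
}

// Illustrative runtime type guard; Zod's safeParse gives you this
// (plus rich error reporting) without hand-writing the checks.
function isProduct(value: unknown): value is Product {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.productName === "string" &&
    typeof v.category === "string" &&
    typeof v.priceUSD === "number" &&
    v.priceUSD > 0 &&
    Array.isArray(v.features) &&
    v.features.every((f) => typeof f === "string")
  );
}

// Usage sketch: parse, then narrow before touching any fields.
const raw =
  '{"productName":"BrewBot","category":"Kitchen","priceUSD":129.99,"features":["WiFi","scheduling"]}';
const parsed: unknown = JSON.parse(raw);
if (isProduct(parsed)) {
  console.log(parsed.productName, parsed.priceUSD);
} else {
  console.log("Model output failed validation");
}
```

The guard narrows `unknown` to `Product`, so the compiler enforces that unvalidated data never reaches typed code paths.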
Practical Patterns: Combining for Comprehensive Safety
Building truly safe AI often involves combining these techniques:
- High-Stakes Actions: Use HITL (requiresConfirmation: true) for any tool that modifies sensitive data, initiates financial transactions, or impacts external systems.
- Content Moderation: Employ guardrails middleware (preset: "security", badWords.list, modelFilter) on all user-facing AI outputs to prevent toxic or non-compliant content.
- Data Consistency: Use structured output (schema + output: { format: "json" }) whenever AI-generated data needs to be consumed programmatically, ensuring type safety and preventing runtime errors.
- Cost Control: Implement custom middleware or wrapper functions around expensive API calls or tools to monitor and enforce budget limits.
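Put together, a production setup often layers several of these at once. A hedged configuration sketch, assuming the same NeuroLink options used earlier in this article; the refund tool, its schema, and the prompt are placeholders for your own domain logic:

```typescript
import { z } from "zod";
import { NeuroLink } from "@juspay/neurolink";

// Placeholder schema for the structured record we want back.
const RefundSchema = z.object({
  orderId: z.string(),
  amountUSD: z.number().positive(),
  reason: z.string(),
});

// Layer 1: automated content guardrails on every call.
const safeNeurolink = new NeuroLink({
  middleware: {
    preset: "security",
    middlewareConfig: {
      guardrails: {
        enabled: true,
        config: { badWords: { enabled: true, list: ["confidential"] } },
      },
    },
  },
});

// Layer 2: HITL on the high-stakes tool.
safeNeurolink.addTool({
  name: "issueRefund",
  description: "Issues a refund to a customer",
  requiresConfirmation: true, // a human must approve every refund
  parameters: {
    type: "object",
    properties: {
      orderId: { type: "string", description: "Order to refund" },
      amountUSD: { type: "number", description: "Refund amount in USD" },
    },
    required: ["orderId", "amountUSD"],
  },
  execute: async (args: { orderId: string; amountUSD: number }) => {
    // Placeholder: call your payments API here.
    return `Refund of $${args.amountUSD} queued for order ${args.orderId}.`;
  },
});

// Layer 3: structured output keeps downstream consumers type-safe.
const result = await safeNeurolink.generate({
  input: { text: "Draft a refund record for order A-1042, $25, damaged item." },
  schema: RefundSchema,
  output: { format: "json" },
});
```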
By weaving these mechanisms throughout your NeuroLink-powered AI applications, you can create systems that are not only intelligent and efficient but also robustly safe, compliant, and controllable.
NeuroLink — The Universal AI SDK for TypeScript
- GitHub: github.com/juspay/neurolink
- Install: npm install @juspay/neurolink
- Docs: docs.neurolink.ink
- Blog: blog.neurolink.ink (150+ technical articles)