Ricardo Carneiro

Why is Your Chatbot Saving "Good Morning" as the Customer's Name? 🤖🤦‍♂️

The classic struggle of chatbot data extraction, why your complex regex is failing, and how to fix it in 30 seconds using semantic NLU.
tags: webdev, ai, chatbots, javascript


We've all been there. You spend days building a sleek WhatsApp chatbot or a customer service agent. You write what you think is the perfect prompt or input validation.

Then, your bot asks:

"Hi! What is your full name?"

And the user replies:

"Good morning! I'm John Doe."

Your traditional validation or naive regex captures the input, and boom: your CRM database now has a new client officially named "Good morning!" or "Good morning! I'm John Doe."

Even worse, you ask for a Brazilian postal code (CEP) and the user types: "It is 01310-100". A strict regex rejects the reply because it contains more than the code itself, and a lazy regex fails to pull the code out of the sentence.
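
To make the failure concrete, here is a minimal sketch of the two naive approaches described above. The patterns and variable names are illustrative assumptions, not taken from any particular bot.

```javascript
// Hypothetical "strict" CEP validator: only the code itself, optional hyphen.
const strictCep = /^\d{5}-?\d{3}$/;

// Hypothetical "lazy" name capture: takes whatever the user typed.
const lazyName = /^(.+)$/;

const cepReply = 'It is 01310-100';
const nameReply = "Good morning! I'm John Doe.";

// The strict pattern rejects a perfectly valid CEP wrapped in a sentence.
const cepAccepted = strictCep.test(cepReply);

// The lazy pattern "succeeds", but captures the greeting along with the name.
const capturedName = nameReply.match(lazyName)[1];

console.log(cepAccepted);  // false
console.log(capturedName); // Good morning! I'm John Doe.
```

Neither pattern is wrong on its own terms; they simply have no notion of which part of the sentence answers the question.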

Traditional chatbot validation is broken. Relying on complex regex patterns is a maintenance nightmare, and raw LLM prompts are slow, expensive, and prone to hallucinations.


Enter NaLU AI: Semantic NLU Validation in 30 Seconds

I got tired of these universal chatbot struggles, so I built NaLU AI. It's a lightweight API and MCP Server designed specifically to clean, structure, and validate conversational data in real-time.

NaLU combines a fast deterministic layer with semantic LLM validation in multiple languages (English, Portuguese, Spanish). Instead of processing raw strings, it understands context.

Let's see it in action.


The Code 🛠️

Here is how you can easily validate and extract a clean full name from a conversation using a simple JavaScript fetch (or standard cURL):

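// Send the bot's question and the user's reply to the name extractor.
const response = await fetch('https://api.naluai.dev/v1/extract/name', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_TOKEN', // replace with your own API token
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    agent_input: "Good morning! What's your name?", // what the bot asked
    user_input: 'Good morning!',                     // what the user replied
    language: 'pt-BR'
  })
});

// Parse and print the structured validation result.
const result = await response.json();
console.log(result);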


The JSON Response:



{
  "obtained": false,
  "extracted_value": "",
  "confidence": "high",
  "certain": false,
  "reasoning": false,
  "suggestion_to_agent": "Good morning again! Could you please tell me your name?",
  "validator_used": "validate_name",
  "engine": "llm"
}


Notice that:

It automatically filtered out the greeting: "Good morning" isn't a valid name, so obtained is false and extracted_value is empty.
It suggests a try-again message for the agent to send (suggestion_to_agent).
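
As a rough sketch of how a bot might act on this response, the function below either saves the name or asks again. The field names (obtained, extracted_value, suggestion_to_agent) come from the response shown above; the handleNameResult function and the { action, ... } shape it returns are hypothetical conventions, not part of the API.

```javascript
// Decide what the bot should do with a name-extraction result.
// handleNameResult and its return shape are assumptions for illustration.
function handleNameResult(result) {
  if (result.obtained && result.extracted_value) {
    return { action: 'save', name: result.extracted_value };
  }
  // Nothing usable was extracted: re-ask using the suggested message,
  // falling back to a generic prompt if none was provided.
  return {
    action: 'ask_again',
    message: result.suggestion_to_agent || 'Could you please tell me your name?'
  };
}

// The relevant fields from the response shown above.
const sample = {
  obtained: false,
  extracted_value: '',
  suggestion_to_agent: 'Good morning again! Could you please tell me your name?'
};

const next = handleNameResult(sample);
console.log(next.action);  // ask_again
console.log(next.message); // Good morning again! Could you please tell me your name?
```

The point of branching on obtained rather than on a non-empty string is that the CRM is only written to when the validator says it found a real name.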

13 Built-in Semantic Validators

NaLU AI comes with 13 ready-to-use validators, including:

validate_name (extracts clean proper names, ignores titles and greetings)
validate_cpf / validate_cnpj (validates Brazilian documents using mod 11 check)
validate_cep (extracts postal codes and returns enriched address data)
validate_handoff (detects if the user wants to speak to a human, measuring urgency from 1 to 3)
validate_reply (analyzes conversational context like counter-proposals or indirect answers)
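
For readers unfamiliar with the mod 11 check mentioned for validate_cpf, here is a minimal sketch of the standard CPF check-digit algorithm. It illustrates the general technique only and is not NaLU AI's actual implementation; the function name isValidCpf is hypothetical.

```javascript
// Standard CPF check-digit validation (mod 11). Illustrative sketch only.
function isValidCpf(input) {
  const digits = String(input).replace(/\D/g, ''); // keep digits only
  if (digits.length !== 11) return false;
  if (/^(\d)\1{10}$/.test(digits)) return false; // reject 000..., 111..., etc.

  // Compute the check digit that should follow the first `length` digits.
  // Weights run from (length + 1) down to 2.
  const checkDigit = (length) => {
    let sum = 0;
    for (let i = 0; i < length; i++) {
      sum += Number(digits[i]) * (length + 1 - i);
    }
    const remainder = (sum * 10) % 11;
    return remainder === 10 ? 0 : remainder;
  };

  return checkDigit(9) === Number(digits[9]) &&
         checkDigit(10) === Number(digits[10]);
}

console.log(isValidCpf('529.982.247-25')); // true
console.log(isValidCpf('529.982.247-26')); // false
console.log(isValidCpf('111.111.111-11')); // false
```

A check like this is purely deterministic, which is why it fits the fast layer described earlier rather than the LLM layer.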

Built for n8n, Make, Cursor & Claude Code

Since it is a standard REST API, it integrates out of the box with no-code tools like n8n and Make.

Even cooler, it exposes itself as an MCP Server, allowing you to add it directly to Cursor or Claude Code to perform semantic tasks locally!

Give it a try! 🚀

NaLU AI is free to start. The free tier gives you 3,000 credits per month (no credit card required), and paid plans start at a fraction of a cent per validation (R$ 0.0058, about 0.001 USD).

Stop losing clients to bad chatbot regex. Clean your database and make your agents truly smart in 30 seconds.

👉 Test it out in the playground: naluai.dev

Let me know in the comments how you are currently handling user data extraction in your chatbot webhooks!
