SupplyGraph.AI


Building an AI-Powered HTS Classifier: A Technical Breakdown of SupplyGraph AI’s Customs Classification Agent

HTS (Harmonized Tariff Schedule) classification looks simple on paper, but anyone working in trade automation knows it isn’t. It’s a problem full of ambiguity, nested rules, overlapping categories, and legally precise exceptions. And despite being a high-impact step in global trade, it is still often handled through manual lookup and inconsistent human interpretation.

In this post, I’ll walk through how we engineered the Customs Classification Agent at SupplyGraph AI using a hybrid retrieval + constrained-reasoning approach, exposed through our A2A (Agent-to-Agent) protocol. This is a technical deep dive intended for developers, not a marketing overview.

You'll learn:

  • what makes HTS classification uniquely difficult
  • why embeddings and raw LLM prompts fail
  • how our retrieval layer, scoring engine, and reasoning layer work together
  • how the A2A protocol structures task execution
  • how to run the agent using cURL or the Python SDK
  • where to find the SDK and examples on GitHub

Let’s get into it.


Why HTS Classification Is a Hard Engineering Problem

HTS classification has several properties that make it a non-trivial task for search, rule-based, and ML systems:

  • legal text doesn’t follow natural-language conventions
  • category boundaries are often implied, not explicit
  • products may match multiple plausible codes
  • correctness depends on combinations of attributes (material + form + use)
  • legal notes introduce exceptions, inclusions, and cross-references
  • tariff schedules update regularly and must remain version-locked for audits

From an engineering perspective, this resembles hierarchical, legal-style reasoning, rather than simple tagging.
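
To make the hierarchy concrete, here is a minimal sketch of how a 10-digit HTS code decomposes into nested levels: chapter (2 digits), heading (4), subheading (6, shared internationally under the HS), tariff rate line (8), and statistical suffix (10). The `HtsCode`, `parse_hts`, and `shared_levels` names are illustrative helpers written for this post's explanation, not part of the agent or the SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HtsCode:
    chapter: str      # first 2 digits
    heading: str      # first 4 digits
    subheading: str   # first 6 digits (HS level)
    tariff_line: str  # first 8 digits
    statistical: str  # all 10 digits


def parse_hts(code: str) -> HtsCode:
    """Split a dotted or undotted 10-digit code into its hierarchy levels."""
    digits = "".join(ch for ch in code if ch.isdigit())
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}: {code!r}")
    return HtsCode(digits[:2], digits[:4], digits[:6], digits[:8], digits)


def shared_levels(a: str, b: str) -> int:
    """Count how many hierarchy levels two codes have in common, top down."""
    pa, pb = parse_hts(a), parse_hts(b)
    count = 0
    for level in ("chapter", "heading", "subheading", "tariff_line", "statistical"):
        if getattr(pa, level) != getattr(pb, level):
            break
        count += 1
    return count


if __name__ == "__main__":
    print(parse_hts("6109.10.00.40"))
```

Two codes can agree on the first several levels and still differ in the last digits, which is why a flat label space is a poor fit for this problem.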

Why common approaches fall short

Embeddings alone

They struggle with:

  • hierarchical structure
  • exclusion rules
  • multi-attribute dependencies
  • long-range legal references

Raw LLM classification

Common failure cases:

  • hallucinating subheadings
  • reasoning without grounding in legal text
  • no version control
  • no reproducibility
  • no audit trail

HTS classification needs structured grounding, deterministic protocol behavior, and explainability, not just a more capable LLM.
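
As a rough illustration of what grounding means in practice, the sketch below filters model-proposed codes against the set of codes that actually exist in a given tariff dataset. The `ground_candidates` helper and the toy code set are assumptions for illustration only; they do not describe the agent's internal implementation.

```python
def ground_candidates(
    proposed: list[str], known_codes: set[str]
) -> tuple[list[str], list[str]]:
    """Split proposed codes into those present in the dataset and those not.

    Anything in the second list is a likely hallucination and should never
    reach the user as a classification result.
    """
    grounded = [code for code in proposed if code in known_codes]
    rejected = [code for code in proposed if code not in known_codes]
    return grounded, rejected


if __name__ == "__main__":
    # Toy dataset: a real one would be loaded from a versioned tariff schedule.
    known = {"6109.10.00.40", "6109.10.00.12"}
    # The second code is deliberately made up to show the rejection path.
    grounded, rejected = ground_candidates(
        ["6109.10.00.40", "6109.10.99.99"], known
    )
    print("grounded:", grounded)
    print("rejected:", rejected)
```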


⚙ Architecture Overview

Our classification pipeline:

Product Description
       ▼
 Pre-processing
       ▼
 Attribute Extraction
       ▼
 Candidate Retrieval
 (HTS text + notes + enriched nodes)
       ▼
 Candidate Scoring
       ▼
 Constrained Reasoning
 (grounded evaluation, no free-form generation)
       ▼
 Ranked HTS Classification

Engineering clarifications

  • The retrieval layer searches a structured dataset containing HTS text + legal notes + derived attribute nodes, not a million HTS entries.
  • The reasoning layer is LLM-based but constrained, operating strictly over grounded candidate data.
  • The confidence score is a normalized ranking score, not a calibrated statistical probability.
  • Every output is tied to a specific tariff dataset version for audit reproducibility.
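
To show what "normalized ranking score" and "tied to a dataset version" can look like together, here is a toy scoring sketch. The scoring rule (attribute overlap divided by the best overlap), the `Candidate` type, and the `DATASET_VERSION` label are all assumptions made for this example; the production scoring engine is not described at this level of detail in this post.

```python
from dataclasses import dataclass

DATASET_VERSION = "example-rev-1"  # hypothetical version label


@dataclass(frozen=True)
class Candidate:
    hts_code: str
    attributes: frozenset  # attributes implied by the tariff text (toy data)


def rank(product_attrs: set[str], candidates: list[Candidate]) -> list[dict]:
    """Rank candidates by attribute overlap, normalized so the best is 1.0."""
    raw = [(c, len(product_attrs & c.attributes)) for c in candidates]
    top = max((score for _, score in raw), default=0)
    ranked = sorted(raw, key=lambda pair: pair[1], reverse=True)
    return [
        {
            "hts_code": c.hts_code,
            # Relative ranking score, not a probability of correctness.
            "confidence_score": round(score / top, 2) if top else 0.0,
            "dataset_version": DATASET_VERSION,
        }
        for c, score in ranked
    ]


if __name__ == "__main__":
    product = {"knitted", "cotton", "t-shirt", "women"}
    candidates = [
        Candidate("A", frozenset({"woven", "cotton", "shirt"})),
        Candidate("B", frozenset({"knitted", "cotton", "t-shirt", "women"})),
        Candidate("C", frozenset({"knitted", "cotton", "t-shirt", "men"})),
    ]
    for row in rank(product, candidates):
        print(row)
```

Because the score is relative to the other candidates in the same run, it is useful for ordering and thresholding but should not be read as a likelihood.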

🔌 The A2A Protocol: How the Agent Exposes Its Interface

A2A defines:

  • stable event types
  • deterministic protocol flow
  • explicit state transitions (interpreting → executing → completed)
  • optional SSE streaming
  • the WAITING_USER step when multiple interpretations are possible

Agents expose:

/manifest
/run  (mode=run | status | results)

This keeps agent interactions simple and predictable.
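
The explicit state transitions can be pictured as a small state machine. The sketch below is a client-side model only: the stage names follow the list above, but the exact transition table (in particular, that WAITING_USER returns to interpreting once the user clarifies) is an assumption, not a statement of the protocol specification.

```python
from enum import Enum


class Stage(Enum):
    INTERPRETING = "interpreting"
    WAITING_USER = "waiting_user"
    EXECUTING = "executing"
    COMPLETED = "completed"


# Assumed transition table for illustration.
ALLOWED = {
    Stage.INTERPRETING: {Stage.EXECUTING, Stage.WAITING_USER},
    Stage.WAITING_USER: {Stage.INTERPRETING},
    Stage.EXECUTING: {Stage.COMPLETED},
    Stage.COMPLETED: set(),
}


def advance(current: Stage, nxt: Stage) -> Stage:
    """Return the next stage, or raise if the transition is not allowed."""
    if nxt not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {nxt.value}")
    return nxt


if __name__ == "__main__":
    stage = Stage.INTERPRETING
    for step in (Stage.EXECUTING, Stage.COMPLETED):
        stage = advance(stage, step)
        print(stage.value)
```

Modeling the flow this way on the client makes unexpected events fail loudly instead of silently corrupting task state.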


📄 Manifest Example

GET /api/v1/agents/tariff_classification/manifest
{
  "agent_id": "tariff_classification",
  "name": "Customs Classification Agent",
  "version": "1.0.0",
  "description": "Maps product descriptions to HS/HTS codes.",
  "pricing": { "per_run": 2, "unit": "credits" },
  "input_schema": {...},
  "output_schema": {...}
}
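
A client will typically read the manifest once and keep the fields it needs. Here is a minimal sketch of that, assuming the manifest shape shown above; `Manifest`, `parse_manifest`, and `estimated_credits` are hypothetical helper names, and the schema fields are left out because they are elided in the example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    agent_id: str
    name: str
    version: str
    credits_per_run: int


def parse_manifest(raw: str) -> Manifest:
    """Extract the identifying and pricing fields from a manifest document."""
    doc = json.loads(raw)
    pricing = doc["pricing"]
    if pricing.get("unit") != "credits":
        raise ValueError(f"unexpected pricing unit: {pricing.get('unit')!r}")
    return Manifest(
        agent_id=doc["agent_id"],
        name=doc["name"],
        version=doc["version"],
        credits_per_run=int(pricing["per_run"]),
    )


def estimated_credits(manifest: Manifest, runs: int) -> int:
    """Budget estimate for a batch of runs at the advertised per-run price."""
    return manifest.credits_per_run * runs


if __name__ == "__main__":
    sample = """{
      "agent_id": "tariff_classification",
      "name": "Customs Classification Agent",
      "version": "1.0.0",
      "pricing": {"per_run": 2, "unit": "credits"}
    }"""
    manifest = parse_manifest(sample)
    print(manifest.agent_id, estimated_credits(manifest, 50))
```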

🚀 Running a Classification Task

Non-streaming

curl -X POST https://agent.supplygraph.ai/api/v1/agents/tariff_classification/run \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
        "mode": "run",
        "text": "Knitted cotton T-shirt for women"
      }'
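
For reference, the same request can be built from Python using only the standard library. This is a sketch that mirrors the cURL call; `build_run_request` is a hypothetical helper written for this post, not a function from the SDK, and error handling is omitted.

```python
import json
import urllib.request

BASE_URL = "https://agent.supplygraph.ai/api/v1/agents"


def build_run_request(agent_id: str, text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the cURL example."""
    body = json.dumps({"mode": "run", "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{agent_id}/run",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_run_request(
        "tariff_classification", "Knitted cotton T-shirt for women", "<API_KEY>"
    )
    print(request.get_method(), request.full_url)
    # To actually send it (requires a valid API key and network access):
    # with urllib.request.urlopen(request) as response:
    #     result = json.load(response)
```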

Example Output

{
  "success": true,
  "code": "TASK_COMPLETED",
  "data": {
    "content": {
      "type": "result",
      "data": {
        "classification_results": [
          {
            "hts_code": "6109.10.00.40",
            "confidence_score": 0.90,
            "reasoning": "Identified as knitted cotton T-shirt for women.",
            "description": "T-shirts, singlets, tank tops... of cotton; women's or girls'"
          }
        ],
        "country_of_origin": "Mexico"
      }
    }
  }
}

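Note: country_of_origin appears only when the origin is stated explicitly in the input text. The agent does not infer origin automatically. The field is included in the example output for illustration, since the sample request as written does not state an origin.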
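
Given that response shape, pulling out the top-ranked result is a few lines of dictionary access. The sketch below assumes the structure of the example output above; `top_classification` is a hypothetical helper, and it treats country_of_origin as optional.

```python
from typing import Optional


def top_classification(response: dict) -> Optional[dict]:
    """Return the highest-scoring result, or None if the task did not complete."""
    if not response.get("success") or response.get("code") != "TASK_COMPLETED":
        return None
    payload = response["data"]["content"]["data"]
    results = payload.get("classification_results", [])
    if not results:
        return None
    best = max(results, key=lambda item: item["confidence_score"])
    return {
        "hts_code": best["hts_code"],
        "confidence_score": best["confidence_score"],
        # None when the input text did not state an origin.
        "country_of_origin": payload.get("country_of_origin"),
    }


if __name__ == "__main__":
    example = {
        "success": True,
        "code": "TASK_COMPLETED",
        "data": {"content": {"type": "result", "data": {
            "classification_results": [
                {"hts_code": "6109.10.00.40", "confidence_score": 0.90},
            ],
        }}},
    }
    print(top_classification(example))
```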


📡 Streaming Reasoning via SSE

Enable streaming with:

stream=true

Example events:

event: stream
data: { "stage": "interpreting", "reasoning": ["Extracting product attributes..."] }
event: stream
data: { "stage": "executing", "code": "TASK_ACCEPTED" }
event: end
data: [DONE]

Helpful for interactive UIs and debugging.
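
If you consume the stream without the SDK, the events need to be parsed from raw lines. Here is a minimal sketch of a parser for the event shapes shown above. It assumes standard SSE framing, where a blank line terminates each event, and treats a `[DONE]` payload as the end of the stream; `parse_sse` is a hypothetical helper, not an SDK function.

```python
import json
from itertools import chain
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) pairs until a [DONE] payload is seen."""
    event, data = "message", []
    # A trailing blank line flushes the last event if the stream lacks one.
    for raw in chain(lines, [""]):
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            payload = "\n".join(data)
            if payload == "[DONE]":
                return
            yield event, json.loads(payload)
            event, data = "message", []


if __name__ == "__main__":
    sample = [
        "event: stream",
        'data: { "stage": "interpreting", "reasoning": ["Extracting product attributes..."] }',
        "",
        "event: stream",
        'data: { "stage": "executing", "code": "TASK_ACCEPTED" }',
        "",
        "event: end",
        "data: [DONE]",
        "",
    ]
    for name, payload in parse_sse(sample):
        print(name, payload["stage"])
```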


Python SDK Example

Repo: https://github.com/SupplyGraphAI/supplygraphai_a2a_sdk

from supplygraph.compat import OpenAICompatibleClient

client = OpenAICompatibleClient(api_key="sg-xxx")

response = client.agents.run(
    "tariff_classification",
    text="Machine-cut aluminum sheets, thickness 2.5mm"
)

print(response)

Streaming:

for event in client.agents.stream("tariff_classification", text="..."):
    print(event)

📁 Output Structure

| Field | Meaning |
| --- | --- |
| hts_code | Suggested classification |
| confidence_score | Ranking score (normalized 0–1) |
| reasoning | Constrained, grounded explanation |
| description | Official HTS text |
| country_of_origin | Included only when present in input |

Outputs are designed for direct integration with:

  • duty estimation
  • ERP flows
  • compliance review
  • audit logging
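
As one example of such an integration, the sketch below routes a result to a review queue based on its score and emits a JSON audit line. The threshold value, the queue names, and the `route` and `audit_line` helpers are all hypothetical; because the score is a ranking score rather than a probability, any cut-off would need to be tuned against your own reviewed data.

```python
import json

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off, tune on your own data


def route(result: dict, threshold: float = REVIEW_THRESHOLD) -> str:
    """Pick a review queue; every result still goes to a human in some form."""
    if result["confidence_score"] >= threshold:
        return "spot_check"
    return "manual_review"


def audit_line(input_text: str, result: dict, threshold: float = REVIEW_THRESHOLD) -> str:
    """Serialize one classification decision as a JSON line for an audit log."""
    record = {
        "input_text": input_text,
        "hts_code": result["hts_code"],
        "confidence_score": result["confidence_score"],
        "decision": route(result, threshold),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    result = {"hts_code": "6109.10.00.40", "confidence_score": 0.90}
    print(audit_line("Knitted cotton T-shirt for women", result))
```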

🔍 Why This Hybrid Approach Works

| Problem | Our solution |
| --- | --- |
| Embeddings miss legal boundaries | Structured retrieval |
| LLMs hallucinate | Constrained reasoning |
| No repeatability | Protocol-level determinism |
| No audit trail | Version-locked datasets |
| Ambiguous inputs | WAITING_USER clarification |

This yields a classifier that’s transparent, explainable, and suitable for production environments.


🔗 GitHub & Resources

Python SDK

https://github.com/SupplyGraphAI/supplygraphai_a2a_sdk

All repositories

https://github.com/SupplyGraphAI

Contributions, issues, and discussions are welcome.


🏁 Closing Thoughts

HTS classification is deceptively difficult because it blends legal-style logic with multi-attribute categorization. Our approach combines retrieval, scoring, and constrained reasoning, then exposes it through a predictable A2A protocol that developers can embed directly into their systems.

As always, final HTS determination depends on complete product details and applicable legal notes.

This agent supports — but does not replace — professional judgment in tariff classification.

If you'd like a deeper dive into the retrieval engine or A2A internals, I’m happy to write a Part 2.
