SupplyGraph.AI

Posted on Dec 4, 2025

Building an AI-Powered HTS Classifier: A Technical Breakdown of SupplyGraph AI’s Customs Classification Agent

#ai #code #supplychain #htscode

HTS classification looks simple on paper — but anyone working in trade automation knows it isn’t. It’s a problem full of ambiguity, nested rules, overlapping categories, and legally precise exceptions. And despite being a high-impact step in global trade, most teams still rely on manual lookup and inconsistent human interpretation.

In this post, I’ll walk through how we engineered the Customs Classification Agent at SupplyGraph AI using a hybrid retrieval + constrained-reasoning approach, exposed through our A2A (Agent-to-Agent) protocol. This is a technical deep dive intended for developers, not a marketing overview.

You'll learn:

what makes HTS classification uniquely difficult
why embeddings and raw LLM prompts fail
how our retrieval layer, scoring engine, and reasoning layer work together
how the A2A protocol structures task execution
how to run the agent using cURL or the Python SDK
where to find the SDK and examples on GitHub

Let’s get into it.

Why HTS Classification Is a Hard Engineering Problem

HTS classification has several properties that make it a non-trivial task for search, rule-based, and ML systems:

legal text doesn’t follow natural-language conventions
category boundaries are often implied, not explicit
products may match multiple plausible codes
correctness depends on combinations of attributes (material + form + use)
legal notes introduce exceptions, inclusions, and cross-references
tariff schedules update regularly and must remain version-locked for audits

From an engineering perspective, this resembles hierarchical, legal-style reasoning rather than simple tagging.

Why common approaches fall short

Embeddings alone

They struggle with:

hierarchical structure
exclusion rules
multi-attribute dependencies
long-range legal references

Raw LLM classification

Common failure cases:

hallucinating subheadings
reasoning without grounding in legal text
no version control
no reproducibility
no audit trail

HTS classification benefits from structured grounding, deterministic protocol behavior, and explainability, not just a more capable LLM.

⚙ Architecture Overview

Our classification pipeline:

Product Description
       ▼
 Pre-processing
       ▼
 Attribute Extraction
       ▼
 Candidate Retrieval
 (HTS text + notes + enriched nodes)
       ▼
 Candidate Scoring
       ▼
 Constrained Reasoning
 (grounded evaluation, no free-form generation)
       ▼
 Ranked HTS Classification

Engineering clarifications

The retrieval layer searches a structured dataset containing HTS text + legal notes + derived attribute nodes, rather than a flat list of a million HTS entries.
The reasoning layer is LLM-based but constrained, operating strictly over grounded candidate data.
The confidence score is a normalized ranking score, not a calibrated statistical probability.
Every output is tied to a specific tariff dataset version for audit reproducibility.
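
To make the flow concrete, here is a minimal toy sketch of the retrieval, scoring, and normalization steps. It is not our production code: the candidate codes, attribute vocabulary, `DATASET_VERSION` tag, and the sum-to-one normalization are all assumptions chosen for illustration, and the constrained LLM reasoning step is left out entirely.

```python
from dataclasses import dataclass

# Hypothetical version tag; the post does not describe the real versioning scheme.
DATASET_VERSION = "toy-v1"


@dataclass(frozen=True)
class Candidate:
    code: str            # placeholder identifier, not a real HTS code
    required: frozenset  # attributes the candidate expects
    excluded: frozenset  # attributes that rule the candidate out (toy "legal note")


# Toy stand-in for the structured dataset (HTS text + notes + attribute nodes).
CANDIDATES = [
    Candidate("AAAA.01", frozenset({"cotton", "knitted", "t-shirt"}), frozenset()),
    Candidate("AAAA.02", frozenset({"synthetic", "knitted", "t-shirt"}), frozenset()),
    Candidate("BBBB.01", frozenset({"cotton", "woven", "shirt"}), frozenset({"knitted"})),
]
VOCAB = frozenset().union(*(c.required | c.excluded for c in CANDIDATES))


def extract_attributes(text):
    """Naive attribute extraction: keep tokens found in the toy vocabulary."""
    tokens = text.lower().replace(",", " ").split()
    return frozenset(t for t in tokens if t in VOCAB)


def classify(text):
    attrs = extract_attributes(text)
    # Retrieval: exclusions act as hard filters; keep candidates sharing an attribute.
    retrieved = [c for c in CANDIDATES
                 if not (c.excluded & attrs) and (c.required & attrs)]
    # Scoring: fraction of each candidate's required attributes that were matched.
    raw = {c.code: len(c.required & attrs) / len(c.required) for c in retrieved}
    total = sum(raw.values())
    ranked = sorted(raw.items(), key=lambda kv: kv[1], reverse=True)
    # Normalization (assumed): divide by the sum, giving a ranking score, not a probability.
    return {
        "dataset_version": DATASET_VERSION,
        "results": [{"code": code, "score": round(s / total, 3)} for code, s in ranked],
    }


print(classify("Knitted cotton T-shirt for women"))
```

Note how the exclusion on `BBBB.01` removes it outright, no matter how similar its text might look to the input. That hard-filter behavior is what pure embedding similarity cannot express.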

🔌 The A2A Protocol: How the Agent Exposes Its Interface

A2A defines:

stable event types
deterministic protocol flow
explicit state transitions (interpreting → executing → completed)
optional SSE streaming
the WAITING_USER step when multiple interpretations are possible

Agents expose:

/manifest
/run  (mode=run | status | results)

This keeps agent interactions simple and predictable.
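
The explicit state transitions can be pictured as a small state machine. The sketch below is illustrative only: the three main states come from the flow above, while the `TaskState` enum, the `advance` helper, and the exact edges in and out of `WAITING_USER` are assumptions about how clarification might slot in.

```python
from enum import Enum


class TaskState(Enum):
    INTERPRETING = "interpreting"
    WAITING_USER = "WAITING_USER"
    EXECUTING = "executing"
    COMPLETED = "completed"


# Allowed transitions. The WAITING_USER edges are an assumption:
# ambiguity pauses interpretation, and a user reply resumes it.
ALLOWED = {
    TaskState.INTERPRETING: {TaskState.EXECUTING, TaskState.WAITING_USER},
    TaskState.WAITING_USER: {TaskState.INTERPRETING},
    TaskState.EXECUTING: {TaskState.COMPLETED},
    TaskState.COMPLETED: set(),
}


def advance(current, target):
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = TaskState.INTERPRETING
state = advance(state, TaskState.EXECUTING)
state = advance(state, TaskState.COMPLETED)
print(state.value)
```

Rejecting illegal transitions on the client side is one way to catch protocol drift early instead of silently mis-rendering task progress.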

📄 Manifest Example

GET /api/v1/agents/tariff_classification/manifest

{
  "agent_id": "tariff_classification",
  "name": "Customs Classification Agent",
  "version": "1.0.0",
  "description": "Maps product descriptions to HS/HTS codes.",
  "pricing": { "per_run": 2, "unit": "credits" },
  "input_schema": {...},
  "output_schema": {...}
}
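
A client can read the manifest before running anything, for example to budget credits. This is a rough sketch: `manifest_url` and `estimate_credits` are hypothetical helpers, the host is assumed to be the same one used in the cURL example in this post, and the schemas are left empty because they are elided above.

```python
import json

BASE_URL = "https://agent.supplygraph.ai"  # assumed: same host as the /run endpoint


def manifest_url(agent_id):
    """Build the manifest URL for an agent."""
    return f"{BASE_URL}/api/v1/agents/{agent_id}/manifest"


def estimate_credits(manifest, runs):
    """Estimate total cost from the manifest's per-run pricing."""
    pricing = manifest["pricing"]
    if pricing["unit"] != "credits":
        raise ValueError(f"unexpected pricing unit: {pricing['unit']}")
    return pricing["per_run"] * runs


# Manifest copied from the example; the elided schemas are left as empty objects.
manifest = json.loads("""
{
  "agent_id": "tariff_classification",
  "name": "Customs Classification Agent",
  "version": "1.0.0",
  "pricing": { "per_run": 2, "unit": "credits" },
  "input_schema": {},
  "output_schema": {}
}
""")

print(manifest_url(manifest["agent_id"]))
print(estimate_credits(manifest, runs=100))
```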

🚀 Running a Classification Task

Non-streaming

curl -X POST https://agent.supplygraph.ai/api/v1/agents/tariff_classification/run \
  -H "Authorization: Bearer <API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
        "mode": "run",
        "text": "Knitted cotton T-shirt for women"
      }'
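
The same request can be built with nothing but the Python standard library. This is a sketch mirroring the cURL call above; `build_run_request` is a hypothetical helper, not part of the SDK, and the actual send is commented out so the snippet runs without a network connection or a real key.

```python
import json
import urllib.request

BASE_URL = "https://agent.supplygraph.ai/api/v1/agents"


def build_run_request(agent_id, text, api_key):
    """Build (but do not send) a POST request equivalent to the cURL example."""
    body = json.dumps({"mode": "run", "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{agent_id}/run",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_run_request(
    "tariff_classification", "Knitted cotton T-shirt for women", "<API_KEY>"
)
print(req.get_method(), req.full_url)

# To actually send it (requires a valid API key):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     payload = json.load(resp)
```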

Example Output

{
  "success": true,
  "code": "TASK_COMPLETED",
  "data": {
    "content": {
      "type": "result",
      "data": {
        "classification_results": [
          {
            "hts_code": "6109.10.00.40",
            "confidence_score": 0.90,
            "reasoning": "Identified as knitted cotton T-shirt for women.",
            "description": "T-shirts, singlets, tank tops... of cotton; women's or girls'"
          }
        ],
        "country_of_origin": "Mexico"
      }
    }
  }
}

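✔ Note:

country_of_origin appears only when the origin is stated explicitly in the input text. The agent does not infer origin automatically.

The sample request above names no origin, so a response to that exact input would omit the field; the "Mexico" value in the sample output is illustrative only.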
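
Consuming the response mostly means unwrapping the nested envelope defensively. The sketch below assumes the envelope shape shown in the example output; `top_classification` is a hypothetical helper, and treating any code other than `TASK_COMPLETED` as "no result" is a simplification.

```python
def top_classification(payload):
    """Return the highest-scoring result, or None if the task has no result."""
    if not payload.get("success") or payload.get("code") != "TASK_COMPLETED":
        return None
    content = payload.get("data", {}).get("content", {})
    if content.get("type") != "result":
        return None
    data = content.get("data", {})
    results = data.get("classification_results", [])
    if not results:
        return None
    best = max(results, key=lambda r: r["confidence_score"])
    # country_of_origin is optional, so default to None when it is absent.
    return {**best, "country_of_origin": data.get("country_of_origin")}


# Trimmed copy of the example output, without the optional origin field.
sample = {
    "success": True,
    "code": "TASK_COMPLETED",
    "data": {"content": {"type": "result", "data": {
        "classification_results": [
            {"hts_code": "6109.10.00.40", "confidence_score": 0.90},
        ],
    }}},
}

best = top_classification(sample)
print(best["hts_code"], best["country_of_origin"])
```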

📡 Streaming Reasoning via SSE

Enable streaming with:

stream=true

Example events:

event: stream
data: { "stage": "interpreting", "reasoning": ["Extracting product attributes..."] }

event: stream
data: { "stage": "executing", "code": "TASK_ACCEPTED" }

event: end
data: [DONE]

Helpful for interactive UIs and debugging.
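
If you are not using the SDK, the stream can be parsed by hand. The following is a simplified sketch of a server-sent-events parser: it handles only the `event:` and `data:` fields that appear in the example above, and `parse_sse` is a hypothetical helper rather than anything shipped by us. The final `[DONE]` sentinel is not JSON, so it is passed through as a plain string.

```python
import json


def _decode(text):
    """Decode JSON payloads; pass non-JSON sentinels such as [DONE] through."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, _decode("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if data:  # flush a trailing event that has no closing blank line
        yield event, _decode("\n".join(data))


SAMPLE = """event: stream
data: { "stage": "interpreting", "reasoning": ["Extracting product attributes..."] }

event: stream
data: { "stage": "executing", "code": "TASK_ACCEPTED" }

event: end
data: [DONE]
"""

events = list(parse_sse(SAMPLE.splitlines()))
for name, payload in events:
    print(name, payload)
```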

Python SDK Example

Repo: https://github.com/SupplyGraphAI/supplygraphai_a2a_sdk

from supplygraph.compat import OpenAICompatibleClient

client = OpenAICompatibleClient(api_key="sg-xxx")

response = client.agents.run(
    "tariff_classification",
    text="Machine-cut aluminum sheets, thickness 2.5mm"
)

print(response)

Streaming:

for event in client.agents.stream("tariff_classification", text="..."):
    print(event)

📁 Output Structure

Field	Meaning
`hts_code`	Suggested classification
`confidence_score`	Ranking score (normalized 0–1)
`reasoning`	Constrained, grounded explanation
`description`	Official HTS text
`country_of_origin`	Included only when present in input

Outputs are designed for direct integration with:

duty estimation
ERP flows
compliance review
audit logging
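
As one example of downstream integration, a result can be routed to compliance review and written to an audit log. This is a hedged sketch: `REVIEW_THRESHOLD`, `route`, and `audit_line` are hypothetical, and because the confidence score is a ranking score rather than a calibrated probability, any threshold would need tuning against your own reviewed data.

```python
import json

REVIEW_THRESHOLD = 0.80  # hypothetical cut-off, not a recommended value


def route(result, threshold=REVIEW_THRESHOLD):
    """Send low-scoring results to a human reviewer."""
    return "auto" if result["confidence_score"] >= threshold else "review"


def audit_line(input_text, result):
    """Serialize one decision as a stable, sorted-key JSON log line."""
    record = {
        "input": input_text,
        "hts_code": result["hts_code"],
        "confidence_score": result["confidence_score"],
        "route": route(result),
    }
    return json.dumps(record, sort_keys=True)


result = {"hts_code": "6109.10.00.40", "confidence_score": 0.90}
print(audit_line("Knitted cotton T-shirt for women", result))
```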

🔍 Why This Hybrid Approach Works

Problem	Our solution
Embeddings miss legal boundaries	Structured retrieval
LLMs hallucinate	Constrained reasoning
No repeatability	Protocol-level determinism
No audit trail	Version-locked datasets
Ambiguous inputs	`WAITING_USER` clarification

This yields a classifier that’s transparent, explainable, and suitable for production environments.

🔗 GitHub & Resources

Python SDK

https://github.com/SupplyGraphAI/supplygraphai_a2a_sdk

All repositories

https://github.com/SupplyGraphAI

Contributions, issues, and discussions are welcome.

🏁 Closing Thoughts

HTS classification is deceptively difficult because it blends legal-style logic with multi-attribute categorization. Our approach combines retrieval, scoring, and constrained reasoning, then exposes it through a predictable A2A protocol that developers can embed directly into their systems.

As always, final HTS determination depends on complete product details and applicable legal notes.

This agent supports — but does not replace — professional judgment in tariff classification.

If you'd like a deeper dive into the retrieval engine or A2A internals, I’m happy to write a Part 2.
