How I Built an AI Compliance System for Charter Aviation Using RAG and Pinecone
Building RAG for compliance-critical domains is not the same as building RAG for a general-purpose chatbot. When the outputs affect an operator's certification status, wrong answers have real consequences.
This is the full architecture walkthrough of Navlyt — the compliance operating system I built for charter aviation operators.
Navlyt tracks FAA, Transport Canada, and EASA regulatory requirements for small charter aviation operators. It answers compliance questions, monitors obligation status, and generates required documentation.
The challenge: regulatory documents are dense, cross-referenced, and version-controlled in ways that break standard RAG approaches.
This article covers the complete technical implementation — chunking strategy, retrieval architecture, answer generation with citations, and the accuracy validation approach I use for a compliance-critical domain.
The Problem With Standard RAG for Regulatory Content
Regulatory documents have properties that make standard paragraph-level chunking produce poor retrieval results:
- **Cross-references.** A requirement in one section may reference definitions in another; a chunk containing only the requirement produces incomplete context.
- **Applicability conditions.** Whether a regulation applies depends on conditions defined elsewhere; standard chunking separates requirements from their applicability criteria.
- **Version control.** Regulatory documents are amended over time, so retrieval must be version-aware.
- **Term definitions.** Regulatory language relies on precisely defined terms; "air taxi", for example, has a specific legal meaning.
The Chunking Strategy
After significant experimentation, I settled on a four-tier chunking approach:
Tier 1 — Section level chunks
Complete sections defining terms or applicability remain intact (200-800 tokens).
Tier 2 — Paragraph level chunks
Individual requirements are chunked at the paragraph level, with metadata attached to each chunk (a sketch of the shape follows this list):
- Section number
- Regulation name
- Version
- Applicability category
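For concreteness, here is roughly what that per-chunk metadata looks like. The field names mirror the Pinecone filter used later in this article, but the exact schema is my illustrative reconstruction, not the production one:

```typescript
// Illustrative Tier 2 chunk metadata. Field names match the Pinecone
// filter used in retrieval below; the exact schema is an assumption.
interface ChunkMetadata {
  section_number: string             // e.g. "CAR 703.88"
  regulation_name: string            // e.g. "Canadian Aviation Regulations"
  version: string                    // amendment the chunk was extracted from
  applicability_categories: string[] // certificate categories the rule applies to
  is_current: boolean                // false once a later amendment supersedes it
}
```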
Tier 3 — Manual summary chunks
Some requirements span multiple sections, so for the most frequently queried requirements I created manual summary chunks that combine the relevant provisions.
This is expensive to produce, but critical for accuracy.
Tier 4 — Cross reference chunks
For chunks containing cross-references, I create composite chunks that include the referenced content. This removes the most common failure mode: retrieving a rule without its definitions.
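As a minimal sketch of the Tier 4 idea, assuming sections have already been parsed with their cross-references extracted (the `Section` shape and `buildCompositeChunk` helper are illustrative, not the production code):

```typescript
// Hypothetical sketch of Tier 4: append each referenced section's text
// so a rule is never retrieved without its definitions.
interface Section {
  id: string
  text: string
  crossReferences: string[] // ids of sections this one cites
}

function buildCompositeChunk(
  section: Section,
  sectionsById: Map<string, Section>
): string {
  const referenced = section.crossReferences
    .map(id => sectionsById.get(id))
    .filter((s): s is Section => s !== undefined)

  return [
    section.text,
    ...referenced.map(s => `REFERENCED (${s.id}): ${s.text}`),
  ].join('\n\n')
}
```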
Pinecone Index Architecture
I use a single Pinecone index with namespace separation by regulation type.
For example, a Transport Canada operator asking about pilot currency does not need FAR Part 135 results mixed into retrieval.
```typescript
const NAMESPACES = {
  transport_canada: 'tc_cars',
  faa_part_135: 'faa_135',
  faa_part_91: 'faa_91',
  easa_cs23: 'easa_cs23',
  operator_specific: 'ops_spec',
}

async function retrieveCompliance(
  query: string,
  operatorContext: OperatorContext
) {
  // Only query the namespaces that apply to this operator's authority.
  const targetNamespaces = resolveApplicableNamespaces(operatorContext)
  const queryEmbedding = await embedQuery(query)

  // Query each applicable namespace in parallel, filtering to current
  // provisions that match the operator's certificate categories.
  const results = await Promise.all(
    targetNamespaces.map(ns =>
      pinecone
        .index('navlyt-regulations')
        .namespace(ns)
        .query({
          vector: queryEmbedding,
          topK: 5,
          includeMetadata: true,
          filter: {
            is_current: { $eq: true },
            applicability_categories: {
              $in: operatorContext.certificateCategories,
            },
          },
        })
    )
  )

  return mergeAndRerankResults(results, query)
}
```
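The helpers above are not shown in full. Here is a minimal sketch of what `resolveApplicableNamespaces` can look like, assuming an `OperatorContext` that carries the operator's regulatory authority and certificate categories (these field names are my assumptions):

```typescript
// Minimal sketch of namespace resolution. The OperatorContext fields
// are assumptions; the routing logic is the point.
interface OperatorContext {
  authority: 'transport_canada' | 'faa' | 'easa'
  certificateCategories: string[] // e.g. ['air_taxi']
}

function resolveApplicableNamespaces(ctx: OperatorContext): string[] {
  // Operator-specific documents (ops specs, manuals) always apply.
  const namespaces: string[] = [NAMESPACES.operator_specific]
  switch (ctx.authority) {
    case 'transport_canada':
      namespaces.push(NAMESPACES.transport_canada)
      break
    case 'faa':
      namespaces.push(NAMESPACES.faa_part_135, NAMESPACES.faa_part_91)
      break
    case 'easa':
      namespaces.push(NAMESPACES.easa_cs23)
      break
  }
  return namespaces
}
```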
The Answer Generation Pipeline
Compliance RAG differs from normal RAG.
Every answer must:
- Cite regulatory provisions
- State applicability conditions
- Flag ambiguity
- Never speculate
- Admit uncertainty
```typescript
const COMPLIANCE_SYSTEM_PROMPT = `
You are a regulatory compliance assistant.

RULES:
1. Cite the regulation sections that support every answer.
2. If the regulations are unclear, say "the regulations do not clearly address this."
3. State the applicability conditions for each requirement.
4. Never speculate beyond the retrieved provisions.
5. Flag any ambiguity in the regulatory language.
6. Advise operators to verify requirements directly with Transport Canada.
`
```
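Wiring that prompt into generation looks roughly like this. The sketch assumes the OpenAI Node SDK and an inline context formatter; the model choice and message layout are illustrative, not necessarily what Navlyt ships:

```typescript
import OpenAI from 'openai'

const openai = new OpenAI()

// Sketch only: model, temperature, and message layout are assumptions.
async function generateComplianceAnswer(
  query: string,
  retrievedChunks: { text: string; metadata: Record<string, unknown> }[]
): Promise<string> {
  // Number the chunks so the model can cite them unambiguously.
  const context = retrievedChunks
    .map((c, i) => `[${i + 1}] ${c.text}`)
    .join('\n\n')

  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    temperature: 0, // deterministic output matters in a compliance domain
    messages: [
      { role: 'system', content: COMPLIANCE_SYSTEM_PROMPT },
      {
        role: 'user',
        content: `Regulatory context:\n${context}\n\nQuestion: ${query}`,
      },
    ],
  })

  return response.choices[0].message.content ?? ''
}
```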
Accuracy Validation
Standard RAG metrics are insufficient for this domain, so I built a regulatory validation framework:
Human expert validation
I worked with a Transport Canada aviation consultant to build a 200-question validation set. Current accuracy against that set: 94.2%.
Confidence scoring
Each answer receives a confidence score based on:
- Retrieval similarity
- Direct relevance
- Regulatory ambiguity
Human review triggers
An answer is automatically routed for human review when any of the following hold (a combined scoring sketch follows this list):
- Confidence below 0.75
- The regulations are unclear
- The answer cites recently amended sections
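A minimal sketch of how those signals can combine. The 0.75 threshold is the real one described above; the weights and input fields are illustrative assumptions:

```typescript
// Illustrative scoring: the 0.75 review threshold is the real one;
// the weights and input fields are assumptions for the sketch.
interface ConfidenceInputs {
  retrievalSimilarity: number // top result similarity, 0..1
  directRelevance: number     // how directly the provisions answer the query, 0..1
  ambiguityPenalty: number    // 0 for clear-cut rules, up to 1 for ambiguous ones
  recentlyAmended: boolean    // any cited section was recently amended
  regulationsUnclear: boolean // model flagged the rules as not clearly addressing this
}

function scoreAnswer(inputs: ConfidenceInputs) {
  const raw =
    0.5 * inputs.retrievalSimilarity +
    0.5 * inputs.directRelevance -
    0.3 * inputs.ambiguityPenalty
  const confidence = Math.max(0, Math.min(1, raw))

  const requiresHumanReview =
    confidence < 0.75 || inputs.regulationsUnclear || inputs.recentlyAmended

  return { confidence, requiresHumanReview }
}
```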
```typescript
interface ComplianceAnswer {
  answer: string
  citations: RegulatoryCitation[] // every regulatory provision cited in the answer
  confidence: number              // 0..1 score used for review routing
  requiresHumanReview: boolean
  applicabilityNote?: string      // set when the answer only applies under conditions
  ambiguityWarning?: string       // set when the regulations are ambiguous
  lastRegUpdateCheck: string      // when the corpus was last checked for amendments
}
```
Key Lessons For Building Compliance RAG
Lesson 1 — Domain experts are mandatory
I could not have built this without aviation compliance experts. Budget for them.
Lesson 2 — Chunk quality matters most
The biggest accuracy gains came from improving chunk quality, not from swapping embedding models.
Lesson 3 — "I don't know" is correct sometimes
A confidently wrong answer is more dangerous than no answer. Build strong non-answer logic.
Lesson 4 — Regulations require maintenance
Regulations change constantly, so corpus updates must be a first-class part of the system (see the sketch below).
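A minimal sketch of that maintenance step, using the is_current flag from the retrieval filter. The update/upsert calls follow the current Pinecone Node SDK; the surrounding amendment workflow is an assumption:

```typescript
// Sketch of version maintenance, assuming an amendment arrives as a list
// of superseded chunk ids plus their replacement chunks.
async function applyAmendment(
  namespace: string,
  supersededChunkIds: string[],
  newChunks: { id: string; values: number[]; metadata: Record<string, any> }[]
) {
  const index = pinecone.index('navlyt-regulations').namespace(namespace)

  // Mark old chunks as no longer current so the is_current filter hides them.
  for (const id of supersededChunkIds) {
    await index.update({ id, metadata: { is_current: false } })
  }

  // Upsert replacement chunks flagged as current.
  await index.upsert(
    newChunks.map(c => ({ ...c, metadata: { ...c.metadata, is_current: true } }))
  )
}
```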
Results
- Accuracy: 94.2% on the 200-question expert validation set
- Latency: 1.8s
- Human review rate: 6.3%
Navlyt is live at navlyt.com
More architecture writing:
tilakraj.info/blog
About the Author
Tilak Raj is CEO & Founder of Brainfy AI.
Building vertical AI SaaS across:
- Agriculture
- Insurance
- Aviation compliance
- Real estate
He has shipped 8 AI products and writes about AI engineering and SaaS architecture.
Dev.to: dev.to/tilakraj
Website: tilakraj.info