VIN decoding sounds straightforward until you realize that every manufacturer encodes vehicle attributes differently, and the authoritative database (NHTSA VPIC) only covers US-market vehicles.
We maintain Corgi, an open-source offline VIN decoder. To handle international vehicles, we built a community contribution system. Here's how it works.
The Problem
A VIN (Vehicle Identification Number) is 17 characters:
- Positions 1-3: WMI (World Manufacturer Identifier)
- Positions 4-8: Vehicle attributes (model, engine, etc.)
- Position 9: Check digit
- Position 10: Model year
- Position 11: Plant code
- Positions 12-17: Serial number
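The position layout above can be sketched as a small helper. This is only an illustration: `VinSections` and `splitVin` are hypothetical names, not part of Corgi's API.

```typescript
// Hypothetical helper: splits a 17-character VIN into the sections listed above.
interface VinSections {
  wmi: string         // positions 1-3
  attributes: string  // positions 4-8
  checkDigit: string  // position 9
  modelYear: string   // position 10
  plant: string       // position 11
  serial: string      // positions 12-17
}

function splitVin(vin: string): VinSections {
  const normalized = vin.trim().toUpperCase()
  // VINs never contain I, O, or Q, so those letters are excluded here.
  if (!/^[A-HJ-NPR-Z0-9]{17}$/.test(normalized)) {
    throw new Error(`Not a 17-character VIN: ${vin}`)
  }
  return {
    wmi: normalized.slice(0, 3),
    attributes: normalized.slice(3, 8),
    checkDigit: normalized.charAt(8),
    modelYear: normalized.charAt(9),
    plant: normalized.charAt(10),
    serial: normalized.slice(11),
  }
}

console.log(splitVin("LRW3E7FA6NC433523"))
```

Note that string indices are zero-based, so position 9 is `charAt(8)`.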
NHTSA's VPIC database maps WMIs to manufacturers and defines how to interpret positions 4-8. But it only covers vehicles sold in the US.
Tesla's Shanghai factory uses WMI LRW. Tesla's Berlin factory uses XP7. Neither exists in VPIC.
Our Solution: YAML Patterns
Contributors create a YAML file:
wmi: LRW
manufacturer: Tesla
make: Tesla
country: China
vehicle_type: Passenger Car
years:
  from: 2020
  to: null
sources:
  - type: service_manual
    description: Tesla Model 3/Y VIN decoder
    url: https://example.com/source
patterns:
  - pattern: "3E****"
    element: Model
    value: Model 3
  - pattern: "YG****"
    element: Model
    value: Model Y
  - pattern: "*D****"
    element: Drive Type
    value: AWD
  - pattern: "*C****"
    element: Drive Type
    value: RWD
test_vins:
  - vin: LRW3E7FA6NC433523
    expected:
      make: Tesla
      model: Model 3
      drive_type: RWD
Validation Pipeline
When a PR is opened, CI runs:
- Schema Validation (Zod)
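import { z } from 'zod'

// VALID_VEHICLE_TYPES, yearsSchema, patternSchema, and testVinSchema are not shown here
const wmiFileSchema = z.object({
  // Three characters from the VIN alphabet (no I, O, or Q)
  wmi: z.string().length(3).regex(/^[A-HJ-NPR-Z0-9]{3}$/),
  manufacturer: z.string().min(1),
  make: z.string().min(1),
  country: z.string().min(1),
  vehicle_type: z.enum(VALID_VEHICLE_TYPES),
  years: yearsSchema,
  // At least one pattern and one test VIN are required
  patterns: z.array(patternSchema).min(1),
  test_vins: z.array(testVinSchema).min(1),
})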
- Check Digit Verification
Every test VIN must have a valid check digit (position 9):
function validateCheckDigit(vin: string): boolean {
  // Position weights; position 9 (the check digit itself) has weight 0
  const weights = [8,7,6,5,4,3,2,10,0,9,8,7,6,5,4,3,2]
  // Letter-to-number table (remaining letters elided)
  const transliteration = { A:1, B:2, ... }
  let sum = 0
  for (let i = 0; i < 17; i++) {
    const value = /\d/.test(vin[i])
      ? parseInt(vin[i], 10)
      : transliteration[vin[i]]
    sum += value * weights[i]
  }
  // A remainder of 10 is written as 'X'
  const expected = sum % 11
  return vin[8] === (expected === 10 ? 'X' : String(expected))
}
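For readers who want to run the check digit calculation themselves, here is a self-contained sketch with the full letter table written out. It is not Corgi's actual implementation: `computeCheckDigit` and `hasValidCheckDigit` are hypothetical names, and rejecting unknown characters by returning `null` is an assumption about how invalid input should be handled.

```typescript
// Letter values for the check digit calculation; I, O, and Q never appear in a VIN.
const TRANSLITERATION: Record<string, number | undefined> = {
  A: 1, B: 2, C: 3, D: 4, E: 5, F: 6, G: 7, H: 8,
  J: 1, K: 2, L: 3, M: 4, N: 5, P: 7, R: 9,
  S: 2, T: 3, U: 4, V: 5, W: 6, X: 7, Y: 8, Z: 9,
}
// One weight per position; position 9 (the check digit) has weight 0.
const WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]

// Returns the expected check digit, or null if the VIN is malformed.
function computeCheckDigit(vin: string): string | null {
  if (vin.length !== 17) return null
  let sum = 0
  for (let i = 0; i < 17; i++) {
    const ch = vin.charAt(i)
    const value = /\d/.test(ch) ? Number(ch) : TRANSLITERATION[ch]
    if (value === undefined) return null // I, O, Q, or a non-alphanumeric character
    sum += value * (WEIGHTS[i] ?? 0)
  }
  const remainder = sum % 11
  return remainder === 10 ? "X" : String(remainder)
}

function hasValidCheckDigit(vin: string): boolean {
  const upper = vin.trim().toUpperCase()
  return computeCheckDigit(upper) === upper.charAt(8)
}

console.log(hasValidCheckDigit("1M8GDM9AXKP042788")) // true
console.log(hasValidCheckDigit("1M8GDM9A1KP042788")) // false
```

The first VIN is a widely used textbook example whose weighted sum is 351, which leaves a remainder of 10 and therefore a check digit of X.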
- Pattern Matching
Test VINs must decode to expected values using the defined patterns.
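A minimal sketch of what this check could look like is below. It rests on two assumptions that the example file suggests but does not state: a pattern is compared against the VIN starting at position 4 with `*` matching any single character, and an element name maps to its `expected` key by lowercasing and replacing spaces with underscores. All function names are hypothetical, and fields that come from the WMI itself (such as `make`) are outside the scope of this sketch.

```typescript
interface Pattern {
  pattern: string
  element: string
  value: string
}

// Assumption: patterns are compared against the VIN starting at position 4,
// and "*" matches any single character.
function matchesPattern(vin: string, pattern: string): boolean {
  const segment = vin.slice(3, 3 + pattern.length)
  if (segment.length !== pattern.length) return false
  for (let i = 0; i < pattern.length; i++) {
    const expected = pattern.charAt(i)
    if (expected !== "*" && expected !== segment.charAt(i)) return false
  }
  return true
}

// Assumption: "Drive Type" corresponds to the expected key "drive_type".
function toExpectedKey(element: string): string {
  return element.trim().toLowerCase().replace(/\s+/g, "_")
}

function decodeWithPatterns(vin: string, patterns: Pattern[]): Record<string, string> {
  const decoded: Record<string, string> = {}
  for (const { pattern, element, value } of patterns) {
    if (matchesPattern(vin, pattern)) decoded[toExpectedKey(element)] = value
  }
  return decoded
}

// Returns one message per expected field that the patterns did not reproduce.
function checkTestVin(
  vin: string,
  expected: Record<string, string>,
  patterns: Pattern[],
): string[] {
  const decoded = decodeWithPatterns(vin, patterns)
  const failures: string[] = []
  for (const [key, want] of Object.entries(expected)) {
    if (decoded[key] !== want) {
      failures.push(`${vin}: expected ${key}=${want}, got ${decoded[key] ?? "nothing"}`)
    }
  }
  return failures
}

const modelPatterns: Pattern[] = [
  { pattern: "3E****", element: "Model", value: "Model 3" },
  { pattern: "YG****", element: "Model", value: "Model Y" },
]
console.log(checkTestVin("LRW3E7FA6NC433523", { model: "Model 3" }, modelPatterns)) // []
```

An empty array means every expected field was reproduced; any non-empty result would fail the CI step.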
Build Time Merge
At release, a script:
- Reads all YAML files from community/wmi/
- Validates each file
- Resolves references (e.g., "Model 3" → Model table ID)
- Inserts into SQLite database
- Compresses database for distribution
// Simplified apply logic
for (const pattern of schema.patterns) {
  const elementId = resolveElement(db, pattern.element)
  const attributeId = resolveAttribute(db, pattern.element, pattern.value)
  db.prepare(`
    INSERT INTO Pattern (VinSchemaId, Keys, ElementId, AttributeId)
    VALUES (?, ?, ?, ?)
  `).run(vinSchemaId, pattern.pattern, elementId, attributeId)
}
Why This Approach?
YAML over SQL:
- Readable diffs in PRs
- Contributors don't need database knowledge
- Easy to validate structure
Test VINs required:
- Catches pattern errors before merge
- Documents expected behavior
- Enables regression testing
Build-time merge:
- Published package includes everything
- No runtime fetching of community data
- Single source of truth
Results
First community patterns: Tesla Shanghai (LRW) and Berlin (XP7) with full trim and drivetrain detection.
npx @cardog/corgi decode LRWYGCEK1PC550123
Make: Tesla
Model: Model Y
Trim: Long Range
Drive: AWD
Country: CHINA
City: SHANGHAI
Contributing
We're looking for patterns for:
- Chinese EVs (BYD, NIO, XPeng)
- European market variants
- JDM vehicles
See the contributing guide: https://github.com/cardog-ai/corgi/blob/master/community/CONTRIBUTING.md