DEV Community

2x lazymac

Korean Public Data as Clean JSON: No More XML Hell

Korean government data is rich (business registration records, real estate transactions, population statistics, court documents), but the official APIs typically serve it as XML with inconsistent field names and EUC-KR encoding. Here's how to access it cleanly as JSON.

The Problem with Korean Open Data

Korea's public data portal (data.go.kr) has over 70,000 datasets. The problem: most are served via REST APIs that return XML with:

  • EUC-KR or CP949 encoding (not UTF-8)
  • Inconsistent field naming (some Korean, some romanized, some mixed)
  • Unpredictable null representations ("", "null", "-", " ")
  • No versioning — endpoints change without notice
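To make the first three problems concrete, here is a minimal Python sketch of the cleanup you would otherwise write yourself: decode a CP949 payload, rename fields, and map the null markers listed above to a real null. The `FIELD_MAP` names and the `xml_to_records` helper are illustrative assumptions, not part of any official API.

```python
import json
import xml.etree.ElementTree as ET

# Null markers from the list above. Values are stripped before the check,
# so a lone space is covered by the empty string.
NULL_MARKERS = {"", "null", "-"}

# Hypothetical mapping from raw tag names to clean JSON keys.
FIELD_MAP = {"bizNm": "name", "rprsntvNm": "ceo"}


def clean_value(text):
    """Return None for any known null marker, otherwise the stripped text."""
    if text is None:
        return None
    stripped = text.strip()
    return None if stripped in NULL_MARKERS else stripped


def xml_to_records(raw, encoding="cp949"):
    """Decode a legacy-encoded XML payload and return a list of clean dicts."""
    root = ET.fromstring(raw.decode(encoding))
    return [
        {FIELD_MAP.get(child.tag, child.tag): clean_value(child.text) for child in item}
        for item in root.iter("item")
    ]


# Simulated response body: CP949 bytes with "-" standing in for a missing value.
sample = "<items><item><bizNm>삼성전자</bizNm><rprsntvNm> - </rprsntvNm></item></items>".encode("cp949")
records = xml_to_records(sample)
print(json.dumps(records, ensure_ascii=False))
```

CP949 is a superset of EUC-KR, so decoding with `cp949` handles both encodings named above.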

What Clean Access Looks Like

// Instead of dealing with this:
// <item><bizNm>삼성전자</bizNm><rprsntvNm>이재용</rprsntvNm></item>

// You get clean, consistent JSON:
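const resp = await fetch('https://api.lazy-mac.com/govdata-korea/business', {
  method: 'GET',
  headers: { 'Authorization': 'Bearer YOUR_KEY' },
});
// Fail early on HTTP errors instead of parsing an error body
if (!resp.ok) {
  throw new Error(`Request failed with status ${resp.status}`);
}

const { businesses } = await resp.json();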
// [{ name: "삼성전자", ceo: "이재용", registration_number: "110111-...", ... }]

Available Data Categories

The GovData Korea API normalizes:

  • Business registry — company name, CEO, registration date, business type
  • Real estate — transaction prices, building specs, zoning
  • Court decisions — case summaries, ruling dates (public records only)
  • Population statistics — demographic breakdowns by region
  • Public transport — bus routes, subway schedules, real-time arrivals

A Real Use Case: Competitor Intelligence

import requests

# Find all software companies registered in Seoul's Gangnam district
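resp = requests.get(
    "https://api.lazy-mac.com/govdata-korea/business",
    headers={"Authorization": "Bearer YOUR_KEY"},  # same auth as the JavaScript example
    params={
        "region": "서울특별시 강남구",
        "industry_code": "J62",  # Software development
        "status": "active",
        "limit": 100,
    },
    timeout=10,
)
resp.raise_for_status()  # fail fast on HTTP errors

companies = resp.json()["results"]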
print(f"Found {len(companies)} active software companies in Gangnam")
for co in companies[:5]:
    print(f"  {co['name']} — founded {co['registration_date']}")

Data Freshness

Government datasets update on different schedules:

  • Business registration: daily
  • Real estate transactions: monthly
  • Population data: quarterly
  • Court decisions: weekly

The API normalizes update timestamps, so you can always tell how fresh the data is.
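One way to use those timestamps is to compare them against the schedule above. This is a rough sketch: the category keys, the day counts used to approximate "monthly" and "quarterly", and the ISO 8601 timestamp format are all assumptions, so check the documentation for the actual field name and format.

```python
from datetime import datetime, timedelta, timezone

# Approximate refresh intervals based on the schedule above (assumed day counts).
REFRESH_INTERVAL = {
    "business": timedelta(days=1),
    "real_estate": timedelta(days=31),
    "population": timedelta(days=92),
    "court": timedelta(days=7),
}


def is_stale(category, updated_at, now=None):
    """True if more than one refresh interval has passed since updated_at.

    updated_at is assumed to be an ISO 8601 string with a UTC offset,
    e.g. "2024-05-01T00:00:00+00:00".
    """
    now = now or datetime.now(timezone.utc)
    last_update = datetime.fromisoformat(updated_at)
    return now - last_update > REFRESH_INTERVAL[category]


# Nine days after the last update: court data is overdue, real estate data is not.
check_time = datetime(2024, 5, 10, tzinfo=timezone.utc)
court_stale = is_stale("court", "2024-05-01T00:00:00+00:00", check_time)
real_estate_stale = is_stale("real_estate", "2024-05-01T00:00:00+00:00", check_time)
print(court_stale, real_estate_stale)
```

A check like this is useful before caching results, since a daily-updated business record and a quarterly population table call for very different cache lifetimes.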

GovData Korea API | Documentation
