Korean NLP for AI Agents: Morpheme Analysis and Sentiment via API

#ai #api #tutorial #korea

Korean text is fundamentally different from English: particles and endings attach directly to stems, so whitespace tokenization alone misses most of the structure. If you're building AI agents that process Korean text, here's how to handle it.

Why Korean NLP Is Hard

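Korean is an agglutinative language. The word "학교에서" ("at school" or "from school") is written as one space-delimited unit, but it contains two morphemes: the noun 학교 ("school") and the particle 에서 ("at/from"). Split on whitespace alone and every noun-plus-particle combination becomes its own token, which fragments the vocabulary and degrades embeddings.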

English NLP pipelines can usually treat space-separated words as tokens. In Korean, spaces separate multi-morpheme units, so that assumption breaks down.
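To make the problem concrete, here is a toy sketch that counts tokens before and after removing trailing particles. The particle list and the `strip_particle` helper are illustrative assumptions, not part of any API. Plain suffix matching like this misfires on real text (a noun that happens to end in 이, for example), which is why a proper morphological analyzer is needed.

```python
from collections import Counter

# Toy list of common particles. Longer particles come first so that
# "에서" is tried before "에". This is an illustration only, not a
# substitute for a dictionary-based morphological analyzer.
PARTICLES = ["에서", "에", "는", "은", "를", "을", "가", "이"]


def strip_particle(eojeol: str) -> str:
    """Remove one trailing particle from a space-delimited unit, if present."""
    for particle in PARTICLES:
        if eojeol.endswith(particle) and len(eojeol) > len(particle):
            return eojeol[: -len(particle)]
    return eojeol


text = "학교에서 공부했다 학교는 크다 학교를 좋아한다"

# Whitespace tokenization sees three unrelated surface forms of "학교".
whitespace_counts = Counter(text.split())

# After stripping particles, the three forms collapse into one stem.
stem_counts = Counter(strip_particle(tok) for tok in text.split())

print(whitespace_counts["학교"])  # 0
print(stem_counts["학교"])        # 3
```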

Morpheme Analysis

import requests

API_KEY = "your-key"

# Analyze Korean text into morphemes
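resp = requests.post(
    "https://api.lazy-mac.com/korean-nlp/morphemes",
    json={"text": "저는 오늘 학교에서 공부했습니다"},  # "I studied at school today"
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,  # avoid hanging forever if the server does not answer
)
resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body

result = resp.json()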
# {
#   "morphemes": [
#     {"token": "저", "pos": "NP", "meaning": "I/me"},
#     {"token": "는", "pos": "JX", "meaning": "topic marker"},
#     {"token": "오늘", "pos": "MAG", "meaning": "today"},
#     {"token": "학교", "pos": "NNG", "meaning": "school"},
#     {"token": "에서", "pos": "JKB", "meaning": "from/at"},
#     {"token": "공부", "pos": "NNG", "meaning": "study"},
#     {"token": "했습니다", "pos": "XSV+EP+EF", "meaning": "did (formal)"}
#   ]
# }
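Once you have the morphemes, a common next step is to keep only the content-bearing ones before indexing or embedding. The sketch below assumes the response shape shown above. The `content_tokens` helper and the choice of tag prefixes (nouns, verbs, adjectives, general adverbs) are illustrative assumptions you would tune for your own use case.

```python
# Sample shaped like the response shown above.
result = {
    "morphemes": [
        {"token": "저", "pos": "NP", "meaning": "I/me"},
        {"token": "는", "pos": "JX", "meaning": "topic marker"},
        {"token": "오늘", "pos": "MAG", "meaning": "today"},
        {"token": "학교", "pos": "NNG", "meaning": "school"},
        {"token": "에서", "pos": "JKB", "meaning": "from/at"},
        {"token": "공부", "pos": "NNG", "meaning": "study"},
        {"token": "했습니다", "pos": "XSV+EP+EF", "meaning": "did (formal)"},
    ]
}

# Tag prefixes treated as "content": NN* nouns, VV verbs, VA adjectives,
# MAG general adverbs. Particles (J*) and endings (E*) are dropped.
CONTENT_PREFIXES = ("NN", "VV", "VA", "MAG")


def content_tokens(result: dict) -> list[str]:
    """Return the tokens whose POS tag starts with a content prefix."""
    return [
        m["token"]
        for m in result.get("morphemes", [])
        if m["pos"].startswith(CONTENT_PREFIXES)
    ]


print(content_tokens(result))  # ['오늘', '학교', '공부']
```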

Sentiment Analysis

Sentiment models trained on English text do not account for Korean formality markers. Korean encodes several politeness levels in its verb endings, and the level used changes the emotional register of the text.

resp = requests.post("https://api.lazy-mac.com/korean-nlp/sentiment",
    json={
        "text": "이 제품 정말 별로네요",  # "This product is really not great"
        "include_formality": True
    },
    headers={"Authorization": f"Bearer {API_KEY}"}
)

sentiment = resp.json()
# {
#   "score": -0.72,
#   "label": "negative",
#   "formality": "polite",
#   "confidence": 0.91
# }
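The sample above is polite in form but negative in content, so an agent should not read a polite ending as a positive signal. Here is a minimal sketch of how an agent might act on the response, assuming the field names shown above. The `triage` function, its thresholds, and its return labels are hypothetical.

```python
def triage(sentiment: dict,
           min_confidence: float = 0.8,
           negative_threshold: float = -0.5) -> str:
    """Map a sentiment response to an action label (hypothetical labels)."""
    # Low-confidence results go to a human instead of being acted on.
    if sentiment["confidence"] < min_confidence:
        return "needs_review"
    # Strongly negative text is escalated regardless of how polite it is.
    if sentiment["label"] == "negative" and sentiment["score"] <= negative_threshold:
        return "escalate"
    return "no_action"


# Sample shaped like the response shown above.
sample = {"score": -0.72, "label": "negative", "formality": "polite", "confidence": 0.91}
print(triage(sample))  # escalate
```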

Keyword Extraction for Korean Content

For SEO and content analysis, extract meaningful Korean keywords:

long_article_text = "..."  # placeholder: the Korean article text to analyze

resp = requests.post("https://api.lazy-mac.com/korean-nlp/keywords",
    json={
        "text": long_article_text,
        "top_n": 10,
        "pos_filter": ["NNG", "NNP"]  # common and proper nouns only
    },
    headers={"Authorization": f"Bearer {API_KEY}"}
)

keywords = resp.json()["keywords"]
# [{"word": "인공지능", "frequency": 12, "tfidf_score": 0.89}, ...]
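For SEO work you usually want a short, ranked list rather than the raw response. The sketch below assumes the entry shape shown above. The `top_keywords` helper, the `min_score` cutoff, and the sample values are made up for illustration.

```python
def top_keywords(keywords: list[dict], min_score: float = 0.5) -> list[str]:
    """Return words at or above min_score, highest score first."""
    ranked = sorted(keywords, key=lambda k: k["tfidf_score"], reverse=True)
    return [k["word"] for k in ranked if k["tfidf_score"] >= min_score]


# Illustrative values only, shaped like the entries shown above.
sample = [
    {"word": "데이터", "frequency": 7, "tfidf_score": 0.41},
    {"word": "인공지능", "frequency": 12, "tfidf_score": 0.89},
    {"word": "모델", "frequency": 9, "tfidf_score": 0.63},
]

print(top_keywords(sample))  # ['인공지능', '모델']
```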

Named Entity Recognition

resp = requests.post("https://api.lazy-mac.com/korean-nlp/ner",
    json={"text": "삼성전자 이재용 회장이 서울 본사에서 발표했습니다"},
    headers={"Authorization": f"Bearer {API_KEY}"}
)

entities = resp.json()["entities"]
# [
#   {"text": "삼성전자", "type": "ORG"},
#   {"text": "이재용", "type": "PERSON"},
#   {"text": "서울", "type": "LOC"}
# ]
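An agent typically wants entities grouped by type so it can fill slots such as "organization" or "person". This is a small sketch assuming the entity shape shown above. The `group_entities` helper is hypothetical.

```python
from collections import defaultdict


def group_entities(entities: list[dict]) -> dict[str, list[str]]:
    """Group entity texts by their type, preserving the original order."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["type"]].append(entity["text"])
    return dict(grouped)


# Sample shaped like the response shown above.
entities = [
    {"text": "삼성전자", "type": "ORG"},
    {"text": "이재용", "type": "PERSON"},
    {"text": "서울", "type": "LOC"},
]

print(group_entities(entities))
# {'ORG': ['삼성전자'], 'PERSON': ['이재용'], 'LOC': ['서울']}
```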