DEV Community

2x lazymac


Korean NLP for AI Agents: Morpheme Analysis and Sentiment via API

Korean text is structurally different from English: particles and endings attach directly to stems, so splitting on whitespace alone leaves grammatical markers fused to content words. If you're building AI agents that process Korean text, here's how to do it properly.

Why Korean NLP Is Hard

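Korean is an agglutinative language. The word "학교에서" means "at school" or "from school", and it is written as a single space-delimited unit (an eojeol) even though it contains two morphemes: the noun 학교 ("school") and the particle 에서 ("at/from"). Treat the whole unit as one token and every noun-plus-particle combination becomes a separate vocabulary entry, which hurts embeddings and search.

English NLP pipelines can usually treat space-separated words as tokens. Korean also uses spaces, but each space-delimited unit typically bundles a stem with particles or endings, so it has to be split further.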
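
To make the problem concrete, here is a small local sketch (no API call) that contrasts whitespace splitting with a morpheme-level split. The `PARTICLES` tuple and the `split_particle` helper are hypothetical and hand-written for this one sentence; a real analyzer relies on a dictionary and context rather than suffix matching.

```python
# Whitespace tokenization keeps each eojeol (space-delimited unit) intact,
# so the noun and its particle stay glued together.
sentence = "저는 오늘 학교에서 공부했습니다"
eojeols = sentence.split()
print(eojeols)  # ['저는', '오늘', '학교에서', '공부했습니다']

# Hypothetical, hand-written particle list for illustration only.
PARTICLES = ("에서", "는")


def split_particle(eojeol: str) -> list[str]:
    """Split one trailing particle off an eojeol, if one matches."""
    for particle in PARTICLES:
        if eojeol.endswith(particle) and len(eojeol) > len(particle):
            return [eojeol[: -len(particle)], particle]
    return [eojeol]


print(split_particle("학교에서"))  # ['학교', '에서']
```

Suffix matching like this breaks quickly on real text (many nouns happen to end in a syllable that is also a particle), which is why the rest of this post hands the job to a morpheme analyzer.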

Morpheme Analysis

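import requests

API_KEY = "your-key"  # replace with your own API key

# Analyze Korean text into morphemes
resp = requests.post(
    "https://api.lazy-mac.com/korean-nlp/morphemes",
    json={"text": "저는 오늘 학교에서 공부했습니다"},  # "I studied at school today"
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,  # avoid hanging forever on a stalled connection
)
resp.raise_for_status()  # fail fast on HTTP errors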

result = resp.json()
# {
#   "morphemes": [
#     {"token": "저", "pos": "NP", "meaning": "I/me"},
#     {"token": "는", "pos": "JX", "meaning": "topic marker"},
#     {"token": "오늘", "pos": "MAG", "meaning": "today"},
#     {"token": "학교", "pos": "NNG", "meaning": "school"},
#     {"token": "에서", "pos": "JKB", "meaning": "from/at"},
#     {"token": "공부", "pos": "NNG", "meaning": "study"},
#     {"token": "했습니다", "pos": "XSV+EP+EF", "meaning": "did (formal)"}
#   ]
# }
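
Once the morphemes come back, an agent usually wants only the content-bearing ones. The sketch below assumes the response has exactly the shape shown in the comment above (a `morphemes` list of objects with `token` and `pos`); the `content_tokens` helper and its default tag prefixes are illustrative choices, not part of the API.

```python
def content_tokens(
    result: dict,
    keep_prefixes: tuple[str, ...] = ("NN", "VV", "VA"),
) -> list[str]:
    """Return tokens whose POS tag starts with one of keep_prefixes.

    Assumes the response shape shown in the article; the default prefixes
    (nouns, verbs, adjectives) are an illustrative choice.
    """
    return [
        m["token"]
        for m in result.get("morphemes", [])
        if m.get("pos", "").startswith(keep_prefixes)
    ]


# Sample taken from the response shown in the article (meanings omitted).
sample = {
    "morphemes": [
        {"token": "저", "pos": "NP"},
        {"token": "는", "pos": "JX"},
        {"token": "오늘", "pos": "MAG"},
        {"token": "학교", "pos": "NNG"},
        {"token": "에서", "pos": "JKB"},
        {"token": "공부", "pos": "NNG"},
        {"token": "했습니다", "pos": "XSV+EP+EF"},
    ]
}
print(content_tokens(sample))  # ['학교', '공부']
```

Particles (`JX`, `JKB`) and endings drop out, leaving the nouns that carry the topic of the sentence.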

Sentiment Analysis

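Sentiment models trained on English have no notion of Korean formality markers. Korean has several politeness levels, and the level a writer chooses shifts the emotional register of the text, so a complaint can be phrased politely and still be clearly negative.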

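# Reuses requests and API_KEY from the first example
resp = requests.post(
    "https://api.lazy-mac.com/korean-nlp/sentiment",
    json={
        "text": "이 제품 정말 별로네요",  # "This product is really not great"
        "include_formality": True,
    },
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
resp.raise_for_status()  # fail fast on HTTP errors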

sentiment = resp.json()
# {
#   "score": -0.72,
#   "label": "negative",
#   "formality": "polite",
#   "confidence": 0.91
# }
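
An agent typically turns this response into a decision. The following is a minimal sketch of one way to do that, assuming the `label` and `confidence` fields shown above; the `triage` function, its return values, and the 0.8 threshold are hypothetical and should be tuned for your own workflow.

```python
def triage(sentiment: dict, min_confidence: float = 0.8) -> str:
    """Map a sentiment response to a hypothetical agent action."""
    # Low-confidence predictions go to a human instead of being acted on.
    if sentiment.get("confidence", 0.0) < min_confidence:
        return "needs_review"
    # Polite wording does not soften the decision: only the label matters here.
    if sentiment.get("label") == "negative":
        return "escalate"
    return "no_action"


# Sample taken from the response shown in the article.
sample = {"score": -0.72, "label": "negative", "formality": "polite", "confidence": 0.91}
print(triage(sample))  # escalate
```

Keeping the formality field out of the decision is deliberate in this sketch: the example review is polite and still a complaint.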

Keyword Extraction for Korean Content

For SEO and content analysis, extract meaningful Korean keywords:

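# Reuses requests and API_KEY from the first example
long_article_text = "..."  # placeholder: substitute the Korean article to analyze

resp = requests.post(
    "https://api.lazy-mac.com/korean-nlp/keywords",
    json={
        "text": long_article_text,
        "top_n": 10,
        "pos_filter": ["NNG", "NNP"],  # common and proper nouns only
    },
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
resp.raise_for_status()  # fail fast on HTTP errors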

keywords = resp.json()["keywords"]
# [{"word": "인공지능", "frequency": 12, "tfidf_score": 0.89}, ...]
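
To see what a POS filter contributes, here is a local sketch that counts only noun morphemes from an already-analyzed text. It is not how the endpoint computes its scores (the article does not say, and this sketch does no TF-IDF weighting); the `top_nouns` helper and the sample morphemes are made up for illustration.

```python
from collections import Counter


def top_nouns(
    morphemes: list[dict],
    pos_filter: tuple[str, ...] = ("NNG", "NNP"),
    top_n: int = 10,
) -> list[tuple[str, int]]:
    """Count tokens whose POS tag is in pos_filter, most frequent first."""
    counts = Counter(m["token"] for m in morphemes if m["pos"] in pos_filter)
    return counts.most_common(top_n)


# Made-up morpheme list in the same shape as the morpheme endpoint's output.
sample = [
    {"token": "인공지능", "pos": "NNG"},
    {"token": "은", "pos": "JX"},
    {"token": "서울", "pos": "NNP"},
    {"token": "에서", "pos": "JKB"},
    {"token": "인공지능", "pos": "NNG"},
]
print(top_nouns(sample))  # [('인공지능', 2), ('서울', 1)]
```

Without the filter, particles such as 은 and 에서 would dominate any frequency ranking, since they appear in nearly every sentence.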

Named Entity Recognition

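# Reuses requests and API_KEY from the first example
resp = requests.post(
    "https://api.lazy-mac.com/korean-nlp/ner",
    # "Samsung Electronics Chairman Lee Jae-yong made an announcement at the Seoul headquarters"
    json={"text": "삼성전자 이재용 회장이 서울 본사에서 발표했습니다"},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
resp.raise_for_status()  # fail fast on HTTP errors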

entities = resp.json()["entities"]
# [
#   {"text": "삼성전자", "type": "ORG"},
#   {"text": "이재용", "type": "PERSON"},
#   {"text": "서울", "type": "LOC"}
# ]
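
Downstream code often wants entities grouped by type rather than as a flat list. This is a small sketch under the assumption that each entity has the `text` and `type` fields shown above; `group_entities` is a hypothetical helper, not something the API provides.

```python
from collections import defaultdict


def group_entities(entities: list[dict]) -> dict[str, list[str]]:
    """Group entity texts by their type, preserving the original order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for entity in entities:
        grouped[entity["type"]].append(entity["text"])
    return dict(grouped)


# Sample taken from the response shown in the article.
sample = [
    {"text": "삼성전자", "type": "ORG"},
    {"text": "이재용", "type": "PERSON"},
    {"text": "서울", "type": "LOC"},
]
print(group_entities(sample))
# {'ORG': ['삼성전자'], 'PERSON': ['이재용'], 'LOC': ['서울']}
```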

Korean NLP API | Documentation
