WanjohiChristopher

๐—ฉ๐—ผ๐—ถ๐—ฐ๐—ฒ ๐—”๐—œ: ๐—ก๐—Ÿ๐—š - ๐—ง๐˜‚๐—ฟ๐—ป๐—ถ๐—ป๐—ด ๐——๐—ฒ๐—ฐ๐—ถ๐˜€๐—ถ๐—ผ๐—ป๐˜€ ๐—œ๐—ป๐˜๐—ผ ๐—ช๐—ผ๐—ฟ๐—ฑ๐˜€

Voice AI listens (ASR), understands (NLU), and decides (Dialog Management).

But decisions aren't responses.
The system knows:
▶️ Action: inform
▶️ Flight: booked
▶️ Destination: Paris
▶️ Date: Dec 20
▶️ Confirmation: AB123

That's not what we say to a user.

This is where NLG (Natural Language Generation) comes in.

NLG transforms structured data into natural-sounding speech. For example:
🤖 "Great news! Your flight to Paris on December 20th is confirmed. Your confirmation number is AB123. Have a wonderful trip!"
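
Here is a minimal sketch of how that transformation could look with a plain template. The dictionary layout, the `TEMPLATES` table, and the `render` and `ordinal` helpers are all hypothetical names chosen for illustration, not part of any specific voice framework.

```python
# Hypothetical dialog-manager output; the field names are illustrative.
dialog_act = {
    "action": "inform",
    "flight": "booked",
    "destination": "Paris",
    "month": 12,
    "day": 20,
    "confirmation": "AB123",
}

MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def ordinal(day: int) -> str:
    """Turn 20 into '20th', 1 into '1st', and so on, for spoken dates."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


# One template per (action, flight status) pair; slots are filled at render time.
TEMPLATES = {
    ("inform", "booked"): (
        "Great news! Your flight to {destination} on {spoken_date} is confirmed. "
        "Your confirmation number is {confirmation}. Have a wonderful trip!"
    ),
}


def render(act: dict) -> str:
    """Pick the template for this dialog act and fill in its slots."""
    template = TEMPLATES[(act["action"], act["flight"])]
    spoken_date = f"{MONTHS[act['month'] - 1]} {ordinal(act['day'])}"
    return template.format(
        destination=act["destination"],
        spoken_date=spoken_date,
        confirmation=act["confirmation"],
    )


print(render(dialog_act))
```

The template is fully predictable, which is its strength, but every new phrasing has to be written by hand.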

๐—ง๐—ต๐—ฒ ๐—ก๐—Ÿ๐—š ๐—ฃ๐—ถ๐—ฝ๐—ฒ๐—น๐—ถ๐—ป๐—ฒ:
1๏ธโƒฃ ๐—–๐—ผ๐—ป๐˜๐—ฒ๐—ป๐˜ ๐—ฃ๐—น๐—ฎ๐—ป๐—ป๐—ถ๐—ป๐—ด
๐Ÿ”น"What information to convey?"
๐Ÿ”นSelect facts, order them, prioritize.
2๏ธโƒฃ ๐—ฆ๐—ฒ๐—ป๐˜๐—ฒ๐—ป๐—ฐ๐—ฒ ๐—ฃ๐—น๐—ฎ๐—ป๐—ป๐—ถ๐—ป๐—ด
๐Ÿ”น"How to structure it?"
๐Ÿ”นOne sentence or multiple?
๐Ÿ”นCombine facts?
3๏ธโƒฃ ๐—ฆ๐˜‚๐—ฟ๐—ณ๐—ฎ๐—ฐ๐—ฒ ๐—ฅ๐—ฒ๐—ฎ๐—น๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป
๐Ÿ”น"What exact words to use?" .
๐Ÿ”นGrammar, vocabulary, tone, fluency.
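
The three stages above can be sketched as three small functions. This is a toy illustration under stated assumptions: the `Fact` class, the priority table, and the rule that destination and date share a sentence are all invented here to show the division of labor, not taken from a real NLG library.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    slot: str
    value: str
    priority: int  # lower numbers are spoken first


# Assumed ordering of what matters most to the user.
PRIORITIES = {"destination": 0, "date": 1, "confirmation": 2}


def plan_content(act: dict) -> list:
    """Stage 1: select the facts worth saying and order them."""
    facts = [Fact(s, str(act[s]), p) for s, p in PRIORITIES.items() if s in act]
    return sorted(facts, key=lambda f: f.priority)


def plan_sentences(facts: list) -> list:
    """Stage 2: decide which facts share a sentence."""
    trip = [f for f in facts if f.slot in ("destination", "date")]
    rest = [[f] for f in facts if f.slot not in ("destination", "date")]
    return ([trip] if trip else []) + rest


def realize(groups: list) -> str:
    """Stage 3: choose the exact words for each planned sentence."""
    sentences = []
    for group in groups:
        v = {f.slot: f.value for f in group}
        if "destination" in v and "date" in v:
            sentences.append(f"Your flight to {v['destination']} on {v['date']} is confirmed.")
        elif "destination" in v:
            sentences.append(f"Your flight to {v['destination']} is confirmed.")
        elif "date" in v:
            sentences.append(f"Your flight on {v['date']} is confirmed.")
        elif "confirmation" in v:
            sentences.append(f"Your confirmation number is {v['confirmation']}.")
    return " ".join(sentences)


def generate(act: dict) -> str:
    return realize(plan_sentences(plan_content(act)))


print(generate({
    "action": "inform",
    "destination": "Paris",
    "date": "December 20th",
    "confirmation": "AB123",
}))
```

Keeping the stages separate means you can change the wording in `realize` without touching which facts get selected or how they are grouped.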

๐—ง๐—ต๐—ฒ ๐—ฒ๐˜ƒ๐—ผ๐—น๐˜‚๐˜๐—ถ๐—ผ๐—ป:
๐Ÿ”นTemplates โ†’ slot-filling.
๐Ÿ”นStatistical โ†’ n-grams, HMMs.
๐Ÿ”นNeural โ†’ Seq2Seq, Transformers.
๐Ÿ”นLLMs โ†’ GPT, Claude (SOTA) .
Below are ๐—ฟ๐—ฒ๐—ฐ๐—ผ๐—บ๐—บ๐—ฒ๐—ป๐—ฑ๐—ฎ๐˜๐—ถ๐—ผ๐—ปs based on use case:
๐Ÿ”นNeed predictability โ†’ Templates.
๐Ÿ”นNeed natural variety โ†’ LLM.
๐Ÿ”นNeed both โ†’ Hybrid (LLM + guardrails).
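
One possible shape for the hybrid option, as a rough sketch: let the LLM phrase the reply, check that every required fact survived verbatim, and fall back to a template if it did not. The `llm` argument is a placeholder for whatever model call you use (stubbed here with plain functions), and the substring check is just one simple guardrail among many you might want.

```python
from typing import Callable

# Facts that must appear word for word in anything we say to the user.
REQUIRED_SLOTS = ("destination", "date", "confirmation")


def template_response(act: dict) -> str:
    """Predictable fallback: plain slot-filling."""
    return (
        f"Your flight to {act['destination']} on {act['date']} is confirmed. "
        f"Your confirmation number is {act['confirmation']}."
    )


def passes_guardrails(text: str, act: dict) -> bool:
    """Accept the LLM text only if every required value appears verbatim."""
    return all(str(act[slot]) in text for slot in REQUIRED_SLOTS)


def respond(act: dict, llm: Callable[[str], str]) -> str:
    """Try the LLM for natural variety; fall back to the template on any doubt."""
    prompt = f"Tell the user about this booking in one friendly reply: {act}"
    try:
        candidate = llm(prompt)
    except Exception:
        return template_response(act)
    return candidate if passes_guardrails(candidate, act) else template_response(act)


# Stub "models" standing in for a real LLM call.
def good_llm(prompt: str) -> str:
    return "Great news! You're off to Paris on December 20th. Your confirmation number is AB123."


def bad_llm(prompt: str) -> str:
    return "You're all set for Rome!"  # wrong destination, should be rejected


act = {"destination": "Paris", "date": "December 20th", "confirmation": "AB123"}
print(respond(act, good_llm))
print(respond(act, bad_llm))
```

The first call keeps the LLM's phrasing; the second falls back to the template because the destination is wrong and the other facts are missing.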

The difference between a robotic assistant and a delightful one? NLG.
