Takara Taniguchi

ANYTEXT: MULTILINGUAL VISUAL TEXT GENERATION AND EDITING

Yuxiang Tuo, Alibaba Group

Abstract
A diffusion-model-based text-to-image method.
The proposed method can render multilingual text in images, including Japanese, English, and Chinese.
Two tasks:
Text generation
Text inpainting (editing)

Related works
Text generation
Perceptual supervision
OCR-VQGAN
TextDiffuser

Methodology
Text perceptual loss is formulated as:

[Equation image: text perceptual loss]

where m_p is the feature map from PP-OCRv3.
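
The equation image is not reproduced in these notes. As a rough sketch of the form a loss like this usually takes, assuming a mean squared error between OCR feature maps of the original and predicted text regions, it could be written as below. The hats, the prime, the position index p, the feature-map size h × w, and the timestep weight φ(t) are my assumptions and should be checked against the paper.

```latex
\mathcal{L}_{tp} = \sum_{p} \frac{\varphi(t)}{hw} \sum_{h,w} \left\lVert \hat{m}_p - \hat{m}'_p \right\rVert_2^2
```

Here \hat{m}_p would be the PP-OCRv3 feature map of the p-th text region in the original image and \hat{m}'_p the one from the predicted image.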

The key point is that PP-OCRv3 is kept frozen.
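
To make the idea concrete, here is a minimal toy sketch of a perceptual loss computed through a frozen feature extractor. It is not the AnyText implementation: `extract_features` is a hypothetical stand-in for the PP-OCRv3 encoder (a fixed horizontal-difference filter), and the constant `weight` replaces whatever timestep-dependent weighting the paper uses.

```python
from typing import List, Sequence

Patch = List[List[float]]  # one cropped text region, as an h x w grid of pixel values

# Fixed filter weights. "Frozen" means training never updates these values;
# only the image generator being trained would change.
KERNEL = (-1.0, 1.0)


def extract_features(patch: Patch) -> Patch:
    """Hypothetical stand-in for a frozen OCR encoder: horizontal differences."""
    return [
        [KERNEL[0] * row[j] + KERNEL[1] * row[j + 1] for j in range(len(row) - 1)]
        for row in patch
    ]


def text_perceptual_loss(
    real_patches: Sequence[Patch],
    pred_patches: Sequence[Patch],
    weight: float = 1.0,  # assumed constant; a placeholder for a timestep weight
) -> float:
    """Sum over text regions of the mean squared feature-map difference."""
    total = 0.0
    for real, pred in zip(real_patches, pred_patches):
        m_real = extract_features(real)
        m_pred = extract_features(pred)
        squared = [
            (a - b) ** 2
            for row_real, row_pred in zip(m_real, m_pred)
            for a, b in zip(row_real, row_pred)
        ]
        total += weight * sum(squared) / len(squared)
    return total


real = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]   # a patch containing a vertical stroke
blank = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # a prediction with the stroke missing
print(text_perceptual_loss([real], [real]))   # identical features, zero loss
print(text_perceptual_loss([real], [blank]))  # missing stroke is penalized
```

Because the extractor is fixed, the loss can only go down by making the generated text look more like the target in feature space, which is presumably why freezing the OCR model matters.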

Benchmark
Sentence accuracy (Sen. ACC)
NED (normalized edit distance)
FID (Fréchet Inception Distance)
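
The two OCR-based metrics can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the benchmark's evaluation script: the function names are hypothetical, sentence accuracy is taken to be exact match per text line, and NED is reported as a similarity (1 minus the Levenshtein distance divided by the longer string length), so higher is better for both.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ned_similarity(pred: str, target: str) -> float:
    """1 - normalized edit distance; 1.0 means the strings are identical."""
    longest = max(len(pred), len(target))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(pred, target) / longest


def sentence_accuracy(preds: Sequence[str], targets: Sequence[str]) -> float:
    """Fraction of text lines that the OCR model reads back exactly."""
    if not targets:
        return 0.0
    hits = sum(p == t for p, t in zip(preds, targets))
    return hits / len(targets)


ocr_outputs = ["hello", "wor1d"]   # what an OCR model reads from generated images
ground_truth = ["hello", "world"]  # the text that was requested
print(sentence_accuracy(ocr_outputs, ground_truth))
print([ned_similarity(p, t) for p, t in zip(ocr_outputs, ground_truth)])
```

Sentence accuracy is strict (one wrong character fails the whole line), while NED gives partial credit, so the two are complementary.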

OCR seems to be an important module.

Conclusion
This paper presents a model that can generate text in images in a variety of languages, including Japanese and Chinese.

Thoughts
I found this really interesting.
This is exactly the kind of paper I was hoping for.
