DEV Community

lukaszkuczynski

Learning Chinese with Python

Intro

I have been learning Chinese since 2016. Progress is slow, but from time to time I am given a task to prepare and need to cover some material in that language. The text is then written in my language (Polish) or English, and in Chinese characters. I can recognize barely 20 to 50 of them, so Pinyin is a must. And because I use Python on a daily basis, I thought: let's see how I can use it to automate my work.

Python is great

It has tools for everything. It took me only minutes to find nice libraries for segmentation, that is, grouping characters into words (the leader is jieba), and for Pinyin transliteration (xpinyin is one of many examples). This is how easily it can be done in Python:

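import jieba
from xpinyin import Pinyin

# "I want to speak better Chinese, but it is hard, because I am Polish"
sentence = "我想说更好的中文,但很难,因为我是波兰人"
print(sentence)

# Segment the sentence into words and join them with spaces
segments = jieba.cut(sentence)
output = " ".join(segments)
print(output)

# Transliterate to Pinyin with tone marks; the empty splitter keeps the
# syllables of each word together, and the spaces between words are preserved
p = Pinyin()
pinyined = p.get_pinyin(output, splitter='', show_tone_marks=True)
print(pinyined)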

It will produce output:

我想说更好的中文,但很难,因为我是波兰人
Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.899 seconds.
Prefix dict has been built succesfully.
我 想 说 更好 的 中文 , 但 很 难 , 因为 我 是 波兰人
wǒ xiǎng shuō gènghǎo de zhōngwén , dàn hěn nán , yīnwèi wǒ shì bōlánrén

Cool, isn't it?
Now I can use it for document processing, which makes my work much faster.
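
As a rough idea of what that document processing could look like, here is a minimal sketch that walks through a text line by line and puts the Pinyin under each line. The function names `annotate_lines` and `make_default_steps` are made up for this illustration and are not part of jieba or xpinyin. The `get_pinyin` call is the same one used in the snippet above, so whether it is accepted depends on the installed xpinyin version.

```python
from typing import Callable, Iterable, List


def annotate_lines(
    lines: Iterable[str],
    segment: Callable[[str], Iterable[str]],
    to_pinyin: Callable[[str], str],
) -> List[str]:
    """Return each non-empty line followed by its transliteration.

    `segment` splits a line into words and `to_pinyin` transliterates a
    string; both are passed in, so any libraries can be plugged in.
    """
    annotated = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        # Join the words with spaces so word boundaries survive transliteration.
        spaced = " ".join(segment(text))
        annotated.append(text)
        annotated.append(to_pinyin(spaced))
    return annotated


def make_default_steps():
    """Build the two steps from jieba and xpinyin (hypothetical helper).

    The imports are done lazily, so annotate_lines can also be used
    without these libraries installed.
    """
    import jieba
    from xpinyin import Pinyin

    p = Pinyin()
    # Same call as in the snippet above; assumes an xpinyin version that
    # accepts show_tone_marks.
    return jieba.cut, lambda s: p.get_pinyin(s, splitter="", show_tone_marks=True)
```

With both libraries installed, something like `annotate_lines(open("lesson.txt", encoding="utf-8"), *make_default_steps())` would return the lines of a file (`lesson.txt` is just a placeholder name) interleaved with their Pinyin. Passing the two steps in as arguments also makes it easy to swap in a different segmenter or transliteration library later.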

Top comments (10)

B.M. Lee

Hi, this is BM Lee, the founder of Neehow (nee.how). I hope Neehow can help more people learn Chinese, no matter where they are.

Neehow is an educational platform for learning Mandarin Chinese online. It uses artificial intelligence technology and speech recognition algorithms to automatically score users, analyzing from four dimensions: pronunciation, intonation, accuracy, and completeness. Because there is no manual intervention, users can learn anytime, anywhere, without any restrictions.

Neehow 是一个线上学习中文(普通话)的教育平台。它使用人工智能技术和语音识别算法来自动给用户评分,从发音、语调、准确度和完整度四个维度分析。因为没有任何人工干预,用户无论在何时何地都可以学习,没有任何限制。

SamKnowsCoding

nee.how not reachable

Creyin • Edited

Hi, I've also found it difficult but interesting to learn a new language while working on projects. For Chinese, especially business Chinese (chineseonlinecourses.com/adults/bu...), I found it easier to take lessons, whereas other languages I could study from books, because Chinese is very different from them. And it's unlikely you'll need to learn everyday conversational Chinese, unless you wish to learn more, of course. How about asking a 1-on-1 teacher to cover certain areas or topics in business Chinese?

dc

Is xpinyin better than the basic "pinyin" library?
libraries.io/pypi/pinyin

I'd be keen to know what other libraries you've found for working with Chinese in Python. Is there a good wrapper for the cedict?

I'm also looking for a way to grade the difficulty level of Hanzi. Do you know of a tool with word frequencies already prepared?

By the way, I've been working on this site for studying Chinese grammar:
lex.chat/study

Thanks!

rhymsy

Love it!

Jess Lee

That's really neat!

ZeonWang

加油 (Keep it up)! I have been learning English for more than ten years, and I still cannot write a correct sentence.

lukaszkuczynski

谢谢! (Thank you!)

pingsoli

厉害了,我的哥,加油 ... (Awesome, bro, keep it up ...)

Owen

很强! (Very impressive!)