String split in Python

#python #beginners

This article covers two ways to parse text in Python. Given a string like

>>> line = 'aaa bbb ccc'

we want to break it into substrings and build new strings from its parts.

Slice string

The first method is to slice the string piece by piece: work out the offsets of the part you want and extract it with line[start:end], where start is included and end is excluded. Example:

>>> line = 'aaa bbb ccc'
>>> col1 = line[0:3]
>>> col3 = line[8:]
>>> col1
'aaa'
>>> col3
'ccc'
>>>
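
As a small sketch of the same idea, the offsets can be given names with Python's built-in slice objects, which keeps the column layout in one place. The names COL1, COL2 and COL3 are made up for this illustration, and the layout assumes every field is exactly three characters wide.

```python
line = 'aaa bbb ccc'

# Hypothetical fixed-width layout: one named slice per column.
COL1 = slice(0, 3)      # characters 0..2
COL2 = slice(4, 7)      # characters 4..6
COL3 = slice(8, None)   # character 8 to the end of the string

col1, col2, col3 = line[COL1], line[COL2], line[COL3]
print(col1, col2, col3)
```

If any field changes width, every offset after it has to be recalculated by hand.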

However, this is impractical for long strings or when field widths vary, because every offset has to be worked out by hand. Many developers use the .split() method instead.

Split function

The split() method turns a string into a list of strings. By default it splits on whitespace, so every word in a sentence becomes a list item.

>>> line = 'aaa bbb ccc'
>>> a = line.split()
>>> a
['aaa', 'bbb', 'ccc']
>>> a[0]
'aaa'
>>> a[1]
'bbb'
>>> a[2]
'ccc'
>>>
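
The difference between calling split() with no argument and passing a single space matters as soon as the input has irregular whitespace. The following sketch uses a made-up string, messy, to show both behaviours, plus the optional maxsplit argument that limits the number of splits.

```python
messy = '  aaa   bbb\tccc\n'

# No argument: split on any run of whitespace and drop empty strings.
by_whitespace = messy.split()
print(by_whitespace)   # ['aaa', 'bbb', 'ccc']

# Explicit ' ': split on every single space, so empty strings appear
# and the tab and newline are left inside the last item.
by_space = messy.split(' ')
print(by_space)        # ['', '', 'aaa', '', '', 'bbb\tccc\n']

# maxsplit=1: split once and keep the rest of the string together.
first_only = 'aaa bbb ccc'.split(maxsplit=1)
print(first_only)      # ['aaa', 'bbb ccc']
```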

You can also split on a specific character by passing it to split(). This can be a comma, a dash, a semicolon, or even a period (to split text into sentences).

>>> line = 'aaa,bbb,ccc'
>>> a = line.split(',')
>>> a
['aaa', 'bbb', 'ccc']
>>>
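
When the separator is surrounded by spaces, split(',') keeps those spaces in the resulting items. One common way to handle this, sketched here with a made-up input string, is to call strip() on each piece after splitting.

```python
line = 'aaa, bbb , ccc'

# Splitting on ',' alone leaves the surrounding spaces in place.
raw = line.split(',')
print(raw)      # ['aaa', ' bbb ', ' ccc']

# strip() removes leading and trailing whitespace from each piece.
fields = [part.strip() for part in raw]
print(fields)   # ['aaa', 'bbb', 'ccc']
```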

Top comments (1)

Kamonwan Achjanis • Aug 4 '23

Just don't use split to break text into words. It will not work well with punctuation or Asian languages like Chinese or Japanese. In JavaScript there is a special object for this use case:
dev.to/kamonwan/the-right-way-to-b...

Perhaps, in Python there is something similar?
