What’s the easiest way you know of to tokenize an arithmetic expression in JavaScript? Let’s say you’re building a calculator application, and you want this to happen:
console.log(
tokenize('100-(5.4 + 2/3)*5')
)
// ['100', '-', '(', '5.4', '+', '2', '/', '3', ')', '*', '5']
Before you reach into your npm module bag-o-tricks, realize that this can be done in one chained expression of JavaScript using a little-known feature of the string split method. Behold:
'100-(5.4+2/3)*5'
.split(/(-|\+|\/|\*|\(|\))/)
.map(s => s.trim())
.filter(s => s !== '')
// ['100', '-', '(', '5.4', '+', '2', '/', '3', ')', '*', '5']
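The opening example calls a tokenize function that is never actually defined, so here is a minimal sketch of one way to wrap the chain above. The function name and its expression parameter are my own assumptions, not part of any library:

```javascript
// Hypothetical helper: wraps the split/map/filter chain in a reusable function.
function tokenize(expression) {
  return expression
    .split(/(-|\+|\/|\*|\(|\))/) // split on operators and parens, keeping them
    .map(s => s.trim())          // strip whitespace around each piece
    .filter(s => s !== '');      // drop empty strings left between adjacent separators
}

console.log(tokenize('100-(5.4 + 2/3)*5'));
// ['100', '-', '(', '5.4', '+', '2', '/', '3', ')', '*', '5']
```

Note that the division sign is a separator like any other, so 2/3 comes out as three tokens.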
Excuse me? What’s that hot mess inside the split function? Let’s break it down step by step using a few examples of increasing complexity:
Example 1: s.split(/-/)
Pretty obvious: this splits the string s anywhere it sees the minus sign symbol -. The separators themselves are thrown away.
'3-2-1'.split(/-/)
// ["3", "2", "1"]
Example 2: s.split(/(-)/)
The only difference from the previous example is the enclosing parens in the regex, which create a capturing group. Here’s the key point of the entire article: if the regular expression contains capturing parentheses around the separator, then each time the separator is matched, the results of the capturing group are spliced into the output array.
'3-2-1'.split(/(-)/)
// ["3", "-", "2", "-", "1"]
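To see that it really is the capturing that matters, and not just the presence of parentheses, here is a small comparison sketch. The variable names are only for illustration; a non-capturing group (?:...) behaves like no group at all:

```javascript
const s = '3-2-1';

const plain = s.split(/-/);            // no group: separators are dropped
const capturing = s.split(/(-)/);      // capturing group: separators are kept
const nonCapturing = s.split(/(?:-)/); // non-capturing group: separators are dropped

console.log(plain);        // ['3', '2', '1']
console.log(capturing);    // ['3', '-', '2', '-', '1']
console.log(nonCapturing); // ['3', '2', '1']
```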
Example 3: s.split(/(-|\+)/)
This builds off the previous example by adding support for the addition symbol \+. The backslash \ is required to escape the +, which would otherwise be read as a regex quantifier ("one or more"). The vertical pipe | acts as an OR (match - OR +).
'3-2-1+2+3'.split(/(-|\+)/)
// ["3", "-", "2", "-", "1", "+", "2", "+", "3"]
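As an aside, the same split can be written with a character class instead of alternation. This is just an alternative sketch, not what the rest of the article uses; inside square brackets the + needs no escaping, and a - placed first is taken literally:

```javascript
const input = '3-2-1+2+3';

const alternation = input.split(/(-|\+)/); // the form used in this article
const charClass = input.split(/([-+])/);   // equivalent character-class form

console.log(alternation); // ['3', '-', '2', '-', '1', '+', '2', '+', '3']
console.log(charClass);   // ['3', '-', '2', '-', '1', '+', '2', '+', '3']
```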
The Final Boss (tying everything together)
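Here is a rough sketch of what the full pipeline does at each stage, using a shorter input of my own choosing. The variable names are made up for illustration. It shows why the trim and filter steps are there: split leaves whitespace on the pieces and emits an empty string wherever two separators touch or the string starts with one.

```javascript
// Each alternative matches one operator or parenthesis; the outer parens
// are the capturing group that keeps the matched separator in the output.
const separators = /(-|\+|\/|\*|\(|\))/;

const raw = '(1 + 2)*3'.split(separators);
console.log(raw);
// ['', '(', '1 ', '+', ' 2', ')', '', '*', '3']

const tokens = raw
  .map(s => s.trim())      // '1 ' becomes '1', ' 2' becomes '2'
  .filter(s => s !== '');  // remove the empty strings
console.log(tokens);
// ['(', '1', '+', '2', ')', '*', '3']
```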
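Hopefully, you now have all the tools needed to understand .split(/(-|\+|\/|\*|\(|\))/). It is the same pattern as Example 3, extended with \/, \*, \( and \), each escaped because /, *, ( and ) have a special meaning inside a regex literal. The trailing .map(s => s.trim()) strips stray spaces, and .filter(s => s !== '') drops the empty strings that split produces when two separators sit next to each other. Hope that made sense! Let me know in the comments if you liked this article, or ping me on Twitter!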