What’s the easiest way you know of to tokenize an arithmetic expression in JavaScript? Let’s say you’re building a calculator application and want this to happen:
console.log(
 tokenize('100-(5.4 + 2/3)*5')
)
// ['100', '-', '(', '5.4', '+', '2', '/', '3', ')', '*', '5']
Before you reach into your npm module bag-o-tricks, realize that this can be done in a single JavaScript expression using a little-known feature of the string split method. Behold:
'100-(5.4 + 2/3)*5'
  .split(/(-|\+|\/|\*|\(|\))/)
  .map(s => s.trim())
  .filter(s => s !== '')
// ['100', '-', '(', '5.4', '+', '2', '/', '3', ')', '*', '5']
Excuse me? What’s that hot mess inside the split function? Let’s break it down step by step using a few examples of increasing complexity:
  
  
  Example 1: s.split(/-/)
Pretty obvious: this splits the string s at every minus sign (-) and throws the minus signs away.
'3-2-1'.split(/-/)
// ["3", "2", "1"]
  
  
  Example 2: s.split(/(-)/)
The only difference from the previous example is the pair of parentheses around the separator in the regex, which create a capturing group. Here’s the key point of the entire article: if the regular expression wraps the separator in capturing parentheses, then each time the separator is matched, the text captured by the group is spliced into the output array.
'3-2-1'.split(/(-)/)
// ["3", "-", "2", "-", "1"]
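To make it clear that the capturing group is what keeps the separators, here is a small sketch comparing it with a non-capturing group, written (?:...). The variable names are only for illustration.

```javascript
// Capturing group: each matched separator is spliced into the result.
const withCapture = '3-2-1'.split(/(-)/);
console.log(withCapture);
// ["3", "-", "2", "-", "1"]

// Non-capturing group (?:...): groups the pattern but captures nothing,
// so the separators are dropped, just like in Example 1.
const withoutCapture = '3-2-1'.split(/(?:-)/);
console.log(withoutCapture);
// ["3", "2", "1"]
```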
  
  
  Example 3: s.split(/(-|\+)/)
This builds off the previous example by adding support for the addition symbol \+. The backslash \ is required because an unescaped + is a regex quantifier ("one or more"), and we want a literal plus sign. The vertical pipe | is the regex OR operator (match - OR +).
'3-2-1+2+3'.split(/(-|\+)/)
// ["3", "-", "2", "-", "1", "+", "2", "+", "3"]
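As a side note, when every alternative is a single character, a character class is another way to write the same pattern. This is a sketch of that alternative spelling, not the form used in the rest of the article; inside [...] the + needs no escape, and a - placed first is treated as a literal.

```javascript
// Alternation, as in Example 3.
const viaAlternation = '3-2-1+2+3'.split(/(-|\+)/);

// Character class: matches any one of the listed characters.
const viaCharClass = '3-2-1+2+3'.split(/([-+])/);

console.log(viaAlternation);
console.log(viaCharClass);
// Both print: ["3", "-", "2", "-", "1", "+", "2", "+", "3"]
```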
The Final Boss (tying everything together)
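Here is a minimal sketch that wraps the chain from the top of the article in the tokenize function used in the opening example, and also logs the raw split output before any cleanup. The variable names raw and tokens are only for illustration.

```javascript
// Wraps the split/map/filter chain in a reusable function.
function tokenize(expression) {
  return expression
    .split(/(-|\+|\/|\*|\(|\))/) // split on operators and parens, keeping them
    .map(s => s.trim())          // strip whitespace around each piece
    .filter(s => s !== '');      // drop empty strings
}

// The raw split result, before trim and filter.
const raw = '100-(5.4 + 2/3)*5'.split(/(-|\+|\/|\*|\(|\))/);
console.log(raw);
// ["100", "-", "", "(", "5.4 ", "+", " 2", "/", "3", ")", "", "*", "5"]

const tokens = tokenize('100-(5.4 + 2/3)*5');
console.log(tokens);
// ["100", "-", "(", "5.4", "+", "2", "/", "3", ")", "*", "5"]
```

The empty strings in raw show up wherever two separators are adjacent, such as -( and )*, which is why the filter step is there.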
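Hopefully, you now have all the tools needed to understand .split(/(-|\+|\/|\*|\(|\))/). It is Example 3 extended to six alternatives: -, +, /, *, ( and ). The +, *, ( and ) are regex metacharacters, and an unescaped / would end the regex literal, so each of them gets a backslash. The .map(s => s.trim()) step strips whitespace around each token, and .filter(s => s !== '') drops the empty strings that split produces when two separators sit next to each other, as in -( or )*. Let me know in the comments if you liked this article, or ping me on Twitter!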
 

 
    