Elliot Wong for Oursky

Posted on • Originally published at code.oursky.com

Regex for Date, Time and Currency, with Code Examples

This article lists regular expressions for currency (e.g., US$100, £0.12, or HK$54), time, and dates, ready for quick copy and paste. They’re all battle-tested. Each regex has its limitations, so we've included notes on them along with customization tips.

We hope you'll check out the interactive code snippets to get a better idea of how the regexes work!

Currency Regex

Note that currency signs other than “$” are dropped, but the amount itself is still matched. See, for example, the pound sterling sign £ in the first item of the test array.

import re

test = [
  "$9876 £112.00",
  "asdf$1234",
  "$12.00 14",
  "$3000000000000",
  "$00000000000001",
  "$00000000000000",
  "asdf",
  "one hundred forty two dollars"
]

regex = re.compile(
    r'\$?(?:(?:[1-9][0-9]{0,2})(?:,[0-9]{3})+|[1-9][0-9]*|0)(?:[\.,][0-9][0-9]?)?(?![0-9]+)'
)

print(sum([regex.findall(x) for x in test],[]))

Results should be:

['$9876', '112.00', '$1234', '$12.00', '14', '$3000000000000', '1', '0']

Interactive code snippets available here
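
If you'd rather keep the currency sign, one possible customization is to widen the optional "$" prefix. The sketch below is our illustration rather than a tested drop-in: the symbol set (US$, HK$, £, € and ¥) and the name currency_regex are assumptions, so adapt them to your own data.

```python
import re

# Same amount pattern as above; only the optional prefix is widened.
# The list of accepted prefixes is an illustrative assumption.
currency_regex = re.compile(
    r'(?:(?:US|HK)?\$|[£€¥])?'
    r'(?:(?:[1-9][0-9]{0,2})(?:,[0-9]{3})+|[1-9][0-9]*|0)'
    r'(?:[\.,][0-9][0-9]?)?(?![0-9]+)'
)

print(currency_regex.findall("US$100 HK$54 £0.12"))
# ['US$100', 'HK$54', '£0.12']
```

Because the prefix is still optional, bare amounts such as the 14 in "$12.00 14" are matched exactly as before.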

Time Regex

import re
test = [
  "00:00:00", "23:59:59",
  "00 00 00", "23 59 59",
  "00.00.00", "23.59.59",
  "00:00.00", "23.59:59",
  "9:00pm", "9:00am", "10:00:00 am", 
  "13:00:12 am",  # matched, although 13 is not a valid hour on a 12-hour clock
  "13 pm"  # not matched: the pattern requires minutes
]
regex = re.compile(
    r'(?=((?: |^)[0-2]?\d[:. ]?[0-5]\d(?:[:. ]?[0-5]\d)?(?:[ ]?[ap]\.?m?\.?)?(?: |$)))'
)
print(sum([regex.findall(x) for x in test],[]))

Results should be:

['00:00:00', '23:59:59', '00 00 00', ' 00 00', '23 59 59', '00.00.00', '23.59.59', '00:00.00', '23.59:59', '9:00pm', '9:00am', '10:00:00 am', '13:00:12 am']

Interactive code snippets available here
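
As the results above show, the pattern is permissive: it accepts "13:00:12 am", and since the hour is written as [0-2]?\d it would also accept an hour such as 25. One way to tighten this is to validate each match afterwards. Below is a minimal sketch of that idea. The helper is_valid_time and the pattern time_parts are hypothetical names introduced for this example.

```python
import re

# Hypothetical helper pattern: splits an already-matched time string
# into hour, minute, optional second and an optional am/pm marker.
time_parts = re.compile(
    r'^\s*(?P<h>\d{1,2})[:. ]?(?P<m>[0-5]\d)(?:[:. ]?(?P<s>[0-5]\d))?'
    r'\s?(?P<ampm>[ap])?\.?m?\.?\s*$'
)

def is_valid_time(text):
    """Return True if text is a plausible 12-hour or 24-hour time."""
    match = time_parts.match(text)
    if match is None:
        return False
    hour = int(match.group('h'))
    if match.group('ampm'):
        # With an am/pm marker, hours run from 1 to 12
        return 1 <= hour <= 12
    # Without a marker, assume a 24-hour clock: 0 to 23
    return 0 <= hour <= 23

candidates = ['23:59:59', '9:00pm', '13:00:12 am', '25:00']
print([t for t in candidates if is_valid_time(t)])
# ['23:59:59', '9:00pm']
```

Keeping the range checks in plain Python, rather than encoding them in the regex, keeps the pattern readable and makes the rules easy to change.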

Date Regex with Months in English (YYYY/MMMM/dd)

import re

test = [
  "2020-jan-1", 
  "2012-jan-12",
  "1920-feb-22",
  # space isn't a valid delimiter here, you can add it in the regex though
  "2020 mar 1",
  # only 19** and 20** are considered valid here, add year prefix accordingly, or extract with the last two year digits only
  "1840-jun-12",
  # Must follow the format YYYY-MMMM-dd
  "2020-01-01"
]

regex = re.compile(
    # Raw strings keep backslashes such as \d and \- intact for the regex engine
    r'(?=((?:(?:[0][1-9]|[1-2][0-9]|3[0-1]|[1-9])[/\-,.]?(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*[/\-,.]?(?:19|20)?\d{2}(?!\:)|'
    r'(?:19|20)?\d{2}[/\-,.]?(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*[/\-,.]?(?:[0][1-9]|[1-2][0-9]|3[0-1]|[1-9])|'
    r'(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*[/\-,.]?(?:[0][1-9]|[1-2][0-9]|3[0-1]|[1-9])[/\-,.]?(?:19|20)\d{2}(?!\:)|'
    r'(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*[/\-,.]?(?:[0][1-9]|[1-2][0-9]|3[0-1]|[1-9])[/\-,.]?\d{2})))'
)

print(sum([regex.findall(x) for x in test],[]))

Results should be:

['2020-jan-1', '20-jan-1', '2012-jan-12', '12-jan-12', '2-jan-12', '1920-feb-22', '20-feb-22', '40-jun-12']

Interactive code snippets available here
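
Because the whole pattern sits inside a lookahead, findall returns overlapping substrings such as '20-jan-1' alongside '2020-jan-1'. If you only want one result per date, and want it as a real date object, one option is a simpler pattern with named groups. The sketch below is a simplified variant for the year-month-day order only, not a replacement for the regex above. The names MONTHS, ymd and extract_dates are assumptions made for this example.

```python
import re
from datetime import date

MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec']

# Simplified year-month-day pattern (illustrative assumption).
# \b keeps it from matching in the middle of a longer digit run.
ymd = re.compile(
    r'\b(?P<year>(?:19|20)\d{2})[/\-,.]'
    r'(?P<month>jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*[/\-,.]'
    r'(?P<day>0?[1-9]|[12][0-9]|3[01])\b',
    re.IGNORECASE,
)

def extract_dates(text):
    """Return a datetime.date for every valid YYYY-MMM-dd date in text."""
    found = []
    for m in ymd.finditer(text):
        month = MONTHS.index(m.group('month').lower()) + 1
        try:
            found.append(date(int(m.group('year')), month, int(m.group('day'))))
        except ValueError:
            # e.g. "2021-feb-30" passes the regex but is not a real date
            continue
    return found

print(extract_dates("2020-jan-1, 2012-Jan-12 and 2021-feb-30"))
# [datetime.date(2020, 1, 1), datetime.date(2012, 1, 12)]
```

Letting datetime.date reject impossible days keeps calendar rules, such as the length of February, out of the regex.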

Check out the Original Post for More Details

This is an excerpt from our original blog post, which provides more regexes and explanations. That article also discusses more accurate ways to extract data and proposes solutions. We'd love for you to check it out and share your thoughts. Happy coding, cheers!
