DEV Community

suvhotta


Python - Regex Functions

Let me begin with an honest confession: the first time I ever saw a regular expression, it was being used to validate email addresses in a C# program. I was pretty new to programming back then, and the snippet had been written by one of my colleagues. Trust me, my first reaction was that something had gone wrong with the code and we had ended up with a bunch of alien, random characters in the middle of it.

But now, a few springs later, I take a moment to appreciate the way regular expressions have made programming simpler.

In Python, regular expressions are handled by the re module.

re.compile()

re.compile() compiles a regex pattern and returns a pattern object, which can then be used for matching and other operations.

It's good practice to compile a regular expression when the same pattern is used multiple times in a single program.

import re
reg_obj = re.compile(reg_expression)   # reg_expression: your pattern string
result = reg_obj.match(sample_string)  # a match object, or None if there is no match
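
To make the reuse idea concrete, here is a minimal sketch. The date pattern and the sample lines are made up for illustration and are not from any real program.

```python
import re

# Hypothetical pattern and inputs, chosen only for illustration.
date_pattern = re.compile(r'\d{4}-\d{2}-\d{2}')

lines = ['released 2020-01-15', 'no date here', '2021-12-31 was the deadline']

# The same compiled object is reused for every line.
found = [date_pattern.search(line) for line in lines]
dates = [m.group() for m in found if m is not None]
print(dates)  # ['2020-01-15', '2021-12-31']
```

The re module also caches recently used patterns internally, so for a handful of patterns the gain from compiling is mostly readability: the pattern gets a name and lives in one place.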

re.search()

This function scans the sample string and returns a match object for the first location where the pattern matches, or None if there is no match anywhere in the string.

import re

def search_string(string):
    to_search = ['good', 'boy']
    for pattern in to_search:
        match = re.search(pattern, string)  # search once and reuse the result
        if match:
            print('Match Found')
            print(match)
        else:
            print(match)  # prints None
            print('NO match found')

search_string("You're a good buoy")

Output:

Match Found
<re.Match object; span=(9, 13), match='good'>
None
NO match found

Note: The span gives the start index and the (exclusive) end index in the sample string where the first match was found.
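
As a small sketch of how the span relates to the original string, the match object's span(), start(), end() and group() methods can be used like this (same sample sentence as above):

```python
import re

text = "You're a good buoy"
m = re.search('good', text)

print(m.span())                 # (9, 13)
print(m.start(), m.end())       # 9 13
print(m.group())                # good
# Slicing the original string with the span gives back the matched text.
print(text[m.start():m.end()])  # good
```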

re.match()

The difference between match and search is that match only checks for a match at the beginning of the sample string; if one is found there, it returns the corresponding match object, otherwise None.

Note: match can be emulated with search by putting a ^ at the beginning of the regex pattern (as long as re.MULTILINE is not set, since with that flag ^ also matches at the start of every line).

import re

def match_reg(string):
    reg = r'.*[0-9]$'  # any characters, followed by a digit at the end
    if re.match(reg, string):
        print('Match Found!')
    else:
        print('No match found!')

match_reg('abc07')
match_reg('3')
match_reg('21abc')

Output:

Match Found!
Match Found!
No match found!
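
To see the match/search difference side by side, here is a short sketch that runs both on the same string. The variable names are only illustrative.

```python
import re

text = '21abc'

at_start = re.match(r'abc', text)     # 'abc' is not at the beginning
anywhere = re.search(r'abc', text)    # search scans the whole string
anchored = re.search(r'^abc', text)   # ^ makes search behave like match here
digits = re.match(r'\d+', text)       # digits are at the beginning

print(at_start)          # None
print(anywhere.span())   # (2, 5)
print(anchored)          # None
print(digits.group())    # 21
```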

re.fullmatch()

As the name suggests, it matches only if the whole sample string matches the pattern, and returns None otherwise.
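
A minimal sketch of how fullmatch differs from match, using a made-up three-digit pattern:

```python
import re

pattern = r'\d{3}'  # illustrative pattern: exactly three digits

prefix_only = re.match(pattern, '1234')      # matches the first three digits
whole_fail = re.fullmatch(pattern, '1234')   # a fourth character is left over
whole_ok = re.fullmatch(pattern, '123')      # the entire string is consumed

print(prefix_only.group())  # 123
print(whole_fail)           # None
print(whole_ok.group())     # 123
```

This makes fullmatch a natural fit for validating an input as a whole, where a partial match should count as a failure.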

re.findall()

You must have observed by now that both match() and search() return only the first occurrence of the matched pattern. However, if we wish to get the complete list of all non-overlapping matches, then this is our go-to method.
The result is a list of strings when the pattern has no groups or a single group, and a list of tuples when it has more than one group.

import re

sample = 'abc23@gmail.com, randomfella99@hotmail.com, suv04@yahoomail.com'
# One or more word characters, dots or hyphens on each side of the @
emails = re.findall(r'[\w.-]+@[\w.-]+', sample)
for email in emails:
    print(email)

Output:

abc23@gmail.com
randomfella99@hotmail.com
suv04@yahoomail.com
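
The shape of the findall result changes once groups are added to the pattern. Here is a small sketch of that, reusing the same sample string; the grouped patterns are my own variations for illustration.

```python
import re

sample = 'abc23@gmail.com, randomfella99@hotmail.com, suv04@yahoomail.com'

# One group: a list of strings, one per match, holding just that group.
domains = re.findall(r'@([\w.-]+)', sample)
print(domains)  # ['gmail.com', 'hotmail.com', 'yahoomail.com']

# Two groups: a list of tuples, one tuple per match.
pairs = re.findall(r'([\w.-]+)@([\w.-]+)', sample)
print(pairs)
# [('abc23', 'gmail.com'), ('randomfella99', 'hotmail.com'), ('suv04', 'yahoomail.com')]
```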

re.finditer()

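It returns an iterator of match objects. It is similar to re.findall(), but it yields full match objects (with positions and groups) one at a time instead of building a list of strings.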
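
A short sketch of finditer on the same email sample used for findall. Because each item is a match object, the position of every match is available as well as its text.

```python
import re

sample = 'abc23@gmail.com, randomfella99@hotmail.com, suv04@yahoomail.com'

hits = []
for m in re.finditer(r'[\w.-]+@[\w.-]+', sample):
    # Each m is a match object, so span() and group() both work.
    hits.append((m.span(), m.group()))
    print(m.span(), m.group())
```

The first line printed is `(0, 15) abc23@gmail.com`, and slicing `sample` with any of the spans gives back the corresponding address.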

re.split()

It splits the sample string on every occurrence of the pattern. If maxsplit is given, at most that many splits are performed, and the remainder of the string is returned as the final element of the list.

import re
re.split(r'\W+', 'Words, words, words.')

Output:

['Words', 'words', 'words', '']
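
Here is a quick sketch of the maxsplit behaviour described above, plus one related detail: when the pattern is wrapped in a capturing group, the separators themselves are kept in the result.

```python
import re

text = 'Words, words, words.'

# Only one split is performed; the rest of the string stays in one piece.
limited = re.split(r'\W+', text, maxsplit=1)
print(limited)  # ['Words', 'words, words.']

# With a capturing group, the matched separators are included in the list.
with_seps = re.split(r'(\W+)', text)
print(with_seps)  # ['Words', ', ', 'words', ', ', 'words', '.', '']
```

The trailing empty string appears because the final '.' is itself a separator with nothing after it.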

re.sub()

Used to replace every match of a pattern in a string with a given replacement string.

import re
sample_input='this contains abc and abc only'
result = re.sub('abc',  'def', sample_input)
result

Output:

'this contains def and def only'
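
Beyond a plain string replacement, re.sub() also accepts a count limit, backreferences to groups, and a function as the replacement. The sketch below shows each of these; the inputs other than sample_input are made up for illustration.

```python
import re

sample_input = 'this contains abc and abc only'

# count limits how many matches are replaced.
first_only = re.sub('abc', 'def', sample_input, count=1)
print(first_only)  # this contains def and abc only

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r'(\w+)@(\w+)', r'\2 at \1', 'user@example')
print(swapped)  # example at user

# A function receives each match object and returns the replacement text.
doubled = re.sub(r'\d+', lambda m: str(int(m.group()) * 2), 'a1 b22')
print(doubled)  # a2 b44

# re.subn() does the same as re.sub() but also reports the replacement count.
replaced, n = re.subn('abc', 'def', sample_input)
print(replaced, n)  # this contains def and def only 2
```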

Hope this was helpful!
