To begin with an honest confession: the first time I ever saw a regular expression, it was being used to validate email IDs in a C# program. I was pretty new to programming back then, and it was a code snippet written by one of my colleagues. Trust me, my first reaction on seeing the regex was that something had gone wrong with the code and we had ended up with a bunch of random alien characters in the middle of it.
But now, a few springs later, I take a moment to appreciate how regular expressions have made programming simpler.
In Python, regex is handled by the re module.
re.compile() compiles a regex pattern and returns a pattern object, which can then be used for matching and other operations.
It's good practice to compile a regular expression if the same pattern is used multiple times in a single program.
reg_obj = re.compile(reg_expression)
result = reg_obj.match(sample_string)
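As a rough sketch of why compiling helps, the snippet below compiles a pattern once and reuses it across several strings. The pattern and the sample strings are made up purely for illustration.

```python
import re

# Hypothetical pattern and inputs, chosen only for illustration.
digits = re.compile(r'\d+')  # compiled once, reused below

samples = ['42 apples', 'no numbers here', '7up']

# match() checks only the start of each string.
results = [bool(digits.match(s)) for s in samples]
print(results)  # [True, False, True]
```

The compiled object exposes the same methods as the module-level functions (match, search, findall and so on), so the pattern does not have to be passed again on every call.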
re.search() scans the sample string and returns a match object for the first location where the pattern matches. If there is no match anywhere in the string, it returns None.
import re

def search_string(string):
    to_search = ['good', 'boy']
    for pattern in to_search:
        # Search once and reuse the result instead of calling re.search twice.
        match = re.search(pattern, string)
        if match:
            print('Match Found')
            print(match)
        else:
            print(match)
            print('NO match found')

search_string("You're a good buoy")
Match Found
<re.Match object; span=(9, 13), match='good'>
None
NO match found
Note: The span gives the start and end indices in the sample string where the first match was found. The end index is exclusive.
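To make the span a little more concrete, here is a small sketch that reuses the sample string from the example above and slices the string with the indices the match object reports.

```python
import re

sample = "You're a good buoy"
m = re.search('good', sample)

# span() returns (start, end); the same values come from m.start() and m.end().
start, end = m.span()
print(start, end)         # 9 13

# Slicing with the span gives back the matched text, since end is exclusive.
print(sample[start:end])  # good
```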
The difference between match and search is that re.match() looks for the pattern only at the beginning of the sample string and returns a match object if it is found there, whereas search scans the whole string.
Note: search can do the job of match if you put a ^ at the beginning of the regex pattern (as long as the re.MULTILINE flag is not set, since with that flag ^ also matches at the start of each line).
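import re

def match_reg(string):
    # Any characters followed by a digit at the very end of the string.
    reg = r'.*[0-9]$'
    # re.IGNORECASE has no effect here, because the pattern contains no letters.
    if re.match(reg, string, re.IGNORECASE):
        print("Match Found!")
    else:
        print('No match found!')

match_reg('abc07')
match_reg('3')
match_reg('21abc')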
Match Found!
Match Found!
No match found!
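The note about getting match-like behaviour from search with a leading ^ can be checked with a short sketch. The variable names are hypothetical, and no flags are passed.

```python
import re

sample = 'abc07'

via_match = re.match(r'[0-9]', sample)       # None: the string does not start with a digit
via_search = re.search(r'[0-9]', sample)     # finds '0' further into the string
via_anchored = re.search(r'^[0-9]', sample)  # None: anchored to the start, like match

print(via_match)
print(via_search.span())  # (3, 4)
print(via_anchored)
```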
re.fullmatch(), as the name suggests, succeeds only if the whole sample string matches the pattern.
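A minimal sketch of the difference between matching a prefix and matching the whole string, using re.fullmatch() and a made-up pattern and inputs:

```python
import re

pattern = r'[0-9]+'  # illustrative pattern: one or more digits

prefix_ok = bool(re.match(pattern, '123abc'))      # True: the prefix '123' matches
whole_bad = bool(re.fullmatch(pattern, '123abc'))  # False: 'abc' is left over
whole_ok = bool(re.fullmatch(pattern, '123'))      # True: the entire string is digits

print(prefix_ok, whole_bad, whole_ok)  # True False True
```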
You must have observed by now that both match() and search() return only the first occurrence of the matched pattern. If we wish to get the complete list of all non-overlapping matches, re.findall() is our go-to method.
The result is a list of strings if the pattern has no groups or exactly one group, and a list of tuples if it has more than one group.
import re

sample = 'email@example.com, firstname.lastname@example.org, email@example.com'
emails = re.findall(r'[\w\.-]+@[\w\.-]+', sample)
for email in emails:
    print(email)
email@example.com
firstname.lastname@example.org
email@example.com
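How the shape of the findall result depends on the number of groups is easier to see side by side. The sketch below uses a shortened version of the sample string; the grouped patterns are variations added here for illustration.

```python
import re

sample = 'email@example.com, firstname.lastname@example.org'

# No groups: a list of the full matches.
no_groups = re.findall(r'[\w.-]+@[\w.-]+', sample)

# One group: a list of what that group captured (here, the domain).
one_group = re.findall(r'[\w.-]+@([\w.-]+)', sample)

# Two groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'([\w.-]+)@([\w.-]+)', sample)

print(no_groups)   # ['email@example.com', 'firstname.lastname@example.org']
print(one_group)   # ['example.com', 'example.org']
print(two_groups)  # [('email', 'example.com'), ('firstname.lastname', 'example.org')]
```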
re.finditer() returns an iterator of match objects. It is somewhat similar to re.findall(), but it yields full match objects rather than strings.
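A small sketch of iterating over match objects with re.finditer(), on an illustrative sample string. Each item is a match object, so both the text and its position are available.

```python
import re

sample = 'this contains abc and abc only'

spans = []
for m in re.finditer('abc', sample):
    print(m.group(), m.span())
    spans.append(m.span())

# Prints:
# abc (14, 17)
# abc (22, 25)
```

Because the matches are produced lazily, this can be handy on long inputs where building the full list up front is not needed.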
re.split() splits the sample string wherever the pattern matches. If a maxsplit limit is given, at most that many splits are performed, and the remainder of the sample string is returned as the final element of the list.
re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
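The effect of maxsplit can be sketched with the same input: with maxsplit=1 only the first separator is used, and everything after it stays together as the last element.

```python
import re

parts = re.split(r'\W+', 'Words, words, words.', maxsplit=1)
print(parts)  # ['Words', 'words, words.']
```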
re.sub() replaces every occurrence of the pattern in a string with a given replacement string and returns the new string.
import re

sample_input = 'this contains abc and abc only'
result = re.sub('abc', 'def', sample_input)
result
'this contains def and def only'