Welcome to the first project in this series. In this project, you will rename files whose names contain American-style dates so that they use European-style dates instead. You will combine your knowledge of regular expressions and file organization. To follow along with this project, access the project file here.
Before starting, let us check that you are prepared for this project. Read the prerequisites below before moving on.
Prerequisites
Make sure you have a good understanding of the following resources before tackling this project:
- File Organization
- Series 1 and 2
Step 1: Understand the project
Assume the dates in all the filenames are in this format: MM-DD-YYYY.
What we want to achieve is this format: DD-MM-YYYY.
Observe the first image. What do you notice?
Once you are done, check out my answers.
Observation 1: Only the day and month need to be swapped.
Observation 2: Some of the filenames have a prefix before the dates.
Observation 3: The prefix is separated from the dates with an underscore.
Step 2: Write code
Step 1: Import the Required Modules
The os, sys, and re modules have been used in several other projects, so you should already be familiar with them.
The shutil module is used to perform operations such as copying, moving, and renaming files.
Step 2: Create a function with one parameter: root_dir.
Step 3: Create a regex pattern.
Comments in the code explain each part of it. Overall, the pattern matches American-style dates with years from 1900 to 2099.
re.compile is used to compile a regular expression pattern into a regex object, allowing you to reuse the pattern for matching operations.
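As a small illustration of that reuse, the sketch below compiles a simplified date pattern once and applies it to several filenames. The pattern is a stand-in without range checks, and the filenames are made up for the example.

```python
import re

# Simplified stand-in for the tutorial's pattern: MM-DD-YYYY with no range checks.
simple_date = re.compile(r'(\d{2})-(\d{2})-(\d{4})')

# Hypothetical filenames, used only for illustration.
filenames = ['report_03-14-2021.txt', 'notes.txt', '12-25-1999.jpg']

# The same compiled object is reused for every filename.
has_date = [bool(simple_date.search(name)) for name in filenames]
print(has_date)  # [True, False, True]
```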
Question: Can you identify how many groups are in this pattern? Remember that a group is enclosed in parentheses (). Drop your answers in the comments section.
Step 4: Test the pattern.
To test the pattern, visit this website. When you hover over each part of the pattern, the site explains whether it is a literal character, a metacharacter, and so on. On the right side of the website, you will find a cheat sheet for reference. Paste in some filenames, both ones that follow the format and ones that do not, to see which of them match.
Step 5: Loop through all files in the root_dir.
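A minimal sketch of this step, assuming a temporary directory with made-up files so that it runs anywhere. Note that os.listdir returns bare names rather than full paths, includes subdirectories, and gives no guaranteed order, so the hypothetical helper list_filenames sorts the names for predictable output.

```python
import os
import tempfile

def list_filenames(root_dir):
    """Return the names of the entries in root_dir, sorted for stable output."""
    return sorted(os.listdir(root_dir))

# Build a throwaway directory with two hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('a_01-02-2003.txt', 'b.txt'):
        open(os.path.join(tmp, name), 'w').close()
    names = list_filenames(tmp)

print(names)  # ['a_01-02-2003.txt', 'b.txt']
```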
Step 6: Search each filename for a pattern match.
Do you remember what .search does? It finds the first occurrence of a pattern in a string and returns a match object, while .findall returns all occurrences of the pattern as a list of strings.
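The difference can be seen in a short sketch. The pattern here is a simplified one without capturing groups, and the filename is hypothetical. When a pattern does contain groups, .findall returns the groups (as tuples, if there are several) instead of the whole matches.

```python
import re

# Simplified date pattern without capturing groups, so findall returns whole matches.
date_re = re.compile(r'\d{2}-\d{2}-\d{4}')

text = 'backup_01-15-2020_to_02-20-2021.zip'  # hypothetical filename

first = date_re.search(text)   # match object for the first occurrence, or None
every = date_re.findall(text)  # list of all matching substrings

print(first.group())  # 01-15-2020
print(first.span())   # (7, 17)
print(every)          # ['01-15-2020', '02-20-2021']
```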
Print out the match object to see what it returns, then comment it out and proceed.
Step 7: Check whether a match was found and, if so, extract the date components.
match.groups() returns a tuple. You can unpack the tuple by assigning its elements to individual variables, such as month, day, year, and _. The underscore _ is often used as a placeholder for values you want to ignore. Here it receives the fourth group, the century (19 or 20), which is already part of year.
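To see what the tuple looks like before unpacking it, here is a short sketch that uses the same pattern written on one line and a made-up filename.

```python
import re

# The tutorial's pattern on a single line (no re.VERBOSE needed).
date_pattern = re.compile(r'(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])-((19|20)\d{2})')

match = date_pattern.search('invoice_07-04-1998.pdf')  # hypothetical filename
groups = match.groups()
print(groups)  # ('07', '04', '1998', '19')

# The last element is the century captured by the inner (19|20) group; ignore it.
month, day, year, _ = groups
print(day, month, year)  # 04 07 1998
```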
Step 8: Rename the filename to the European-Style date: DD-MM-YYYY.
- Give it a variable name: new_filename.
- Use an f-string to construct new_filename. Here's a breakdown of how it works:
  - filename[:match.start()]: This part takes the portion of the original filename from the beginning up to the start of the match. It effectively includes everything before the matched text.
  - {day}-{month}-{year}: This part adds the date components (day, month, and year) extracted from the match. They are separated by hyphens to create a date format.
  - filename[match.end():]: This part takes the portion of the original filename from the end of the match to the end of the string. It effectively includes everything after the matched text.
- Print the filename to check that it works correctly: You can use print(new_filename) to verify that the new filename is generated as expected. Once you confirm that it's working correctly, you can comment out the print statement.
In summary, this code creates a new filename by replacing the matched portion of the original filename with the extracted date components in a specific format.
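The slicing can be tried in isolation, without touching any files. In the sketch below, to_european is a hypothetical helper name that is not part of the project script, and the filenames are invented.

```python
import re

date_pattern = re.compile(r'(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])-((19|20)\d{2})')

def to_european(filename):
    """Return filename with the first MM-DD-YYYY date rewritten as DD-MM-YYYY."""
    match = date_pattern.search(filename)
    if match is None:
        return filename  # no date found; leave the name unchanged
    month, day, year, _ = match.groups()
    prefix = filename[:match.start()]  # everything before the date
    suffix = filename[match.end():]    # everything after the date
    return f'{prefix}{day}-{month}-{year}{suffix}'

print(to_european('trip_03-14-2021.jpg'))  # trip_14-03-2021.jpg
print(to_european('notes.txt'))            # notes.txt
```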
Step 9: Build the old and new file paths and store them in variables.
Step 10: Use shutil.move() to rename it.
Before running the code, it's a good practice to comment out the shutil.move() line. After verifying that the print statement works correctly and provides the expected file paths, you can uncomment it to actually rename the files.
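One possible alternative to commenting the line in and out is a dry-run flag. The sketch below is only an illustration: the rename helper and its dry_run parameter are assumptions that do not appear in the project script, and it works in a temporary directory so no real files are affected.

```python
import os
import shutil
import tempfile

def rename(old_path, new_path, dry_run=True):
    """Print the planned rename; only touch the disk when dry_run is False."""
    print(f'{old_path} -> {new_path}')
    if not dry_run:
        shutil.move(old_path, new_path)

with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, 'trip_03-14-2021.jpg')  # hypothetical file
    new = os.path.join(tmp, 'trip_14-03-2021.jpg')
    open(old, 'w').close()

    rename(old, new)                 # dry run: nothing changes on disk
    unchanged = os.listdir(tmp)

    rename(old, new, dry_run=False)  # actually renames the file
    renamed = os.listdir(tmp)

print(unchanged)  # ['trip_03-14-2021.jpg']
print(renamed)    # ['trip_14-03-2021.jpg']
```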
Step 11: The last part checks whether a root directory was specified on the command line. Remember that sys.argv is a list whose first element is the script name, so you can check the length of this list: a length of 2 means exactly one argument, the root directory, was given.
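The length check can be tested without running a script from the terminal by passing a list in place of sys.argv. In this sketch, parse_root_dir is a hypothetical helper and the argument values are made up.

```python
def parse_root_dir(argv):
    """Return the root directory from argv, or None if the argument count is wrong."""
    # argv[0] is the script name, so exactly one user argument means len(argv) == 2.
    if len(argv) != 2:
        return None
    return argv[1]

print(parse_root_dir(['main.py']))            # None
print(parse_root_dir(['main.py', 'photos']))  # photos
```

In a real script you would call it as parse_root_dir(sys.argv) after importing sys.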
import os, shutil, re, sys

# Assume all filenames use American-style dates with years from 1900 to 2099
def rename_files(root_dir):
    # Create a regex pattern for American-style dates (MM-DD-YYYY)
    date_pattern = re.compile(
        r'''
        (0[1-9]|1[0-2])-         # matches 01-09 or 10-12 for month with a '-' character
        (0[1-9]|[12]\d|3[01])-   # matches 01-31 for days
        ((19|20)\d{2})           # matches years from 1900 to 2099
        ''', re.VERBOSE
    )
    # Loop through all files in the root directory
    for filename in os.listdir(root_dir):
        # Check if the filename contains a date in American style
        match = date_pattern.search(filename)
        if match:  # not None
            # Extract the date components
            month, day, year, _ = match.groups()  # returns a tuple
            # Build the new filename with a European-style date (DD-MM-YYYY)
            new_filename = f'{filename[:match.start()]}{day}-{month}-{year}{filename[match.end():]}'
            print(new_filename)
            old_filepath = os.path.join(root_dir, filename)
            new_filepath = os.path.join(root_dir, new_filename)
            # Rename the file using shutil.move()
            shutil.move(old_filepath, new_filepath)
            print(f'Renamed: {filename} -> {new_filename}')
            print()

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('Usage: python main.py [root_dir]')
        sys.exit(1)
    try:
        root_dir = sys.argv[1]
        rename_files(root_dir)
    except FileNotFoundError as e:
        print(e)
This is the result on the terminal.
Exercise 1
Use the os module to rename files. Give it a try!
Exercise 2
Complete the exercises listed under "Ideas for Similar Program." You can find the questions here. Once you have completed them, share the link to your solution in the comments section.