DEV Community

Free Python Code
Free Python Code

Posted on

Detect File Changes in Real-Time with Python File Monitoring and Hash Comparison

Hi 🙂🖐

In this post, I will show you how to Detect File Changes in Real-Time with Python File Monitoring and Hash Comparison

from time import sleep
from hashlib import md5
from os import listdir

src_dir = 'myFiles'
my_files = listdir(src_dir)

# create hash for files
def get_file_hash(filename):
    return md5(open(filename, 'rb').read()).hexdigest()

file_data = {}

def load_files():
    for filename in my_files:
        file_hash = get_file_hash(f'{src_dir}/{filename}')
        file_data[filename] = file_hash


load_files()

def check_changes():
    for filename in my_files:
        file_hash = get_file_hash(f'{src_dir}/{filename}')
        if file_data[filename] != file_hash:
            print(f'change detected in file {filename}')

    sleep(1)
    check_changes()

sleep(1)
check_changes()
Enter fullscreen mode Exit fullscreen mode

This code is a Python script designed to monitor changes in files within a specified directory. Let's break it down step by step:

1. from time import sleep: This imports the sleep function from the time module. It is used to introduce a pause in the execution of the script.

2. from hashlib import md5: This imports the md5 hashing algorithm from the hashlib module. MD5 is used here to generate a hash of the file content.

3. from os import listdir: This imports the listdir function from the os module. It is used to list all files in a directory.

**4. src_dir = 'myFiles': **This specifies the directory (myFiles) where the files to be monitored are located.

5. my_files = listdir(src_dir): This lists all the files in the directory specified by src_dir and stores them in the my_files list.

6. def get_file_hash(filename): This is a function that takes a filename as input, reads the file in binary mode, calculates its MD5 hash, and returns the hexadecimal representation of the hash.

7. file_data = {}: This initializes an empty dictionary file_data where the filename and its corresponding hash will be stored.

8. def load_files(): This function iterates through all files in my_files, calculates the hash of each file using get_file_hash, and stores the filename and its hash in the file_data dictionary.

**9. load_files(): **This call to load_files initializes the file_data dictionary with the initial hashes of all files in the directory.

10. def check_changes(): This function is responsible for monitoring changes in the files. It iterates through all files in my_files, recalculates the hash of each file, and compares it with the hash stored in file_data. If a change is detected, it prints a message indicating the change.

11. Inside check_changes, there's a recursive call to itself wrapped within a sleep(1) statement, which causes the script to pause for 1 second before recursively calling check_changes again. This results in the script continuously checking for changes in the files.

12. The script ends with an initial sleep of 1 second followed by a call to check_changes() to start the monitoring process.

In summary, this script continuously monitors the specified directory for changes in files by calculating their MD5 hashes and comparing them with previously stored hashes. If a change is detected, it prints a message indicating the change.

Top comments (0)