Scenario
“Your company needs to learn about the files located on various machines. You have been asked to build a script that extracts information such as the name and size about the files in the current working directory and stores it in a list of dictionaries.” -LUIT, Python2
Automation scripting is a game-changer and the LUIT-Python2 GitHub repository provides a robust solution. This repository offers a Python-based tool that simplifies tasks like network provisioning, application deployment, and system configuration.
Why Automate Infrastructure Management?
Efficiency: By automating repetitive tasks, you save time and reduce the risk of human error.
Consistency: Each infrastructure deployment is identical, ensuring that “it works on my machine” never becomes an issue.
Scalability: Automated scripts can scale effortlessly, provisioning multiple environments with just a few commands.
Maintainability: Once an automated script is written, it can be reused, updated, and version-controlled.
Getting Started
Before diving into the repository, make sure you have Python 3.6 or later installed on your machine. Despite the repository name, the script shown below uses f-strings, which Python 2.7 does not support. You will also need a basic understanding of Linux-based systems, as many of the scripts are tailored for that environment.
Clone the Repository: Start by cloning the LUIT-Python2 repository from GitHub:
git clone https://github.com/Judewakim/LUIT-Python2.git
Install Dependencies: The repository uses a set of Python packages to handle various tasks. You can install them using pip:
cd LUIT-Python2
pip install -r requirements.txt
Writing Python
To break down the essential components of the data_extraction.py file, let's start with the imports. The script needs the os module to walk the directory tree and read file statistics, and the time module to turn modification timestamps into readable dates.
import os
import time
Next, we begin defining the first function of the code. This function will collect all the information that we will need. Remember, the goal of this code is to “build a script that extracts information such as the name and size about the files in the current working directory” so we will collect all the information about the working directory (or whatever directory is specified) and later we will present this information similar to the ls -al command in the Linux command line. That function will look like this:
def get_files_info(path='.'):  # Function that collects file details, defaulting to current directory
    files_info = []  # List to store file details
    for root, _, files in os.walk(path):  # Recursively traverse directories and files
        for filename in files:
            file_path = os.path.join(root, filename)  # Construct the full file path
            file_stat = os.stat(file_path)  # Get file details/statistics
            files_info.append({  # Store file details in a dictionary
                'name': filename,  # File name
                'path': file_path,  # Full file path
                'size_bytes': file_stat.st_size,  # File size in bytes
                'last_modified': time.ctime(file_stat.st_mtime),  # Last modified time
                'permissions': oct(file_stat.st_mode)[-3:],  # File permissions in octal format (last 3 digits)
            })
    return files_info  # Return the list of file details
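One thing to keep in mind with the function above: os.stat raises an OSError for entries it cannot read, such as a broken symbolic link, and that would stop the whole scan. Below is a minimal sketch of a more forgiving variant, assuming you would rather skip such entries than abort. The name get_files_info_safe is hypothetical and not part of the repository.

```python
import os
import time


def get_files_info_safe(path='.'):
    """Collect file details like get_files_info, but skip unreadable entries."""
    files_info = []
    for root, _, files in os.walk(path):
        for filename in files:
            file_path = os.path.join(root, filename)
            try:
                file_stat = os.stat(file_path)
            except OSError:
                # Assumption: broken symlinks and permission errors are skipped silently
                continue
            files_info.append({
                'name': filename,
                'path': file_path,
                'size_bytes': file_stat.st_size,
                'last_modified': time.ctime(file_stat.st_mtime),
                'permissions': oct(file_stat.st_mode)[-3:],
            })
    return files_info
```

Whether skipping silently is the right choice depends on your needs. Logging the skipped path inside the except branch would be a reasonable alternative.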
Now, we have a function that collects the file name, file path, file size, last modification time, and file permissions of each file in the path. The next thing to do is create another function that will display all this collected data in the way we want it. That second function will look like this:
def print_ll_view(files_info):  # Function to print file details in 'll' view format
    for file in files_info:  # Loop through the list of file dictionaries
        print(f"{file['permissions']} {file['size_bytes']} {file['last_modified']} {file['path']}")  # Print file details
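The octal digits stored under 'permissions' can also be shown in the symbolic form that ls -l prints. Here is a minimal sketch, assuming every entry is a regular file (only the last three octal digits are stored, so the file type cannot be recovered from the dictionary). The function format_ll_line and the sample entry are hypothetical, not part of the repository.

```python
import stat


def format_ll_line(file):
    """Build one ls -l style line from a dictionary in the get_files_info shape."""
    # Assumption: the entry is a regular file, so the regular-file type bit is added back
    mode = stat.S_IFREG | int(file['permissions'], 8)
    symbolic = stat.filemode(mode)  # e.g. 644 becomes -rw-r--r--
    # Right-align the size in a 10-character column so the rows line up
    return f"{symbolic} {file['size_bytes']:>10} {file['last_modified']} {file['path']}"


# Hypothetical sample entry, shaped like the dictionaries the script builds
sample = {
    'name': 'notes.txt',
    'path': './notes.txt',
    'size_bytes': 1204,
    'last_modified': 'Mon Jan  1 12:00:00 2024',
    'permissions': '644',
}
print(format_ll_line(sample))
```

Storing the full st_mode value in the dictionary instead of the last three digits would remove the need for the regular-file assumption.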
Lastly, we will call these functions and add a bit of user interaction to make the program easier to use. This last bit of code runs only when the script is executed directly, not when it is imported from another file. This is what that looks like:
if __name__ == "__main__":  # Run the script only if executed directly
    path = input("Enter the directory path (press Enter to use current directory): ") or "."  # Prompt user for path, default to current directory
    files_data = get_files_info(path)  # Get file details for the specified path
    print("\nLinux CLI 'll' View:")  # Print header
    print_ll_view(files_data)  # Display file details
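The interactive prompt is convenient at a terminal, but it gets in the way if the script has to run unattended on several machines. A minimal sketch of a non-interactive alternative, assuming the directory is passed as an optional command-line argument, could look like this. The helper parse_args is hypothetical and not part of the repository.

```python
import argparse


def parse_args(argv=None):
    """Read the directory to scan from the command line, defaulting to '.'."""
    parser = argparse.ArgumentParser(description="List file details for a directory.")
    parser.add_argument(
        "path",
        nargs="?",      # The argument is optional
        default=".",    # Same default as the interactive prompt
        help="directory to scan (default: current directory)",
    )
    return parser.parse_args(argv)


# Passing a list simulates command-line arguments; with no list, sys.argv is used
args = parse_args(["/var/log"])
print(args.path)
```

In the main block, the input() line could then be replaced with path = parse_args().path.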
Running the Program
The program is now ready to use. Navigate to the directory where it is stored; if you cloned it from my GitHub repository, that is LUIT-Python2. Once there, run the program with the command python data_extraction.py (on some Linux systems the interpreter is invoked as python3).
At this point, you have a Python program that lists the files under whatever location you specify, one per line, in a layout similar to the Linux ls -l output for easy readability. You can rerun it with any file path you need.
You can modify this program to fit your company’s needs by cloning it locally or forking it on GitHub.
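As one example of such a modification, the list of dictionaries can be written to a JSON file so that results from several machines can be collected and compared later. This is a minimal sketch under that assumption. The function save_report, the sample data, and the output file name are hypothetical and not part of the repository.

```python
import json
import os
import tempfile


def save_report(files_info, output_path):
    """Write the list of file dictionaries to a JSON file."""
    with open(output_path, 'w', encoding='utf-8') as handle:
        json.dump(files_info, handle, indent=2)


# Hypothetical sample data, shaped like the dictionaries the script builds
sample_info = [
    {
        'name': 'notes.txt',
        'path': './notes.txt',
        'size_bytes': 1204,
        'last_modified': 'Mon Jan  1 12:00:00 2024',
        'permissions': '644',
    },
]

# Write into a temporary directory so the example leaves the working directory untouched
report_path = os.path.join(tempfile.mkdtemp(), 'files_report.json')
save_report(sample_info, report_path)
```

Every value in the dictionaries is a string or an integer, so json.dump can serialize the list without any custom encoder.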
This entire project is available on GitHub
Originally published on Medium
Find me on LinkedIn