While learning data analysis, I wanted to build a project that feels real rather than another tutorial exercise. I chose to analyse Walmart sales data, using Python for data cleaning and PostgreSQL for SQL analysis, going from a raw CSV file to business insights.
Project Overview:
The goal was to take raw, incomplete sales data, prepare it for storage in a database where it can be organized, and run SQL queries on the prepared data to obtain key business insights.
I used the pandas library in Python to prepare the sales data, then imported it into PostgreSQL so I could run structured queries on it.

Tools Utilized: Pandas, SQLAlchemy, PostgreSQL, SQL, and VS Code. The source was Kaggle, where I obtained the Walmart sales dataset.

Description of Dataset: The dataset contains detailed transaction-level information for a specific period, including product family, store/branch information, the price and quantity sold, customer ratings, and the time of purchase.
How I did it:
Data Preparation with Python - I used pandas to remove duplicates, fix data types, convert currency amounts into numbers, and create a new column holding the sales total.
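To make the cleaning step concrete, here is a minimal sketch of what it could look like. The column names (`invoice_id`, `unit_price`, `quantity`, `total`) and the sample rows are assumptions for illustration, not taken from the actual dataset, and the total is assumed to be price times quantity per row.

```python
import pandas as pd

def clean_sales(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a raw sales frame (column names are hypothetical)."""
    # Drop exact duplicate rows and work on a copy so the input is not modified.
    df = df.drop_duplicates().copy()
    # Strip the currency symbol and thousands separators, then convert to float.
    df["unit_price"] = (
        df["unit_price"].astype(str).str.replace(r"[$,]", "", regex=True).astype(float)
    )
    # Make sure quantity is stored as an integer.
    df["quantity"] = df["quantity"].astype(int)
    # New column: total value of each transaction line.
    df["total"] = df["unit_price"] * df["quantity"]
    return df

# Made-up sample rows: the first two are exact duplicates.
raw = pd.DataFrame({
    "invoice_id": [1, 1, 2],
    "unit_price": ["$74.69", "$74.69", "$1,015.28"],
    "quantity": [7, 7, 2],
})
cleaned = clean_sales(raw)
print(cleaned)
```

Keeping the cleaning logic in one function makes it easy to rerun on a fresh CSV export before reloading the database.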
Load the Data into PostgreSQL - I imported the cleaned sales data into PostgreSQL using SQLAlchemy. This gave me an environment for analyzing sales data similar to how it would be managed within a business.
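The load step can be sketched roughly as follows. This is an illustration under assumptions: the table name `walmart`, the helper `load_to_db`, and the connection URL in the comment are hypothetical, and an in-memory SQLite engine stands in for PostgreSQL so the example runs without a database server.

```python
import pandas as pd
from sqlalchemy import create_engine, text

def load_to_db(df: pd.DataFrame, engine, table_name: str = "walmart") -> int:
    """Write df to table_name (replacing any existing table) and return its row count."""
    df.to_sql(table_name, con=engine, if_exists="replace", index=False)
    with engine.connect() as conn:
        return int(conn.execute(text(f"SELECT COUNT(*) FROM {table_name}")).scalar())

# For PostgreSQL the engine URL would look like this (placeholder credentials):
# engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/walmart_db")
engine = create_engine("sqlite://")  # in-memory stand-in, an assumption for this sketch

# Made-up rows standing in for the cleaned sales data.
sample = pd.DataFrame({"invoice_id": [1, 2], "total": [522.83, 2030.56]})
row_count = load_to_db(sample, engine)
print(row_count)
```

Checking the row count right after the load is a cheap way to confirm that nothing was dropped between the DataFrame and the table.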
SQL Analysis - Using SQL, I answered some essential business questions: which day of the week is the busiest for each store, which type of product sells the most, which categories generate profit, and how product sales vary across each store's operating hours. These queries used SQL window functions and aggregate functions.
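As one example of combining an aggregate with a window function, the "busiest day per store" question could be written like the sketch below. The table and column names (`walmart`, `branch`, `day_name`, `invoice_id`) and the rows are made up for illustration. The query is run from Python against an in-memory SQLite database as a stand-in, which needs SQLite 3.25 or newer for window functions; the same SQL should also be valid in PostgreSQL.

```python
import sqlite3

# Count transactions per branch and day, rank the days within each branch,
# and keep only the top-ranked day (ties would return more than one row).
BUSIEST_DAY_SQL = """
SELECT branch, day_name, transactions
FROM (
    SELECT branch,
           day_name,
           COUNT(*) AS transactions,
           RANK() OVER (PARTITION BY branch ORDER BY COUNT(*) DESC) AS rnk
    FROM walmart
    GROUP BY branch, day_name
) AS ranked
WHERE rnk = 1
ORDER BY branch;
"""

# Made-up sample transactions.
rows = [
    ("A", "Monday", 1), ("A", "Monday", 2), ("A", "Tuesday", 3),
    ("B", "Friday", 4), ("B", "Saturday", 5), ("B", "Saturday", 6), ("B", "Saturday", 7),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE walmart (branch TEXT, day_name TEXT, invoice_id INTEGER)")
conn.executemany("INSERT INTO walmart VALUES (?, ?, ?)", rows)

busiest = conn.execute(BUSIEST_DAY_SQL).fetchall()
print(busiest)
conn.close()
```

Using `RANK()` instead of a plain `MAX` keeps the day name alongside the count, which is what makes the result readable as a business answer.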
Key Takeaways: Pandas is an excellent tool for cleaning and preparing datasets for structured analysis. The strength of SQL is structured analysis across different datasets. SQL window functions help answer complex business questions.
🔗 Complete project source code (GitHub): https://github.com/Akansrodger/Walmart_sales_SQL_PYTHON
My last words on this journey...
This project let me integrate Python, SQL, and databases into one consolidated workflow, and it was another important step toward becoming a better data analyst.


