I recently completed an end-to-end Customer Shopping Behavior Analysis project using a dataset of 3,900 transactions. The objective was to uncover actionable insights into spending patterns, customer segments, product preferences, and subscription behavior to support strategic business decisions.
Project Approach
- Data Preparation & Cleaning (Python)
- Loaded and explored the dataset using pandas.
- Handled 37 missing values in the Review Rating column by imputing with category-specific medians.
- Standardized column names to snake_case, performed consistency checks, and dropped redundant features (e.g., promo_code_used was identical to discount_applied).
- Created new features: age_group and purchase_frequency_days (converted textual frequency into numeric days).
- Loaded the cleaned data into PostgreSQL for efficient querying.
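The cleaning steps above can be sketched in pandas roughly as follows. This is a minimal illustration on a toy frame, not the project's actual code: the raw column names, the frequency-to-days mapping, and the age bins are all assumptions made for the example.

```python
import pandas as pd

# Toy stand-in for the raw dataset; column names and values are assumed.
raw = pd.DataFrame({
    "Customer ID": [1, 2, 3, 4, 5, 6],
    "Age": [19, 34, 52, 67, 41, 28],
    "Category": ["Clothing", "Clothing", "Clothing",
                 "Footwear", "Footwear", "Footwear"],
    "Review Rating": [3.0, None, 5.0, 4.0, 2.0, None],
    "Frequency of Purchases": ["Weekly", "Monthly", "Quarterly",
                               "Annually", "Fortnightly", "Monthly"],
    "Discount Applied": ["Yes", "No", "Yes", "No", "Yes", "No"],
    "Promo Code Used": ["Yes", "No", "Yes", "No", "Yes", "No"],
})


def to_snake_case(name: str) -> str:
    """Lower-case a column name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


df = raw.rename(columns=to_snake_case)

# Fill missing ratings with the median rating of the same product category.
category_median = df.groupby("category")["review_rating"].transform("median")
df["review_rating"] = df["review_rating"].fillna(category_median)

# Drop promo_code_used only if it really duplicates discount_applied.
if (df["promo_code_used"] == df["discount_applied"]).all():
    df = df.drop(columns=["promo_code_used"])

# Assumed mapping from textual frequency to an approximate number of days.
FREQUENCY_DAYS = {
    "Weekly": 7,
    "Fortnightly": 14,
    "Monthly": 30,
    "Quarterly": 90,
    "Annually": 365,
}
df["purchase_frequency_days"] = df["frequency_of_purchases"].map(FREQUENCY_DAYS)

# Assumed age bins; intervals are right-inclusive, e.g. (25, 40].
df["age_group"] = pd.cut(
    df["age"],
    bins=[0, 25, 40, 60, 120],
    labels=["Young Adult", "Adult", "Middle-aged", "Senior"],
)
```

Imputing within each category rather than with a global median keeps a missing rating close to what similar products received, which matters when ratings differ noticeably between categories.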
- SQL Analysis: Answering Key Business Questions. I developed structured SQL queries in PostgreSQL to deliver clear business insights, including:
- Revenue split by gender (Males: ~$157K vs. Females: ~$75K).
- High-spending customers who used discounts.
- Top 5 products by average review rating.
- Shipping type performance (Express vs. Standard).
- Subscribers vs. non-subscribers comparison.
- Discount dependency by product.
- Customer segmentation (New, Returning, Loyal) based on purchase history.
- Revenue contribution by age group.
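A few of these questions can be sketched as follows. To keep the example runnable without a database server, it uses Python's built-in sqlite3 as a stand-in for PostgreSQL; the queries stick to standard SQL that should carry over. The table name `customer`, the column names, the sample rows, and the segment thresholds are all assumptions for illustration, not the project's actual schema or rules.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id        INTEGER PRIMARY KEY,
        gender             TEXT,
        item_purchased     TEXT,
        purchase_amount    REAL,
        discount_applied   TEXT,
        previous_purchases INTEGER
    )
    """
)

# Made-up sample rows, only to exercise the queries.
rows = [
    (1, "Male", "Hat", 40.0, "Yes", 1),
    (2, "Female", "Hat", 25.0, "Yes", 4),
    (3, "Male", "Sneakers", 80.0, "No", 12),
    (4, "Female", "Sneakers", 60.0, "Yes", 30),
    (5, "Male", "Hat", 35.0, "No", 7),
]
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?, ?, ?)", rows)

# Revenue split by gender.
revenue_by_gender = conn.execute(
    """
    SELECT gender, SUM(purchase_amount) AS revenue
    FROM customer
    GROUP BY gender
    ORDER BY revenue DESC
    """
).fetchall()

# Customer segmentation from purchase history (thresholds are assumed).
segment_counts = conn.execute(
    """
    WITH segmented AS (
        SELECT CASE
                   WHEN previous_purchases <= 1  THEN 'New'
                   WHEN previous_purchases <= 10 THEN 'Returning'
                   ELSE 'Loyal'
               END AS segment
        FROM customer
    )
    SELECT segment, COUNT(*) AS customers
    FROM segmented
    GROUP BY segment
    ORDER BY segment
    """
).fetchall()

# Discount dependency: share of each product's orders that used a discount.
discount_dependency = conn.execute(
    """
    SELECT item_purchased,
           ROUND(100.0 * SUM(CASE WHEN discount_applied = 'Yes' THEN 1 ELSE 0 END)
                 / COUNT(*), 1) AS discount_rate_pct
    FROM customer
    GROUP BY item_purchased
    ORDER BY discount_rate_pct DESC
    """
).fetchall()

print(revenue_by_gender)
print(segment_counts)
print(discount_dependency)
```

Putting the segment logic in a CTE keeps the CASE expression in one place, so the same segments can be reused for revenue or subscription breakdowns without repeating the thresholds.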
- Visualization & Storytelling: Built an interactive Power BI dashboard featuring key KPIs, revenue breakdowns by category/age/subscription, sales trends, and filters for dynamic exploration.
Key Business Recommendations
- Strengthen subscription programs with exclusive benefits to convert more loyal customers into subscribers.
- Implement targeted loyalty programs to grow the "Loyal" segment (currently 80% of customers).
- Review discount strategy on high-dependency items (e.g., Hats, Sneakers) to protect margins.
- Focus marketing on high-revenue age groups and customers preferring Express shipping.










