Manually managing an e-commerce catalog with 80,000 products, 11 different suppliers, and 5 languages is not just inefficient — it's a direct risk to your business margins. Each supplier sends price lists in different formats (.csv, .xlsx) with inconsistently named columns, making it practically impossible to keep prices, wholesale costs, and stock availability updated by hand.
To solve this, I built a Python and Pandas workflow based on three technical pillars: backup management, dynamic price optimization, and automatic product onboarding.
1. Data Architecture: Safety and Golden Backup
Catalog integrity starts with two strategic exports from Matrixify:
- Safety Backup (Daily): Contains only the essential columns (barcode, ID, handle, SKU, price, and quantity). This is the file processed daily for fast syncs.
- Golden Backup (Weekly): The complete "source of truth" for the entire site, imported into a SQL database for deeper analysis and duplicate management.
Each script cross-references the supplier file with the corresponding Safety Backup, ensuring that only fresh, relevant data is touched — without ever overwriting the full catalog unnecessarily.
2. Dynamic Pricing and Margin Algorithm
The selling price calculation is not linear. It follows brand-specific business rules, handled by a single method — apply_discount_or_margin — that covers two main scenarios:
- Standard Logic: The script starts from the wholesale cost, applies a markup, and verifies that the minimum guaranteed margin is respected.
- Imposed Price Logic: For brands that define their own retail price, the script validates the residual margin and checks that the applied discount does not exceed contractual thresholds.
def apply_discount_or_margin(
self,
discount: Optional[float], # None for brands with imposed prices
markup: float = 0.5,
min_margin: float = 40, # Minimum guaranteed margin in €
max_margin: Optional[float] = None,
vat: float = 1.22,
extra_discount: Optional[float] = None,
max_discount: Optional[float] = None
) -> pd.DataFrame:
3. Surgical Operations with Pandas
Scripts are designed to intervene only where needed, avoiding unnecessary catalog overwrites.
-
"Surgical" SEO Update: Instead of reloading the entire database, the script identifies only products with missing meta-tags. This is done via a
merge(how='left')against the backup; only incomplete records are filtered and updated, keeping Shopify imports fast and lightweight.
mask = (
backup_df["Body HTML"].isna() |
backup_df["SEO Title"].isna() |
backup_df["SEO Description"].isna()
)
to_update = backup_df[mask]
-
New Product Detection: The system compares supplier barcodes against the Safety Backup. Using a
merge(how='right')and filtering forright_onlyrows in the_mergecolumn, the script isolates products not yet in the system and automatically populates them with prices, quantities, and SEO-ready descriptions.
new_products = supplier_df[
~supplier_df["Barcode"].isin(backup_df["Barcode"])
]
4. Orchestration: Crontab, SFTP, and Matrixify
The scripts don't run manually — the entire workflow is fully automated through three components working in sync:
- Crontab schedules and triggers each Python script on my NAS server at defined intervals — daily for price and quantity updates, with separate jobs for SEO and new product detection.
- The scripts output the processed CSV files directly to a dedicated SFTP server.
- Matrixify is connected to the SFTP and runs on a precisely configured schedule, automatically picking up the files and importing them into Shopify.
The result is a zero-touch pipeline: from the supplier's price list landing on the server to the products being updated on Shopify, no manual intervention is needed at any stage.
[Crontab on NAS]
↓ triggers Python scripts
[Processed CSV files]
↓ deposited to SFTP
[Matrixify scheduled import]
↓ auto-imports into Shopify
[Live catalog updated]
Results
This approach reduces hours of manual work to a few minutes of execution, eliminating human errors on markups and keeping the catalog continuously updated and SEO-optimized.
Do you manage an e-commerce catalog with Python, or have you built a similar workflow? Share your approach in the comments — I'd love to compare notes.

Top comments (0)