Day 9 of Phase 2 (AI System Building) focused on implementing a collaborative-filtering recommendation system using ALS (alternating least squares).
User interactions were mapped to rating values (purchase = 3, cart = 2, view = 1) so that the rating reflects the strength of the implicit feedback. The ALS model was trained on a controlled subset of users to avoid out-of-memory errors in a shared, serverless environment.
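The event-to-rating mapping can be illustrated with a minimal plain-Python sketch. It is not the Spark code used in the project. The function name `to_ratings`, the sample events, and the rule that the strongest event wins when a user touches the same product several times are all assumptions made for illustration; only the three weights come from the text above.

```python
from collections import defaultdict

# Weights taken from the text: purchase = 3, cart = 2, view = 1.
EVENT_WEIGHTS = {"purchase": 3, "cart": 2, "view": 1}


def to_ratings(events):
    """Collapse (user_id, product_id, event_type) rows into one rating per pair.

    Assumption: when a pair has several events, the strongest one wins.
    Event types that are not in EVENT_WEIGHTS are skipped.
    """
    ratings = defaultdict(int)
    for user_id, product_id, event_type in events:
        weight = EVENT_WEIGHTS.get(event_type)
        if weight is None:
            continue
        key = (user_id, product_id)
        ratings[key] = max(ratings[key], weight)
    return dict(ratings)


# Hypothetical sample events, for illustration only.
events = [
    (1, 10, "view"),
    (1, 10, "cart"),
    (1, 11, "view"),
    (2, 10, "purchase"),
    (2, 12, "wishlist"),  # not in the mapping, so it is dropped
]
ratings = to_ratings(events)
print(ratings)
```

Taking the maximum is only one option; summing the weights or counting events would also be reasonable, and the right choice depends on how the interaction data is distributed.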
Initial attempts using StringIndexer caused a model size overflow because the ID columns have very high cardinality. Casting the user and product IDs directly to numeric types resolved this issue. Training on the full dataset then resulted in heap memory errors, so the users were sampled and the product pool was limited to stabilize computation.
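The two fixes described above (numeric casting and shrinking the training set) can be sketched in plain Python. This is an illustration of the logic under stated assumptions, not the Databricks code: the helper names `cast_ids` and `shrink`, the sample rows, the fixed seed, and the choice to keep the most frequently interacted-with products as the product pool are all hypothetical.

```python
import random
from collections import Counter


def cast_ids(rows):
    """Cast string IDs to int, skipping rows whose IDs are not numeric."""
    out = []
    for user_id, product_id, rating in rows:
        try:
            out.append((int(user_id), int(product_id), float(rating)))
        except (TypeError, ValueError):
            continue
    return out


def shrink(rows, max_users, max_products, seed=42):
    """Keep a random sample of users and a capped pool of products.

    Assumption: the product pool is the set of products with the most
    interactions among the sampled users.
    """
    users = sorted({u for u, _, _ in rows})
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    kept_users = set(rng.sample(users, min(max_users, len(users))))
    sampled = [r for r in rows if r[0] in kept_users]
    counts = Counter(p for _, p, _ in sampled)
    kept_products = {p for p, _ in counts.most_common(max_products)}
    return [r for r in sampled if r[1] in kept_products]


# Hypothetical raw rows with string IDs, for illustration only.
raw = [
    ("101", "9001", "3"),
    ("101", "9002", "1"),
    ("102", "9001", "2"),
    ("abc", "9003", "1"),  # non-numeric user ID, dropped by cast_ids
    ("103", "9002", "1"),
    ("103", "9004", "1"),
]
clean = cast_ids(raw)
small = shrink(clean, max_users=2, max_products=2)
print(len(clean), len(small))
```

Casting avoids building any lookup table at all, which is why it sidesteps the cardinality problem, but it only works when the raw IDs are already numeric strings.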
Because Unity Catalog restricts nested array rendering, manual candidate scoring and window-based ranking were implemented to generate Top-5 recommendations per user. Products a user had already interacted with were removed to ensure novelty, which left some users with fewer than five recommendations because the limited candidate pool did not cover enough unseen products.
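The manual scoring and ranking step can be sketched as follows. This plain-Python version only illustrates the logic: it assumes each user and product has a latent factor vector (as ALS produces), scores a candidate by the dot product of the two vectors, drops products the user has already seen, and keeps the five best. The function name `top_n`, the factor values, and the tie-breaking rule are hypothetical. In Spark, the ranking part would typically be expressed with `row_number()` over a `Window` partitioned by user, which is presumably what "window-based ranking" refers to.

```python
def top_n(user_factors, item_factors, seen, n=5):
    """Score every candidate product per user, drop seen ones, keep the n best.

    Returns {user_id: [(product_id, rank), ...]} with rank starting at 1.
    """
    recs = {}
    for user_id, u_vec in user_factors.items():
        already_seen = seen.get(user_id, set())
        scored = [
            (sum(a * b for a, b in zip(u_vec, i_vec)), product_id)
            for product_id, i_vec in item_factors.items()
            if product_id not in already_seen
        ]
        # Highest score first; ties are broken by product ID for determinism.
        scored.sort(key=lambda sp: (-sp[0], sp[1]))
        recs[user_id] = [
            (product_id, rank)
            for rank, (_, product_id) in enumerate(scored[:n], start=1)
        ]
    return recs


# Hypothetical two-dimensional factors, for illustration only.
user_factors = {1: [1.0, 0.0], 2: [0.0, 1.0]}
item_factors = {
    10: [0.9, 0.1],
    11: [0.2, 0.8],
    12: [0.5, 0.5],
    13: [0.7, 0.3],
    14: [0.1, 0.9],
    15: [0.4, 0.6],
}
seen = {1: {10}, 2: {11, 12, 14}}

recs = top_n(user_factors, item_factors, seen)
print(recs[1])  # user 1 gets a full list of five
print(recs[2])  # user 2 gets only three: the pool has just three unseen products
```

User 2 in this example shows the effect described above: once seen products are filtered out of a small candidate pool, fewer than five candidates remain.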
Throughout implementation, ChatGPT supported architectural decisions, memory optimization, and troubleshooting within Databricks.








