Hey devs,
I'm currently working on an AI-powered movie streaming platform, named castle app, and I'm looking for advice on optimizing personalized recommendations and adaptive streaming for a seamless user experience.
For recommendations, we're using a combination of:

- Collaborative filtering (matrix factorization via SVD, autoencoders)
- Content-based filtering (TF-IDF, word embeddings)
- Deep learning models (transformers for sequence-based recommendations)
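For anyone unfamiliar with the content-based side, here's a minimal, dependency-free sketch of TF-IDF plus cosine similarity. The movie titles and descriptions are made-up toy data, and a real pipeline would use a library vectorizer (or embeddings) over full metadata, but the core math is this:

```python
import math
from collections import Counter

# Toy movie descriptions (hypothetical data, for illustration only).
movies = {
    "Heat": "bank heist crime thriller detective",
    "Ronin": "heist car chase thriller mercenary",
    "Up": "balloon adventure family animation",
}

def tf_idf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts keyed by term) with log IDF."""
    n = len(docs)
    tokenized = {name: text.split() for name, text in docs.items()}
    df = Counter()
    for toks in tokenized.values():
        df.update(set(toks))  # document frequency: count each term once per doc
    vecs = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        vecs[name] = {t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()}
    return vecs

def cosine(a, b):
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vecs = tf_idf_vectors(movies)
# Shared "heist"/"thriller" terms should pull Heat toward Ronin, not Up.
print(cosine(vecs["Heat"], vecs["Ronin"]) > cosine(vecs["Heat"], vecs["Up"]))  # → True
```

Terms that appear in every document get an IDF of zero, so they drop out of the similarity automatically, which is the main thing TF-IDF buys you over raw term counts.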
The challenge is balancing real-time recommendations with scalability. We’re running inference with TensorFlow Serving, but we’re considering moving to Triton Inference Server for better GPU utilization. Has anyone tested both at scale? Any suggestions for improving inference speed in a high-traffic environment?
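Whichever server you land on, the biggest single lever for GPU utilization at high traffic is usually dynamic batching: hold each request a few milliseconds so concurrent requests share one forward pass. Triton supports this natively via `dynamic_batching` in the model config, and TensorFlow Serving has a batching config as well. As a rough illustration of the idea (not either server's actual implementation), here's a client-agnostic micro-batcher; the parameter values are assumptions you'd tune against your latency SLO:

```python
import time
from queue import Queue, Empty

def batch_requests(q, max_batch=8, max_wait_s=0.005):
    """Collect up to max_batch queued requests, waiting at most max_wait_s
    for stragglers after the first request arrives. This mirrors what a
    server-side dynamic batcher does before invoking the model once."""
    batch = [q.get()]  # block until at least one request is available
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except Empty:
            break  # deadline hit with no new requests
    return batch

q = Queue()
for i in range(10):
    q.put(f"req-{i}")
print(len(batch_requests(q)))  # → 8 (one full batch; two requests stay queued)
```

The trade-off is explicit: `max_wait_s` adds bounded latency to the fastest requests in exchange for much higher throughput per GPU, so benchmark p99 latency at your target QPS rather than single-request latency.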
For adaptive streaming, we currently use:

- H.265/HEVC encoding for efficient compression
- FFmpeg with neural network optimizations
- Reinforcement learning for dynamic bitrate selection
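To make the RL bullet concrete, here's a deliberately tiny tabular Q-learning sketch for bitrate selection. Everything here is a toy assumption: the bitrate ladder, the synthetic bandwidth trace, the one-dimensional throughput-bucket state, and the reward shaping. A production DQL agent would use a richer state (buffer level, throughput history, chunk sizes) and a neural Q-network, but the update rule is the same:

```python
import random

# Hypothetical bitrate ladder in Mbps (illustrative values).
LADDER = [0.3, 0.75, 1.5, 3.0, 6.0]

def bucket(mbps):
    """Crude 1-Mbps-wide throughput buckets as the agent's state."""
    return min(int(mbps), 7)

def reward(bitrate, throughput):
    # Reward picking high quality, heavily penalize exceeding available
    # bandwidth (a stand-in for rebuffering risk).
    return bitrate - 4.0 * max(0.0, bitrate - throughput)

random.seed(0)
Q = [[0.0] * len(LADDER) for _ in range(8)]  # Q[state][action]
alpha, gamma, eps = 0.2, 0.9, 0.2

trace = [random.uniform(0.5, 7.0) for _ in range(5000)]  # synthetic bandwidth
for t in range(len(trace) - 1):
    s = bucket(trace[t])
    if random.random() < eps:                      # epsilon-greedy exploration
        a = random.randrange(len(LADDER))
    else:
        a = max(range(len(LADDER)), key=lambda i: Q[s][i])
    r = reward(LADDER[a], trace[t])
    s2 = bucket(trace[t + 1])
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])  # Q-learning update

def best(s):
    return max(range(len(LADDER)), key=lambda i: Q[s][i])

# A high-throughput state should learn to prefer a higher rung than a starved one.
print(best(0), best(6))
```

Even this toy version shows why RL is attractive here: the reward function encodes the quality/rebuffering trade-off directly, instead of hand-tuning thresholds per network profile.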
We’re exploring AI-driven real-time bitrate adjustments based on network conditions. Would it be better to integrate Deep Q-Learning (DQL) or a simpler heuristic-based model for balancing video quality and latency?
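For comparison, the heuristic alternative can be very small. This is a sketch in the spirit of throughput-plus-buffer schemes (BBA-style): take a safety-discounted throughput estimate, pick the highest rung that fits, and let buffer occupancy nudge the choice. The ladder, safety factor, and buffer thresholds are all made-up values you'd tune:

```python
# Hypothetical bitrate ladder in Mbps (illustrative values).
LADDER = [0.3, 0.75, 1.5, 3.0, 6.0]

def pick_bitrate(est_throughput_mbps, buffer_s,
                 safety=0.8, low_buffer_s=5.0, high_buffer_s=20.0):
    """Heuristic ABR: throughput-capped choice, adjusted by buffer health."""
    budget = est_throughput_mbps * safety
    if budget < LADDER[0]:
        idx = 0  # can't go lower than the bottom rung
    else:
        idx = max(i for i, b in enumerate(LADDER) if b <= budget)
    if buffer_s < low_buffer_s and idx > 0:
        idx -= 1   # near rebuffer: step down defensively
    elif buffer_s > high_buffer_s and idx < len(LADDER) - 1:
        idx += 1   # healthy buffer: probe one rung up
    return LADDER[idx]

print(pick_bitrate(4.0, 12.0))  # → 3.0 (budget 3.2 Mbps, neutral buffer)
print(pick_bitrate(4.0, 2.0))   # → 1.5 (same budget, starving buffer)
```

A rule of thumb from this trade-off: a heuristic like the above is transparent, debuggable, and good enough for most traffic, while DQL mainly pays off when you can train on real playback traces and your network conditions are volatile enough that fixed thresholds keep misfiring. Shipping the heuristic first also gives you the baseline and the logging you'd need to train and evaluate an RL policy anyway.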
Any insights, benchmarks, or real-world experience would be appreciated! Looking forward to hearing your thoughts.