
💡 The Motivation
Let’s be real. Nobody likes waiting in queues, especially not the ones where money’s on the line and fraudsters are ...
Amazing work creating this efficient pipeline with clear implementation details! What inspired the choice of tools and techniques used here?
Hey Nevo, I'm glad you liked my implementation!
The choice of tools was shaped by both hands-on experience and practical constraints. I chose kNN over Random Forest based on a comparative study of model performance on this particular dataset, which I found on Kaggle, even though Random Forest usually stands out for fraud and anomaly detection problems.
For the pipeline, I specifically used Kafka because I was already familiar, at least in theory, with its high throughput. Kafka's streaming capabilities address real-world credit card transaction scenarios, where thousands of transactions occur each second at irregular intervals. After a thorough review of multiple research papers, Apache Flink stood out, but I had already started my implementation, and my faculty guide also supported my using Kafka because of its wider adoption in the industry.
I chose Docker mainly for consistency and ease of deployment. For a project like this that mimics production behavior, using containers felt closer to how things would run in the real world.
Great blog
Thank you Vedangi!
Love the little jokes! If you need help, hit me up. And nice project for the seminar!
Thank you Arya!
Great work Ahad!
sensational
Loved how you combined technical depth with humour!! Great Project!
Thank you Rachit!
honestly, an amazing blog, one of the best reads in a long time.
great work Ahad!
Thank you Vishvesh!