There seems to be quite a lot of hype around Kafka at the moment, at least if you take the number of conference talks as an indicator.
I don't have any experience with it yet, but I saw one speaker who claimed it could not scale by design.
What are your experiences with it? Is it able to scale to thousands of users and millions of messages easily?
Top comments (7)
Well, I'm surprised to hear "could not scale by design". We are in the process of implementing it for our enterprise and have seen the exact opposite. Straight from the Apache website: "The disk structures Kafka uses scale well—Kafka will perform the same whether you have 50 KB or 50 TB of persistent data on the server." The logs are written to disk, and RocksDB is used alongside them for certain cached data storage (as far as I know, that is the Kafka Streams state stores rather than the brokers themselves).
A Kafka partition is processed by only one thread: within a consumer group, at most one consumer reads a given partition. To me, Kafka starts to get interesting when you distribute the load (horizontal scaling). But what about vertical scaling? The question in the post was supposed to start a serious debate.
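To make the one-consumer-per-partition point concrete, here is a minimal sketch in plain Python with no Kafka client involved. It assumes a simplified round-robin assignment, which is not Kafka's actual assignment strategy, and the names `assign_partitions` and `active_consumers` are hypothetical. It only illustrates the rule that each partition has a single owner within a consumer group, so consumers beyond the partition count sit idle.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers round-robin.

    Simplified assumption, not Kafka's real assignor.
    Every partition ends up with exactly one owner.
    """
    assignment = {c: [] for c in consumers}
    if not consumers:
        return assignment
    for p in range(num_partitions):
        owner = consumers[p % len(consumers)]
        assignment[owner].append(p)
    return assignment


def active_consumers(assignment):
    """Count consumers that own at least one partition."""
    return sum(1 for parts in assignment.values() if parts)


if __name__ == "__main__":
    # A topic with 4 partitions, read by groups of different sizes.
    for n in (2, 4, 8):
        consumers = [f"consumer-{i}" for i in range(n)]
        result = assign_partitions(4, consumers)
        print(n, "consumers ->", active_consumers(result), "active")
```

With 4 partitions, going from 4 to 8 consumers leaves the number of active consumers at 4. Under this model, scaling out further means adding partitions, not just consumers.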
Yes, I was surprised as well.
I ran a 25-node cluster at my last gig, and I'd say that you can scale Kafka really well. We were somewhere in the billion(s) of events per day with plenty of headroom. I'd love to see how LinkedIn scales Kafka though 🤩
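For a sense of scale, here is a quick back-of-the-envelope calculation. It assumes exactly one billion events per day, spread perfectly evenly over 24 hours and over all 25 nodes. Real traffic is bursty and the comment gives only an order of magnitude, so treat these as rough averages, not measured numbers. The helper name `average_rate` is made up for this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(events_per_day, nodes=1):
    """Average events per second per node, assuming perfectly even load."""
    return events_per_day / SECONDS_PER_DAY / nodes


if __name__ == "__main__":
    cluster = average_rate(1_000_000_000)
    per_node = average_rate(1_000_000_000, nodes=25)
    print(f"cluster:  {cluster:,.0f} events/s")
    print(f"per node: {per_node:,.0f} events/s")
```

Each billion events per day works out to roughly 11,600 events per second across the cluster, or about 460 per node on average.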
Kubernetes did all the work, not Kafka.
I would think that if there were any criticism, it would be the ops effort first. Second, there are organizational limitations, such as a fairly low ceiling on the number of topics (tens of thousands) because of ZooKeeper constraints.
But I haven't run it in production... this is just from my research. If I get to the necessary scale, I would certainly consider it for microservice communication (as opposed to request-reply) or stream processing.