Robin Moffatt

Originally published at rmoff.net

Should you run Kafka Connect in Distributed or Standalone mode?

Kafka Connect can be deployed in two modes: Standalone or Distributed.

I usually recommend Distributed for several reasons:

  • You can run just a single node of it if you want

  • It can scale

  • It is fault-tolerant

  • It can be run on a single node sandbox or a multi-node production environment

  • It is the same configuration method however you run it

I usually find that Standalone is appropriate when:

  • You need to guarantee locality of task execution, such as picking up a log file from a folder on a specific machine

  • You don’t care about scale or fault-tolerance ;-)

  • You like re-learning how to configure something when you realise that you do care about scale or fault-tolerance X-D

My last snarky point on the list is why I recommend Distributed mode even if you’re just playing around with Kafka Connect on a laptop: learn it in Distributed mode and you learn it once, and then you’re all set. If you start with Standalone, where connector configuration is passed to the worker as .properties files at startup, and later move to Distributed, you have to re-learn how to configure connectors through the REST interface.
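To make that re-learning step concrete, here is a minimal sketch of how the same connector configuration looks in each mode. It takes a Standalone-style .properties file and reshapes it into the JSON body that a Distributed worker accepts on `POST /connectors`. The connector name, file path, and topic are hypothetical values chosen for illustration, and the worker URL assumes the default REST port of 8083 on localhost. The request is only built, not sent, so the sketch runs without a Connect worker.

```python
import json
import urllib.request

# Hypothetical connector config in the .properties form used by Standalone mode.
# The name, file path, and topic are illustrative assumptions.
STANDALONE_PROPERTIES = """\
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/test.txt
topic=connect-test
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def to_rest_payload(config):
    """Wrap a flat config in the {"name": ..., "config": {...}} body for POST /connectors."""
    config = dict(config)  # copy so the caller's dict is left untouched
    name = config.pop("name")
    return {"name": name, "config": config}


def build_create_request(payload, base_url="http://localhost:8083"):
    """Build (but do not send) the HTTP request that would create the connector."""
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = to_rest_payload(parse_properties(STANDALONE_PROPERTIES))
request = build_create_request(payload)

print(request.get_method(), request.full_url)
print(json.dumps(payload, indent=2))
```

In Standalone mode the .properties file would be handed to the worker on the command line at startup. In Distributed mode you would send the request above to any worker in the cluster (for example with `urllib.request.urlopen(request)`), and the same REST interface applies whether that cluster is one node on a laptop or many nodes in production.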



Some follow-ups to this:
