DEV Community

Ankit Kumar

Distributed System Configuration Management

This post is about how to configure a system so that its services do not have to be restarted to load new configuration. Every service in a distributed system relies on configuration to operate.

Let's say we have a distributed system with multiple services, each processing message streams from different sources according to its configuration.

What is ZooKeeper?

ZooKeeper is a distributed coordination service that also helps to manage a large set of hosts. Managing and coordinating a service is complicated, especially in a distributed environment, and ZooKeeper addresses this with a simple architecture and a simple API.

At its best, ZooKeeper allows developers to focus on core application logic without worrying about the distributed nature of the application.

What problem does ZooKeeper solve?

  1. Services need a restart to load new configuration, or a separate service has to be built to sync the existing configuration.

  2. Otherwise, API calls have to be made to each service to update its configuration.

Coordination is of utmost importance in distributed applications. In an application involving multiple services, the components of the system need to work together and coordinate to achieve a result.

That's where Apache ZooKeeper is useful. Distributed systems such as Apache Hadoop, Apache Kafka, and Apache Hive use it, or have used it, as a coordinator between their nodes and as a store for shared configuration, state, and metadata.

Architecture of ZooKeeper:

ZooKeeper follows a simple client-server model where clients are nodes (i.e., machines) that make use of the service, and servers are nodes that provide the service.

Znode - Znodes are the fundamental abstraction ZooKeeper provides. Each znode represents a node in a tree-like structure.
A znode can store data as a byte array, and it can have child nodes.
Each znode also carries a metadata structure called Stat. It holds the transaction ID that created the znode, the transaction ID that last modified it, the transaction ID that last changed its children, creation and modification timestamps, version counters for data changes, child changes, and ACL changes, and, for an ephemeral znode, the session ID of its owner.
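
To make the znode and Stat description concrete, here is a minimal in-memory sketch. It is a toy model, not a ZooKeeper client: the class names ZnodeTree and Znode are made up for illustration, and the Stat field names only loosely mirror the ones ZooKeeper reports (czxid, mzxid, pzxid, version, cversion, and so on).

```python
import time
from dataclasses import dataclass, field


@dataclass
class Stat:
    czxid: int                # transaction id that created the znode
    mzxid: int                # transaction id that last modified the data
    pzxid: int                # transaction id that last changed the children
    ctime: float              # creation time
    mtime: float              # last modification time
    version: int = 0          # number of data changes
    cversion: int = 0         # number of child changes
    aversion: int = 0         # number of ACL changes
    ephemeral_owner: int = 0  # owning session id, 0 if not ephemeral


@dataclass
class Znode:
    data: bytes
    stat: Stat
    children: dict = field(default_factory=dict)


class ZnodeTree:
    """Hypothetical in-memory znode tree, for illustration only."""

    def __init__(self):
        self._zxid = 0
        now = time.time()
        self.root = Znode(b"", Stat(0, 0, 0, now, now))

    def _next_zxid(self):
        self._zxid += 1
        return self._zxid

    def _find(self, path):
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node.children[part]  # KeyError if the path is missing
        return node

    def create(self, path, data=b""):
        parent_path, _, name = path.rpartition("/")
        parent = self._find(parent_path)
        if name in parent.children:
            raise ValueError(f"{path} already exists")
        zxid, now = self._next_zxid(), time.time()
        parent.children[name] = Znode(data, Stat(zxid, zxid, zxid, now, now))
        parent.stat.cversion += 1   # the parent's child list changed
        parent.stat.pzxid = zxid

    def set(self, path, data):
        node = self._find(path)
        node.data = data
        node.stat.version += 1      # every data change bumps the version
        node.stat.mzxid = self._next_zxid()
        node.stat.mtime = time.time()

    def get(self, path):
        node = self._find(path)
        return node.data, node.stat


tree = ZnodeTree()
tree.create("/config")
tree.create("/config/batch_size", b"100")
tree.set("/config/batch_size", b"250")
data, stat = tree.get("/config/batch_size")
print(data, stat.version)  # b'250' 1
```

The version counter is what lets a client detect that the data changed between two reads.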

This is what a ZooKeeper tree looks like:

[Image: ZooKeeper znode tree]

Watchers - Clients can set a watcher when they read a znode. When that znode changes, ZooKeeper sends a notification to the client that registered the watcher. A standard watch fires only once, so a client that wants further notifications has to set it again.
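
Watchers are what let a service pick up new configuration without a restart. The sketch below simulates the idea in memory; ConfigStore and Service are hypothetical names, and the store is a stand-in for a ZooKeeper server rather than a real client. It assumes one-shot watches, so the service re-registers its watch every time it reads.

```python
class ConfigStore:
    """Toy stand-in for a ZooKeeper server: one value per path."""

    def __init__(self):
        self._data = {}
        self._watchers = {}  # path -> list of one-shot callbacks

    def get(self, path, watch=None):
        if watch is not None:
            self._watchers.setdefault(path, []).append(watch)
        return self._data.get(path)

    def set(self, path, value):
        self._data[path] = value
        # Watches are one-shot: remove them before notifying.
        for callback in self._watchers.pop(path, []):
            callback(path)


class Service:
    """Keeps its configuration current without restarting."""

    def __init__(self, store, path):
        self.store = store
        self.path = path
        self.config = None
        self.reloads = 0
        self._load()

    def _load(self, _path=None):
        # Read the value and register the watch again in the same call.
        self.config = self.store.get(self.path, watch=self._load)
        self.reloads += 1


store = ConfigStore()
store.set("/config/batch_size", b"100")
service = Service(store, "/config/batch_size")
store.set("/config/batch_size", b"250")  # fires the watch
store.set("/config/batch_size", b"500")  # fires the re-registered watch
print(service.config, service.reloads)   # b'500' 3
```

Reading and re-registering in one step matters: if the two were separate, an update arriving between them would be missed.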

Types of Nodes:
Ephemeral Nodes - These znodes exist only as long as the session that created them is active. When the session ends, they are deleted.

Sequence Nodes - When creating this type of node, ZooKeeper appends a unique, monotonically increasing sequence number to the name.
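
The two node types can be modelled in a few lines. This is a simplified sketch: Server is a hypothetical class with a flat namespace and a single counter, whereas ZooKeeper tracks the sequence per parent znode. The 10-digit zero-padded suffix matches the format ZooKeeper uses for sequence numbers.

```python
class Server:
    """Toy model of ephemeral and sequential znodes."""

    def __init__(self):
        self.nodes = {}     # path -> owning session id (None if persistent)
        self._counter = 0   # simplified: one counter for the whole server

    def create(self, path, session=None, ephemeral=False, sequence=False):
        if sequence:
            # Append a 10-digit, zero-padded sequence number to the name.
            path = f"{path}{self._counter:010d}"
            self._counter += 1
        self.nodes[path] = session if ephemeral else None
        return path

    def close_session(self, session):
        # Ephemeral znodes disappear with the session that created them.
        self.nodes = {p: s for p, s in self.nodes.items() if s != session}


server = Server()
server.create("/workers")
a = server.create("/workers/worker-", session=1, ephemeral=True, sequence=True)
b = server.create("/workers/worker-", session=2, ephemeral=True, sequence=True)
print(a)  # /workers/worker-0000000000
print(b)  # /workers/worker-0000000001

server.close_session(1)
print(sorted(server.nodes))  # ['/workers', '/workers/worker-0000000001']
```

Combining both flags is common: each live worker gets a unique name, and the name vanishes when the worker's session ends.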

Some of the key znode features:

[Image: key znode features]

Some of the common ZooKeeper recipes:

  1. Centralised, reliable configuration management across an environment
  2. Leader election
  3. Naming service
  4. Message queuing
  5. Managing a notification system
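
As an illustration of the leader election recipe, the sketch below follows the usual approach: every candidate creates an ephemeral sequential znode, and the candidate holding the lowest sequence number is the leader. ElectionServer and its methods are hypothetical names for an in-memory model. A real implementation would also have each candidate watch its predecessor's znode, which is left out here.

```python
class ElectionServer:
    """Toy model of leader election with ephemeral sequential znodes."""

    def __init__(self):
        self._next_seq = 0
        self.candidates = {}  # znode name -> session id

    def join(self, session_id):
        # Each candidate gets a znode with the next sequence number.
        name = f"candidate-{self._next_seq:010d}"
        self._next_seq += 1
        self.candidates[name] = session_id
        return name

    def leave(self, session_id):
        # Session ended: its ephemeral znode is removed.
        self.candidates = {
            n: s for n, s in self.candidates.items() if s != session_id
        }

    def leader(self):
        if not self.candidates:
            return None
        # Zero-padded names sort in sequence order, so min() is the oldest.
        return self.candidates[min(self.candidates)]


election = ElectionServer()
for service in ("svc-a", "svc-b", "svc-c"):
    election.join(service)

print(election.leader())  # svc-a
election.leave("svc-a")   # the leader's session ends
print(election.leader())  # svc-b
```

Because the leader's znode is ephemeral, a crashed leader is replaced without any manual cleanup.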

Several Hadoop projects already use ZooKeeper to coordinate the cluster and provide highly available distributed services.
Apache HBase, for example, uses ZooKeeper to track the master, the region servers, and the status of data distributed throughout the cluster.

For Zookeeper Installation and References:
Zookeeper