DEV Community

Artem Gurtovoi
Artem Gurtovoi

Posted on

3 2 1 2 2

Distributed Peer Indexing

Problem

Modulo partitioning algorithm taks.id % replicas == index requires knowing the number of task processing instances running in the cluster and the own index of the current instance.

Forces

  1. Static configuration is not an option (due to dynamic scaling / failover).
  2. In a distributed system, there is no concept of a global current time.

Solution

An algorithm that emits (index, replicas) once per interval seconds, using a common Redis key and atomic increment.

Define the following parameters:

  • name: a name of the task processing (e.g. mail-sender)
  • interval: indexing interval that is deliberately greater than the expected clock skew among instances

At the start of each interval in Unix epoch:

  1. Calculate an ordinal number of the current interval: number = ceil(now() / interval)
  2. Compose a key as {name}:{number}
  3. Atomically increment a key in Redis (INCR)
  4. If index is defined (see 5)
    • Get the value of the previous key {name}:{number-1} as replicas
    • If the replicas is defined, emit (index, replicas) algorithm result
  5. Store the response (3) in index.

Safe index transition

If the index or replicas changes, the algorithm consumer must stop consuming new tasks and execute safe index transition, to prevent task duplication or loss.

The transition can be implemented in a manner similar to the described algorithm, using a dedicated {name}:transition key. However, this process is considered outside the scope of this document.

Extension

If the system clocks are too precisely synchronized (skew is less than a round-trip time to Redis), this may result in continuous index transitions.

To mitigate this, the algorithm can be extended with a random delay:

  1. Before starting the algorithm, define a random constant clock skew, significantly smaller than interval: skew = random() * (interval / 2)
  2. Start the algorithm at each interval + skew.

Caveats

  • The first result will become known between interval and interval × 2 seconds.

Hostinger image

Get n8n VPS hosting 3x cheaper than a cloud solution

Get fast, easy, secure n8n VPS hosting from $4.99/mo at Hostinger. Automate any workflow using a pre-installed n8n application and no-code customization.

Start now

Top comments (0)

AWS Q Developer image

Your AI Code Assistant

Automate your code reviews. Catch bugs before your coworkers. Fix security issues in your code. Built to handle large projects, Amazon Q Developer works alongside you from idea to production code.

Get started free in your IDE

👋 Kindness is contagious

Explore a trove of insights in this engaging article, celebrated within our welcoming DEV Community. Developers from every background are invited to join and enhance our shared wisdom.

A genuine "thank you" can truly uplift someone’s day. Feel free to express your gratitude in the comments below!

On DEV, our collective exchange of knowledge lightens the road ahead and strengthens our community bonds. Found something valuable here? A small thank you to the author can make a big difference.

Okay