DEV Community

Atakan Demircioğlu

Twitter snowflake approach is cool

I was researching a solution to generate unique IDs and I liked the Twitter snowflake approach. These are my notes about this approach.

What is Twitter’s snowflake approach?

It is a scheme for generating unique IDs in distributed systems. Twitter uses it for Tweets, DMs, Lists, and more.

  • IDs are unique and sortable.
  • IDs embed a timestamp, so sorting by ID orders them roughly by creation time.
  • IDs fit in a 64-bit unsigned integer.
  • IDs contain only numeric values.

Each 64-bit ID is split into four fields:

Sign bit (1 bit): Reserved bit, always 0. It is held back for future use, and keeping it at 0 means the ID stays positive when stored as a signed 64-bit integer.

Timestamp (41 bits): Milliseconds elapsed since a custom epoch (Snowflake’s default epoch is Nov 04, 2010, 01:42:54 UTC).

Machine ID (10 bits): accommodates up to 1024 (2¹⁰) machines.

Sequence number (12 bits): A local counter on each machine that increments by 1 for every ID generated. It resets to 0 every millisecond. Theoretically, a machine can generate a maximum of 4096 (2¹²) new IDs per millisecond.
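The field layout above can be sketched as a small generator. This is only an illustration, not Twitter's actual code: the names `SnowflakeGenerator`, `decode`, and the injectable `clock` parameter are made up for this example, and the way it handles a clock that moves backwards or a sequence that overflows is an assumption of the sketch.

```python
import threading
import time

TWITTER_EPOCH_MS = 1288834974657  # Nov 04, 2010, 01:42:54 UTC, in Unix milliseconds
MACHINE_ID_BITS = 10
SEQUENCE_BITS = 12
MAX_MACHINE_ID = (1 << MACHINE_ID_BITS) - 1  # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1      # 4095


class SnowflakeGenerator:
    """Hypothetical single-process generator following the 1/41/10/12 layout."""

    def __init__(self, machine_id, epoch_ms=TWITTER_EPOCH_MS, clock=None):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id must fit in 10 bits")
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms
        # clock returns Unix time in milliseconds; injectable so it can be tested
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Assumption of this sketch: refuse to issue IDs rather than risk duplicates
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already issued in this millisecond: wait for the next one
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.epoch_ms
            # sign bit stays 0 | 41-bit timestamp | 10-bit machine ID | 12-bit sequence
            return (
                (timestamp << (MACHINE_ID_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )


def decode(snowflake, epoch_ms=TWITTER_EPOCH_MS):
    """Split an ID back into (unix_ms, machine_id, sequence)."""
    unix_ms = (snowflake >> (MACHINE_ID_BITS + SEQUENCE_BITS)) + epoch_ms
    machine_id = (snowflake >> SEQUENCE_BITS) & MAX_MACHINE_ID
    sequence = snowflake & MAX_SEQUENCE
    return unix_ms, machine_id, sequence
```

Calling `SnowflakeGenerator(machine_id=5).next_id()` returns an integer, and `decode` recovers the three fields from it. Because the timestamp occupies the highest bits, IDs from one generator compare in the order they were created.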

Advantages & Disadvantages of the Twitter Snowflake Approach

  • IDs are 64 bits long, half the size of a 128-bit UUID.
  • Scalable (it can accommodate 1024 machines).
  • Highly available (each machine can generate 4096 unique IDs each millisecond).
  • Some UUID versions do not include a timestamp. Compared with those, Twitter Snowflake has the advantage of being sortable.
  • The design requires ZooKeeper (disadvantage).
  • The generated IDs are not random like UUIDs, so future IDs can be predictable.
  • A 41-bit millisecond timestamp covers only about 69 years from the epoch. A different solution is needed after that :)
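The figures in this list follow directly from the bit widths. A quick back-of-the-envelope check, assuming an average year of 365.25 days and Snowflake's default epoch (the variable names are only for illustration):

```python
from datetime import datetime, timedelta, timezone

TWITTER_EPOCH_MS = 1288834974657       # Nov 04, 2010, 01:42:54 UTC
MAX_TIMESTAMP_MS = (1 << 41) - 1       # largest value a 41-bit field can hold

# How long a 41-bit millisecond counter lasts
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25
years = MAX_TIMESTAMP_MS / MS_PER_YEAR

# The last moment representable with the default epoch
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
snowflake_epoch = unix_epoch + timedelta(milliseconds=TWITTER_EPOCH_MS)
last_moment = snowflake_epoch + timedelta(milliseconds=MAX_TIMESTAMP_MS)

# Theoretical ceiling across the whole cluster: 1024 machines x 4096 IDs each
ids_per_ms_cluster = (1 << 10) * (1 << 12)

print(f"timestamp range: about {years:.1f} years")
print(f"timestamps run out in {last_moment.year}")
print(f"cluster-wide ceiling: {ids_per_ms_cluster} IDs per millisecond")
```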

Usage Notes

  • Discord uses snowflakes, with their epoch set to the first second of the year 2015.
  • Instagram uses a modified version of the format, with 41 bits for a timestamp, 13 bits for a shard ID, and 10 bits for a sequence number.
  • Mastodon’s modified format has 48 bits for a millisecond-level timestamp and uses the UNIX epoch. The remaining 16 bits are for sequence data.
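These variants differ mainly in their epoch and in how many bits sit below the timestamp, so reading the creation time back out of an ID looks the same everywhere. A minimal sketch, where `created_at` is a hypothetical helper and the 22-bit shift for Discord is an assumption (it matches Twitter's layout); Instagram would need a shift of 23 (13 + 10) plus its own epoch, which is not given here:

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

DISCORD_EPOCH_MS = 1420070400000  # first second of 2015, in Unix milliseconds
MASTODON_EPOCH_MS = 0             # Mastodon uses the UNIX epoch directly


def created_at(snowflake, epoch_ms, low_bits):
    """Return the creation time encoded in an ID.

    epoch_ms -- the service's custom epoch, in Unix milliseconds
    low_bits -- number of non-timestamp bits below the timestamp field
    """
    ms_since_epoch = snowflake >> low_bits
    return UNIX_EPOCH + timedelta(milliseconds=ms_since_epoch + epoch_ms)


# Made-up IDs for illustration: 5 seconds after each service's epoch
discord_like = (5000 << 22) | 123   # assumed 22 low bits
mastodon_like = (5000 << 16) | 7    # 16 bits of sequence data

print(created_at(discord_like, DISCORD_EPOCH_MS, 22))
print(created_at(mastodon_like, MASTODON_EPOCH_MS, 16))
```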
