rhymes

Posted on Sep 7, 2019 • Edited on Jan 23, 2020

What would you use as a sortable, globally unique, ID?

#distributedsystems #discuss

The objective is to have an ID that has the following two inherent properties:

globally unique: if two different nodes generate an ID, the possibilities for them to conflict are practically zero (hence no autoincrementing IDs)
sortable: the IDs, even those generated by two different nodes, can be sorted or partially sorted making it far easier to order events (hence no UUIDs)

What would you use?

Top comments (39)

Bertil Muth • Sep 7 '19

Why not use two fields, a UUID and a timestamp?

rhymes • Sep 7 '19

I like the lateral thinking going on in your question :)

That's the status quo though, which works decently but it still requires two fields.

What if you could look at two IDs and roughly know which one comes first? What if the IDs carried with them both info? The timing and the uniqueness...

Bertil Muth • Sep 7 '19 • Edited

Hehe. Then what about the brute force approach, concatenation of timestamp and UUID, using an intermediate character guaranteed not to occur in one of them?

Edit: lexical and chronological sorting of ISO8601 date times is identical. So if you start with that, then the UUID, does that solve the problem?

rhymes • Sep 7 '19

Then what about the brute force approach, concatenation of timestamp and UUID, using an intermediate character guaranteed not to occur in one of them?

Something like this could work :)

You're really close to the solutions I found searching on the internet, take a look at github.com/segmentio/ksuid for example

Dian Fay • Sep 7 '19

You would have to perform more complex comparisons to see if ids match, since you lose the binary representation. v1 and v1mc UUIDs are sort of ordered but not necessarily between multiple nodes, and they cycle fairly rapidly.

Jimmy Nilsson did an interesting experiment back in the day with adding the timestamp inside the UUID itself.

rhymes • Sep 7 '19

First: it's a little bit funny to read an almost 20 years old article that talks about RAM in order of megabytes :D

I love how hacky his solution was, though it's a little bit weird that the timestamp is at the end, but that's just the order of the two casts.

This part was funny as well:

A DATETIME has the "precision" of 1/300th of a second, and it's no problem for SQL Server to generate five GUIDs in 3ms with this algorithm

5 IDs in 3 ms. Oh, the year 2000 :D

Thanks for linking this, it was such a cool read!

As an aside I wonder how slower the uuid type in PostgreSQL is in comparison to integer primary keys, today.

Bertil Muth • Sep 7 '19

That’s actually what I meant: concatenation of a timestamp with a UUID to generate a new UUID. Or maybe I just don’t understand your comment :-)

Dian Fay • Sep 7 '19

When you said "intermediate character" I took it to mean a string 1567876931|11aca736-c9a9-4266-a2f6-13bd07202c09 or some such.

Bertil Muth • Sep 7 '19

That’s what I meant. I wasn’t aware of the restrictions placed on UUIDds until now that I looked it up. I just thought it meant “universally unique”, which would be the case with my proposal, I think. So thanks. One more thing I learned.

Dustin King • Sep 8 '19 • Edited

There are probably better answers, but a concatenation of:

timestamp (yyyy-mm-ddThh:mm:ssZ)
sequence number (of item generated by same node at same timestamp, left padded with zeroes so the numbers are sortable asciibettically)
node ID (also zero-padded if numeric)

Maybe include milliseconds or more depending on how fast and frequently items are generated. Also maybe node ID before sequence number depending on whether you'd rather have ones with the same node ID together, or have the same sequence number together for a given timestamp.

Something like:

2019-09-08T22:17:41Z;00003;grayarea

Edit to add: Maybe use _ to separate the fields to make it more distinct:

2019-09-08T22:17:41Z_00003_grayarea

rhymes • Sep 9 '19

Thanks Dustin, the idea of using a timestamp is pretty popular and I agree.

The problem with using a sequence generator though is predictability and also collision, which to be avoided would require coordination which is not feasible. That's why you shouldn't use things like custom random functions for secrets :D

The node ID (whatever we want to use) is also another predictable part. We don't want your node to generate IDs posing as my node, for example.

It's a really tricky thing to come up with something that's random, secure and portable.

UUIDv1 had the timestamp and the node ID (the MAC address) but it wasn't secure because MAC addresses could be guessed and spoofed.

All the implementations I saw around have a completely random part attached to the encoded timestamp, which I guess is the best way to avoid those issues.

Some have suggested ULID directly or variations on the same tune (encoded timestamp plus completely random string) for those reasons.

Dustin King • Sep 9 '19

I guess I was assuming the nodes were in a relatively trusted environment. Like, maybe distributed geographically but communicating on a secure network, where nodes are owned by the same entity. Also being able to rely on some authorities for unique node IDs (and time synchronization).

If spoofing is an issue, one could require that each ID is cryptographically signed using the node's private key. However, appending a signature might make the ID a lot longer, I'm not sure.

rhymes • Sep 10 '19

I guess I was assuming the nodes were in a relatively trusted environment.
Also being able to rely on some authorities for unique node IDs (and time synchronization).

This requires synchronization though, which is another problem in itself. It means having to handle a sort of "god machine" that releases node IDs everytime a node comes online, which still is tricky for serverless environments, unless you are considering every single process of the app a separate node. Keeping in mind that the authority could be offline or unreachable or too slow and yet another machine to handle, monitor, keep updated and secure and so on...

Unless tracing back the originating node from the ID is paramount (which could have security issues onto itself in case an ID leaks outside the trusted area), I believe letting go of the whole idea of embedding a node identifier in the final ID is a way to sidestep all of these things

Dustin King • Sep 10 '19

I think I would need to know what this is for to go further. The simplest thing satisfying your original criteria would probably be a timestamp down to the nanosecond + random per-item hex string to avoid collisions.

If node identity needs to be kept secret, then the source of random numbers could be a potential point of deanonymization, so a CSPRNG might need to be used.

Timestamps have some potential issues, but depending on the application, they may or may not matter.

rhymes • Sep 11 '19

The simplest thing satisfying your original criteria would probably be a timestamp down to the nanosecond + random per-item hex string to avoid collisions.

This is what we ended up choosing, though it's down to the millisecond. We're using ULID as the format. It's a hash of a timestamp and a random string

then the source of random numbers could be a potential point of deanonymization, so a CSPRNG might need to be used.

yeah, exactly. It uses the secure generator for the random part

Thanks for the exchange!

Dustin King • Sep 11 '19

Likewise :)

Kasey Speakman • Sep 9 '19

I had the idea that I wanted to construct a scalable distributed event store on top of AWS and ran into similar questions (ordering in a distributed system). Dynamo for distributed event storage and S3 for initial replay storage (due to high cost of Dynamo replays).

I started thinking about how to place the events in S3 in such a way that they could be replayed in some semblance of order. Each event stream is totally ordered by itself by StreamID + Version, but there is no order indicated by across streams. Dynamo doesn't support any such order because streams are distributed across different servers.

I looked at using the event envelope to construct the S3 file name which would be in lexicographical order. That way listeners would ask S3 for a file listing by name and they would come back in order. That would include a node-specific timestamp and the node id in the file name. The timestamp would be the default "ordering", and the node id would be used as an arbitrary tie-breaker. And if we become aware of time skew on a specific node, we could also correct for it. Many listeners only care about specific types of events. So I included the event type in the name so that only needed events had to be fully fetched. Ultimately the design ended up with pretty long file names like {timestamp}/{nodeid}/{streamid}/{event version}/{event type}.

There are huge problems with this, however. I could nitpick a bunch of them, but I will jump to the overarching problem. The scale where I need to distribute the event store (and thus worry about cross-node ordering), full replays would be impractically time consuming and the volume of events would generally be unwieldy for listeners. That's even if I could pick the perfect ordered key up front. The scale of this really needs a change in tactics.

Ultimately I decided that cases where I need grouping and ordering of data had to be exercised in the small. And if these need to be aggregated to larger scale, the smaller systems would have to publish "external" events that rolled up events to a higher granularity. For example, a lot of events may go into placing an Order and the sequence in which they happen matters a lot. But external listeners will not be interested in handling all those details. They instead prefer a single OrderPlaced event with all details included. So in the Order system I'll have a listener go back and construct one for external publishing. These could be published to a stream processing platform such as Kafka for larger integration scenarios. And Kafka already has some metaphors for how to make that work from a subscriber's perspective.

Moral of the story (applicable to ordered GUIDs I think) is that ordering in a distributed scenario wasn't the best problem for me to solve. It is possible to power through the ordering problem (also dealing with clock skew or failed clocks that claim the event happened in 1970), but it requires extra work and ongoing upkeep. And I still don't end up with a system that has the right granularity for large scale.

rhymes • Sep 10 '19

Thanks for the detailed explanation and I can see how complicating the architecture to allow ordering is not worth it in your case, especially because you have what I believe is an "event sourced" architecture, with multiple levels of granularity of events.

It is true indeed that not all listeners, especially those that are merely external consumers, are interested in that detail, in addition to the ability to replay events for those external consumers.

Our case is very limited in scope, which ultimately will become a very simple pub/sub system. The topic of what sort of "universal" IDs to choose for those events transiting the inside of the app and to be sent outside arose and thus, I opened this discuss thread.

I don't foresee any real drawbacks in choosing a guid that's also sortable right from the start, what do you think? By having IDs that are inherently sortable the consumer can use that property or ignore it as they wish

Hope this makes sense

Kasey Speakman • Sep 10 '19 • Edited

The only real downside is that making them sortable will encourage you to use and depend on that feature. And when clock skew comes into play, then the code that depends on sortability will probably behave unexpectedly. When your code processes the data out of order you could observe strange things like issuing updates for data that hasn't been inserted yet. How: On a cloud provider, they may restart your code on a different node (for failure or maintenance or no reason) whose clock may be skewed from the previous node. They usually do have time sync but it is best effort -- no guarantees about clock accuracy between servers. The margin of error should be small enough that you don't have a problem normally. But if you ever do have the issue it will be hard to diagnose. Also note that if a hardware clock does fail, it is common for it to reset to zero. That would be a little easier to diagnose, but either way recovery (changing the IDs? mapping them to new IDs?) could be painful.

For an event store I use a big integer called Position (not an auto increment) to have total ordering for that node. It guarantees that whatever happens later has a larger Position than whatever happens before. But it doesn't provide a global perspective. You can get a global order by reading from nodes in Position order and then choosing the lowest timestamp among those events. When you do it that way, you know it is a "best effort" ordering, and you may be able to account for some known skewed timestamps. Even if timestamps are off, you are guaranteed that events from the same node are in order. I haven't actually had to do a global order across nodes yet, but there is one on the horizon to merge different event stores.

Kasey Speakman • Sep 16 '19

Forgot to mention. Random UUIDs still used to identify entities on top of the node-wide Position and event timestamps.

Ste Griffiths • Sep 7 '19

Instagram use a custom scheme (instagram-engineering.com/sharding...) as they need time-sortable unique IDs across multiple DB servers (shards). They implemented it in PL/SQL, composing an ID from the current time, the shard ID, and an auto-increment, giving 1024 IDs per shard per millisecond. Pretty cool, check out the link.

rhymes • Sep 7 '19

It's pretty cool indeed! Thank you! In our case I think it should be abstracted from the DB but that's just an implementation detail

Yaser Al-Najjar • Sep 7 '19 • Edited

I love simplistic solutions, here is my 0.2 cents:

Use a prefix for each node like n1 / n2

Generate an incremental ID: 1, 2, 3...

So you will end up with IDs like:

n1-1
n1-2
n2-1
n2-2

Easy to sort and never gonna conflict.

Does that help with your use case?

rhymes • Sep 7 '19

Not exactly, what if the node is a client and for any reason they set the same name? What if the node changes name or is replaced by another one?

Aside from the sortable part, the problem of conflict is already solved by using UUIDs, going back to incremental IDs is not feasible in this case.

Unfortunately this, as many things in distributed systems, is not that simple.

Yaser Al-Najjar • Sep 7 '19

I get it.

You wanna have a long-term solution, however, you still didn't tell us your exact use case so we are throwing bunch of out-of-context solutions.

BTW, this might be a very interesting solution (using both UUID and incremental ID): tomharrisonjr.com/uuid-or-guid-as-...

rhymes • Sep 7 '19 • Edited

You wanna have a long-term solution

I think this is one of those few cases when there's no cost into having a correct long term solution instead of having to change the format of IDs (and thus having multiple sets of them) in time.

you still didn't tell us your exact use case so we are throwing bunch of out-of-context solutions.

I didn't write the final goal explicitly but I wrote the requirements. The final goal is to traceable unique IDs for events in a distributed system that also are sortable which is a very handy property. There's no mistery to it :D

BTW, this might be a very interesting solution (using both UUID and incremental ID): tomharrisonjr.com/uuid-or-guid-as-...

I agree that exposing IDs is not great, not even UUIDs in theory. In that case is best to have multiple keys.

It's an interesting approach but it's tied to a database. I want something that would work at each level (the programming language, the DB table, an event log and so).

Thanks for sending me suggestions though, it's really great to see all of these approaches!!

Matteo Joliveau • Sep 7 '19

May I ask what the need for sorting is?
I personally don't see many reasons to sort by id, most often what people really want is sort by creation date.

Anyway, back in topic. This a tricky question because globally unique and predictable are two properties in direct clashing with each other. A solution is taking a look at Microsoft SQL Server Sequential ID, a sortable GUID.
There a couple of disadvantages:

They are not decentralized. If they were you would lose the anti-clashing capabilities of GUIDs.
They are predictable. Don't ever expose them to the public as they can lead to resource enumeration.

Basically, you may as well use BIGINT as your ids if you need sorting.

If you need to sort events on a distributed system, better use some other properties. Like timestamps.

rhymes • Sep 7 '19 • Edited

May I ask what the need for sorting is?

Sure, efficiency and convenience mostly. Let's say we use UUIDs, they work mostly well until these IDs land in a place far from your system. Someone decides to store events on S3 using the UUID as a file, suddenly you have gigabytes of events that can't sort well unless you peek inside the file to find the timestamp.

Or you grep an event log and suddenly you have to come up with a combination of bash commands in a pipe to extract both the ID and the timestamp to sort them.

Or you want to create shards out of them to group related data in different machines, UUIDs are useless for that.

In my experience UUIDs are great, until they aren't :D

The great thing about globally unique and sortable IDs is that they carry information with them. If well designed I can even deconstruct them to extract such info.

This a tricky question because globally unique and predictable are two properties in direct clashing with each other.

Not exactly true, see the examples at the end of my comment :)

A solution is taking a look at Microsoft SQL Server Sequential ID, a sortable GUID.

This definitely wouldn't work as you said, they are basically sequential as the name says.

If you need to sort events on a distributed system, better use some other properties. Like timestamps.

But timestamps are not globally unique and can be duplicate.

I'll leave you with two examples of partially sortable IDs that are also random and unique:

K-Sortable Globally Unique IDs, or ksuid, is a type of ID that has an encoded 32bit timestamp with 1 second resolution prefix and a 128bit random part as a suffix. The explanation of how Segment got there is great: A brief history of the UUID
Universally Unique Lexicographically Sortable Identifier, or ulid, this ID uses 48 bit with millisecond resolution for timestamps and 80 bits of randomness.

For example, Firebase uses something like this for their IDs: The 2 ^ 120 Ways to Ensure Unique Identifiers

William Gottschalk • Sep 7 '19 • Edited

Have you tried ULIDs before? They’re lexicographically sortable ids unique to the ms. honeybadger.io/blog/uuids-and-ulids/

rhymes • Sep 7 '19 • Edited

Before writing this I searched and read a bit of stuff on the internet, so I ultimately encountered the ULID spec and loved it: the bit when they say: "Won't run out of space 'til the year 10889 AD" made me laugh.

I didn't mention it because I wanted to see if there were other solutions and to start a fresh discussion without "bias".

I also didn't read this blog post before, so thank you!

My only worry about adopting ULIDs is mentioned in the article: