In most CRUD operations and REST APIs, primary keys are used to reference models that you want to access or modify. A majority of APIs will take an ID as a parameter in a route:
GET /api/v1/posts/:id
// Return the Post resource with an ID of 457
GET /api/v1/posts/457
While it's the simplest and most effective way of specifying the model to use, we often don't want to show these IDs to the user. By displaying primary keys, you give users the ability to estimate the number of rows in your tables. If authorization isn't effective or routes aren't protected, users could input random numbers to access information that they otherwise shouldn't have.
Using obfuscated IDs can be useful in social media or feed contexts, where the content isn't used in the URL, but you want something less significant than a primary key. As an example, instead of showing the user a URL like this:
https://romansorin.com/posts/457
We may want to show them something like this instead:
https://romansorin.com/posts/akz1JV
In this case, you may want to use "hashes" as a way to obfuscate your ID. We'll use the lightweight Hashids package to make this happen.
Installation
Getting started with Hashids is simple. With your preferred package manager, add Hashids to your project:
# If you use yarn as a package manager
yarn add hashids
# Or if you use npm
npm install hashids
Usage
I've provided a JavaScript example to begin working with Hashids, but Hashids has support for several languages!
Here's a brief use case, where you may want to hide the ID of a post:
import hashids from "hashids";
// Two arguments supplied: a salt and a minimum padding (length)
const postHash = new hashids("post", 8);
const post = { id: 4 };
post.id; // 4
const hashedPostId = postHash.encode(post.id);
hashedPostId; // 6akz1JVq
postHash.decode(hashedPostId); // [4]
Here, we're importing the hashids package and creating an instance of the module, calling it postHash.
I set up a dummy post object, but you can use any object or ID that you see fit. Due to package limitations, the argument supplied to encode must be a non-negative integer. Arbitrary strings and objects cannot be obfuscated using Hashids.
Afterward, I supplied the ID into the encode function of the postHash object and then put this output back through the decode function to show how you can use encoding/decoding. Note that the return type of decode is an array, not an integer.
If that's all you're looking for, then that's it! You can also encode and decode multiple IDs at once:
const hashes = postHash.encode([1, 2, 3]);
postHash.decode(hashes); // [1, 2, 3]
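Because decode returns an array (and an empty one when the hash is invalid or was created with a different salt), it can help to unwrap the result before using it in a query. Here is a minimal sketch for the single-ID case. It is not part of Hashids itself: the HashDecoder interface and the decodeSingleId helper are names I made up for illustration, and the try/catch assumes that some versions of the package may throw on malformed input.

```typescript
// Hypothetical minimal shape of the decoder; a Hashids instance is assumed to fit it.
interface HashDecoder {
  decode(hash: string): Array<number | bigint>;
}

// Returns the single ID encoded in `hash`, or null if the hash is
// invalid, empty, or encodes more than one ID.
function decodeSingleId(decoder: HashDecoder, hash: string): number | null {
  let decoded: Array<number | bigint>;
  try {
    decoded = decoder.decode(hash);
  } catch {
    // Assumption: the decoder may throw on input it cannot parse.
    return null;
  }

  // Anything other than exactly one value is treated as "not found".
  if (decoded.length !== 1) {
    return null;
  }

  const id = Number(decoded[0]);
  return Number.isSafeInteger(id) && id >= 0 ? id : null;
}
```

With the postHash instance from above, decodeSingleId(postHash, hashedPostId) should give back 4, and a null result can be turned into a 404 response in a route handler.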
Bonus: A utility class
If you want a common utility to work with, here is an abstraction on top of the Hashids package that will allow you to encode and decode IDs easily, without having to remember the package's methods.
This class is limited to encoding/decoding a single ID at a time but it helps me stay consistent within my projects. By using this utility, you could also set up a file/store of your hash objects, so you don't have to redefine it across your application:
// lib/Hash.ts
const Hashids = require("hashids/cjs");

export class Hash {
  private hashids;

  /**
   * Creates a Hash object.
   *
   * @param {string} salt The unique salt/alphabet to use for salting. Setting a salt allows output hashes to be more unique.
   * @param {number} padding The minimum output length of the hash (default is 6).
   */
  constructor(salt: string = "", padding: number = 6) {
    this.hashids = new Hashids(salt, padding);
  }

  /**
   * Encodes the provided ID argument and returns a string representing the hash.
   *
   * @param {number} id The numeric "id" to be encoded or turned into a hash.
   * @returns {string} Returns the encoded ID in the form of a hash, e.g. "o2fXhV"
   */
  public encodeId(id: number) {
    return this.hashids.encode(id);
  }

  /**
   * Decodes the provided hash argument and returns a number representing the ID.
   *
   * @param {string} hash The hash to be decoded back into a numeric ID.
   * @returns {number} Returns the numeric ID, e.g. 1 (undefined if the hash can't be decoded).
   */
  public decodeId(hash: string) {
    const decoded = this.hashids.decode(hash);
    // If the hash happens to encode several IDs, the whole array is returned.
    return decoded.length > 1 ? decoded : decoded[0];
  }

  /**
   * Sets the internal hashids object with the provided salt/padding arguments.
   *
   * @param {string} salt The unique salt/alphabet to use for salting. Setting a salt allows output hashes to be more unique.
   * @param {number} padding The minimum output length of the hash (default is 6).
   */
  public setHashids(salt: string = "", padding: number = 6) {
    this.hashids = new Hashids(salt, padding);
  }
}
Using this utility class is as simple as using the Hashids package directly. The implementation stays largely the same, but may be more readable and easier to remember:
const { Hash } = require("@lib/Hash");
const { Post } = require("@app/models/Post");
// Create a new Hash object with the salt "post"
const postHash = new Hash("post", 8);
// We may want to generate different sequences based on model, to get different values for the same ID
const userHash = new Hash("user", 8);
const post = new Post();
post.id; // 4
const hashedPostId = postHash.encodeId(post.id);
hashedPostId; // 6akz1JVq
postHash.decodeId(hashedPostId); // 4
// Want to change the salt of the Hash object without creating a new object?
// Call "setHashids" on the utility object.
postHash.setHashids("comment", 8);
postHash.decodeId(hashedPostId); // Now, it returns undefined instead of 4
// With a different salt, we can use the old Post ID and get a different value:
const hashedUserId = userHash.encodeId(post.id);
hashedUserId; // dD0WnjRy
This example is a little bit more extensive, so let me walk you through it:
- We created two hash objects to represent a Post and User model.
- Like the previous example, I created a dummy Post object with an ID of 4.
- I passed the ID into the utility's encodeId function and then decoded it with decodeId, which gave the same results as the previous example.
- The utility allows you to set a new salt and padding within the same object instance, so I changed the salt to "comment". Now, when you try to decode the previous hash, you don't get the same ID.
- Since the userHash object had a different salt, encoding the previous ID returns a completely different hash.
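The "file/store of your hash objects" idea mentioned earlier could look something like the sketch below: one shared registry that creates a hash object per model the first time it is requested and reuses it afterward. HashRegistry and forModel are hypothetical names of my own, and the factory function is where you would construct the Hash utility (or a Hashids instance) with whatever salt and padding you prefer.

```typescript
// A small cache of per-model hash objects, created lazily by a factory.
class HashRegistry<T> {
  private readonly instances = new Map<string, T>();
  private readonly create: (model: string) => T;

  // `create` builds the hash object for a model name, e.g. using the name as salt.
  constructor(create: (model: string) => T) {
    this.create = create;
  }

  // Returns the hash object for `model`, creating it on first use.
  forModel(model: string): T {
    let instance = this.instances.get(model);
    if (instance === undefined) {
      instance = this.create(model);
      this.instances.set(model, instance);
    }
    return instance;
  }
}
```

Exported from a single file, something like new HashRegistry((model) => new Hash(model, 8)) would let controllers call hashes.forModel("post").encodeId(post.id) without redefining salts in several places.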
Unfortunately, a limitation of this utility is that you can't encode or decode multiple IDs at once, but this can be easily added in by extending the class functions. When developing a medium-scale app with the Hashids library, I found this utility to be super useful in keeping my code consistent across controllers.
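As a rough idea of what that extension might look like, here is a sketch that adds encodeIds and decodeIds on top of any object exposing encode and decode. The MultiIdCodec interface and the MultiIdHash class are names I invented for this example, and I'm assuming a Hashids instance fits that shape; adapt it to however your own Hash class stores its internal object.

```typescript
// Hypothetical shape of the underlying encoder/decoder (assumed to match Hashids).
interface MultiIdCodec {
  encode(ids: number[]): string;
  decode(hash: string): Array<number | bigint>;
}

class MultiIdHash {
  private readonly codec: MultiIdCodec;

  constructor(codec: MultiIdCodec) {
    this.codec = codec;
  }

  // Encodes several non-negative integer IDs into a single hash.
  encodeIds(ids: number[]): string {
    const valid =
      ids.length > 0 && ids.every((id) => Number.isSafeInteger(id) && id >= 0);
    if (!valid) {
      throw new RangeError("ids must be a non-empty list of non-negative integers");
    }
    return this.codec.encode(ids);
  }

  // Decodes a hash back into the list of IDs, always as an array of numbers.
  decodeIds(hash: string): number[] {
    return this.codec.decode(hash).map((value) => Number(value));
  }
}
```

Keeping the array-returning methods separate from encodeId and decodeId means callers never have to check whether they got a single number or a list back.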
Limitations
It's worth noting that Hashids should not be used as a legitimate hashing solution (such as for passwords or other sensitive information). The Hashids package doesn't have support for strings anyway, but you shouldn't even consider this. Instead, use an algorithm like bcrypt to hash passwords.
Additionally, as the package creator describes, these aren't true "hashes". Cryptographic hashes are one-way and cannot be reversed, whereas Hashids output can be decoded back to the original ID. The output merely looks similar, which is why this obfuscation gets called a "hash".
More robust strategies
When I was looking into packages and solutions for masking IDs in an application of my own, my first thought was to look into what companies like Instagram and Twitter were doing. I noticed that despite the volume of data that is processed on these platforms, they didn't resort to using primary keys for their URLs. If you're interested in how they handled this ID generation (hint: it wasn't Hashids!), I would highly suggest reading the articles I linked above.
In the Medium post documenting Instagram's solution, the URL contains yet another example of hashes being used in the URL: first the slug of the article, and then a sequence of random characters after to maintain uniqueness.
https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c
In a content-heavy context where the title of a post may be significant (blogs or forums), this approach keeps the URL significant but also minimizes the chance of collisions by keeping records unique.
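To make the slug-plus-suffix idea concrete, here is a small sketch that builds such a path and pulls the unique part back out. The function names are my own, the /posts/ prefix just mirrors the example URLs earlier in this post, and it assumes the token contains no hyphens (true of the obfuscated IDs shown in this article).

```typescript
// Turns a title into a lowercase, hyphen-separated slug.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading/trailing hyphens
}

// Builds a path such as /posts/my-title-akz1JV from a title and a unique token.
function buildPostPath(title: string, token: string): string {
  const slug = slugify(title);
  return slug ? `/posts/${slug}-${token}` : `/posts/${token}`;
}

// Extracts the unique token (everything after the last hyphen) from a path.
function extractToken(path: string): string {
  const lastSegment = path.split("/").pop() ?? "";
  return lastSegment.slice(lastSegment.lastIndexOf("-") + 1);
}
```

Only the token is needed to look up the record, so the slug can change when a title is edited without breaking old links.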
Hashids is an effective approach for a small to medium-scale application that doesn't require more complicated strategies like combining different metadata (creation date, worker/sequence count, shard IDs). While it doesn't fit data or scale-intensive cases like Twitter, regular applications that process a reasonable amount of writes will do just fine with this approach. Collisions between models can be avoided by choosing a unique salt for each model that you obfuscate, along with an appropriate minimum length (at least 8 characters).
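For comparison, here is a rough sketch of what "combining different metadata" can mean: packing a timestamp, a shard ID, and a sequence counter into one integer. The layout (time in the high bits, then 13 bits of shard ID, then 10 bits of sequence) follows my reading of the Instagram post; the epoch value and the function names are placeholders of my own, not anything those platforms publish as code.

```typescript
// Hypothetical custom epoch in milliseconds (2011-01-01T00:00:00Z).
const CUSTOM_EPOCH_MS = 1293840000000n;
const SHARD_BITS = 13n;
const SEQUENCE_BITS = 10n;
const SHARD_MASK = (1n << SHARD_BITS) - 1n;
const SEQUENCE_MASK = (1n << SEQUENCE_BITS) - 1n;

// Packs time, shard, and sequence into a single sortable integer ID.
function composeId(timestampMs: bigint, shardId: bigint, sequence: bigint): bigint {
  const elapsed = timestampMs - CUSTOM_EPOCH_MS;
  if (elapsed < 0n) {
    throw new RangeError("timestamp is before the custom epoch");
  }
  const shard = shardId & SHARD_MASK;
  const seq = sequence % (1n << SEQUENCE_BITS); // sequence wraps around at 1024
  return (elapsed << (SHARD_BITS + SEQUENCE_BITS)) | (shard << SEQUENCE_BITS) | seq;
}

// Splits an ID produced by composeId back into its three parts.
function decomposeId(id: bigint): { timestampMs: bigint; shardId: bigint; sequence: bigint } {
  return {
    timestampMs: (id >> (SHARD_BITS + SEQUENCE_BITS)) + CUSTOM_EPOCH_MS,
    shardId: (id >> SEQUENCE_BITS) & SHARD_MASK,
    sequence: id & SEQUENCE_MASK,
  };
}
```

IDs built this way sort by creation time and can be generated on many shards without coordination, which is the kind of requirement that Hashids alone doesn't address.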