Clean URL slugs: The Good, the Bad, and the Ugly

By Michael Z. Originally published at michaelzanggl.com.

A slug is the last part of the URL and identifies a specific page on a site.

For example, on my blog, the slug of https://michaelzanggl.com/articles/tailwind-css-for-skeptics/ is tailwind-css-for-skeptics. This is a clean URL as opposed to something like https://michaelzanggl.com/articles/232.

The Good

As you can imagine, this has some great benefits: it makes the URL human-readable and is great for SEO.

I remember, years ago, back in the old YouTube, people would put random/funny words in the slug to find YouTube profiles under that name to chat or leave comments on their page. When the internet was still about discovery ❤️

The Bad

The good thing about IDs is that they don't change. This is not true for text. Take, for example, the URL of a GitHub repository or of an article on dev.to.

Both contain two slugs: the username and the name of the repository or article.

What happens if you change your username?

Well, GitHub has an entire article about the side effects: https://docs.github.com/en/github/setting-up-and-managing-your-github-user-account/changing-your-github-username

It says that the old username will be made available again for others to use. It goes on to say:

"
Web links to your existing repositories will continue to work.
...
If the new owner of your old username creates a repository with the same name as your repository, that will override the redirect entry and your redirect will stop working.
"

This, of course, has the potential to be exploited. If you forget to update even one link to a repo of yours in a blog post, package.json, or elsewhere, somebody can claim your old username, create a repository with the same name, and cause some pretty hefty damage under your name.

dev.to also allows changing the username but leaves you in the dark about what happens to the old handle and all the links associated with it.

An alternative is the way Reddit uses slugs: the link contains both an ID and the title. See the link for the following article: https://www.reddit.com/r/javascript/comments/i1zwek/nextjs_serverside_route_authentication_using/

You can actually change the title in the link and it will still work, because the link still contains the ID (i1zwek). Of course, one could abuse this to create clickbait links: https://www.reddit.com/r/javascript/comments/i1zwek/earn_10000_dollars_with_this_simple_trick/ but what's the point really :D.
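To make the idea concrete, here is a minimal Python sketch of ID-plus-slug routing. It is not Reddit's actual implementation: the `POSTS` lookup table, the `resolve` function, and the simplified `/comments/<id>/<slug>/` path are all made up for illustration, and redirecting a wrong slug to the canonical URL is an extra step I am assuming here, not something the Reddit link above is confirmed to do.

```python
import re
from typing import Optional, Tuple

# Hypothetical in-memory "database": post ID -> canonical slug.
POSTS = {"i1zwek": "nextjs_serverside_route_authentication_using"}

# Matches /comments/<id>, /comments/<id>/ and /comments/<id>/<slug>/
ROUTE = re.compile(r"^/comments/(?P<id>[a-z0-9]+)(?:/(?P<slug>[^/]*))?/?$")


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, canonical path) for a request path.

    Only the ID is used for the lookup; the slug is purely cosmetic.
    """
    match = ROUTE.match(path)
    if match is None or match.group("id") not in POSTS:
        return 404, None

    post_id = match.group("id")
    canonical = f"/comments/{post_id}/{POSTS[post_id]}/"
    if path != canonical:
        # Wrong, outdated, or missing slug: send the visitor to the real URL.
        return 301, canonical
    return 200, canonical


print(resolve("/comments/i1zwek/earn_10000_dollars_with_this_simple_trick/"))
```

Because the lookup never touches the slug, renaming an article only changes the canonical URL. Old links keep working, and with the redirect a clickbait slug is replaced by the real title as soon as the page loads.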

The Ugly

Slugs might work great for English and other languages that use the Roman script, but look what happens when I copy the link of a product page on the Japanese Amazon store:

https://www.amazon.co.jp/Nintendo-Switch-%E3%83%8B%E3%83%B3%E3%83%86%E3%83%B3%E3%83%89%E3%83%BC%E3%82%B9%E3%82%A4%E3%83%83%E3%83%81-%E3%83%8D%E3%82%AA%E3%83%B3%E3%83%AC%E3%83%83%E3%83%89%E3%80%91-%E3%83%8B%E3%83%B3%E3%83%86%E3%83%B3%E3%83%89%E3%83%BCe%E3%82%B7%E3%83%A7%E3%83%83%E3%83%97%E3%81%A7%E3%81%A4%E3%81%8B%E3%81%88%E3%82%8B%E3%83%8B%E3%83%B3%E3%83%86%E3%83%B3%E3%83%89%E3%83%BC%E3%83%97%E3%83%AA%E3%83%9A%E3%82%A4%E3%83%89%E7%95%AA%E5%8F%B73000%E5%86%86%E5%88%86/dp/B07SVXHD1P/ref=sr_1_13?dchild=1&keywords=switch&qid=1596343204&sr=8-13

Now imagine sending such a link in an app like WhatsApp, where you only have a little space. It's quite the opposite of human-readable. How ironic.

Wait, but why is this happening in the first place?

An explanation as to why the browser returns you this encoded version of the link can be found here:

The URI you get by copying from the address bar is the only valid URI the browser can give you.

From the RFC 3986 (and other URL RFCs):

A URI is a sequence of characters from a very limited set: the letters of the basic Latin alphabet, digits, and a…
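To see this encoding in action, here is a small Python sketch that percent-encodes one word from the Amazon link above using the standard library's `urllib.parse.quote`. The variable names are only for illustration.

```python
from urllib.parse import quote, unquote

# One word from the product title in the Amazon link above.
title = "ニンテンドースイッチ"

# quote() encodes the text as UTF-8 and writes every byte that is not an
# unreserved ASCII character as %XX.
encoded = quote(title)

print(encoded)                    # starts with %E3%83%8B, as in the Amazon link
print(len(title), len(encoded))   # 10 90

# The encoding is reversible, so nothing is lost, it is just hard to read.
assert unquote(encoded) == title
```

Each of these characters takes three bytes in UTF-8, and each byte becomes three characters (`%XX`), so a 10-character word turns into 90 characters in the copied link.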

Let's confirm this by inspecting the GET request the browser sends to the Amazon site:

(Screenshot: proof that the browser is using the URI)

I'm not an expert on this topic, but the reason URIs, rather than IRIs, are still used to this day might be exploits like the IDN homograph attack.

An example is this image from the Wikipedia article:

(Image: example of a homograph attack)

In this image, the letters e and a were replaced with their Cyrillic equivalents.
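The same substitution can be reproduced in a few lines of Python. This is only a sketch: the helper names `non_ascii_chars` and `to_ascii` are made up, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat the ASCII form it produces as an illustration rather than what every browser or registry would show.

```python
import unicodedata
from typing import List, Tuple

latin = "wikipedia.org"
# Looks the same, but "e" and "a" are the Cyrillic letters U+0435 and U+0430.
spoofed = "wikip\u0435di\u0430.org"


def non_ascii_chars(host: str) -> List[Tuple[str, str]]:
    """List every non-ASCII character in a host name with its Unicode name."""
    return [(ch, unicodedata.name(ch)) for ch in host if ord(ch) > 127]


def to_ascii(host: str) -> str:
    """Convert a host name to its ASCII (Punycode) form."""
    return host.encode("idna").decode("ascii")


print(latin == spoofed)          # False, although both look alike on screen
print(non_ascii_chars(spoofed))  # the two Cyrillic letters and their names
print(to_ascii(latin))           # wikipedia.org
print(to_ascii(spoofed))         # an "xn--" name that no longer looks like wikipedia.org
```

Showing the ASCII form is one way to make the difference visible: the look-alike name turns into an obviously different `xn--` string.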


While slugs do enhance UX, it's often only a minor enhancement. We already have Open Graph to display the title, description, image, etc. when sharing links. Sites like Twitter use URL shorteners anyway. Links on Reddit, in blog posts, in news articles, etc. usually don't show the raw URL but provide alternative text instead.

Conclusion

Now, slugs are still great. But it's not as simple as adding one to your website and being done (unless you have total control over the content...).

If you plan to implement slugs into your website, make sure you

  • Educate your users about what happens when they change content that is part of a slug
  • Only include URL-friendly characters
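For the second point, a slug generator can be sketched in Python like this. The `slugify` function and its `fallback_id` parameter are hypothetical, and the rules (lowercase ASCII letters, digits, and hyphens only) are one possible choice, not a standard.

```python
import re
import unicodedata


def slugify(title: str, fallback_id: str) -> str:
    """Build a slug that only contains a-z, 0-9 and hyphens.

    If nothing URL-friendly is left (for example, for a Japanese title),
    fall back to the ID instead of producing an empty slug.
    """
    # Split accented letters into base letter + accent ("é" -> "e" + accent).
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop everything that is not plain ASCII, including the split-off accents.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")
    return slug or fallback_id


print(slugify("Tailwind CSS for Skeptics!", "232"))  # tailwind-css-for-skeptics
print(slugify("Crème brûlée", "233"))                # creme-brulee
print(slugify("ニンテンドースイッチ", "B07SVXHD1P"))    # B07SVXHD1P
```

The fallback matters: simply stripping non-ASCII characters would turn the Japanese title into an empty string, so the ID takes over in that case.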

I guess the gist of this post is that just because something looks simple in tech, it often comes with a lot of added complexity, and it's never as simple as "just doing this one little thing". The YouTube video page works just fine with only an ID in the URL.
