DEV Community

sana

System Design - URL Shorteners

URL shorteners are ubiquitous on the internet, and most of us have encountered them, be it in the form of a shortened YouTube link, a Bitly URL, or a QR code leading us to a website. But have you ever stopped to wonder how these work? Or considered building one yourself?

What is a URL Shortener?

A URL shortener is a web service that converts a regular URL into a significantly shorter one that redirects to the original URL. These shorter versions are particularly handy in contexts where character count matters, such as Twitter, or when you want a memorable or neat URL.

Example:
https://www.example.com/very/long/url/structure might become: https://short.ly/abcd1234

Why Use a URL Shortener?

Convenience: They make sharing links easier, especially on platforms with character restrictions.
Readability: They simplify long and complicated URLs.
Analytics: Many URL shortening services provide data on link clicks, the geographic distribution of the audience, referral sources, and more.
Customization: Some services allow custom aliases, turning generic links into branded ones.
The Mechanics of URL Shortening

At a high level, the process is straightforward:

The original URL is provided to the shortening service.
The service returns a shorter URL.
When the shorter URL is accessed, the service redirects the user to the original URL.
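The three steps above can be sketched with an in-memory map. This is only a minimal sketch under stated assumptions: the class name `InMemoryShortener`, its method names, and the base-36 counter used to produce codes are all made up for illustration, and `short.ly` is simply the example domain used in this article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory shortener illustrating the three steps above.
class InMemoryShortener {
    // Example domain from this article, not a real service endpoint.
    private static final String BASE_URL = "https://short.ly/";

    private final Map<String, String> codeToUrl = new HashMap<>();
    private long nextId = 1;

    // Steps 1 and 2: accept the original URL and return a shorter one.
    String shorten(String longUrl) {
        // Placeholder code generation: a counter written in base 36.
        String code = Long.toString(nextId++, 36);
        codeToUrl.put(code, longUrl);
        return BASE_URL + code;
    }

    // Step 3: look up the original URL so the caller can redirect to it.
    Optional<String> resolve(String code) {
        return Optional.ofNullable(codeToUrl.get(code));
    }
}
```

On a fresh instance, the first call to `shorten(...)` returns `https://short.ly/1`, and `resolve("1")` gives back the original URL. A real service would answer with an HTTP redirect instead of returning the string.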
Now, let's dive into some simple algorithms that can be used for building URL shorteners. These are simple examples, and real-world URL shorteners can be much more complex depending on various usage parameters.

Hash-Based Shortening

This is perhaps the most straightforward method. A hash function like MD5 is used to generate a fixed-length hash of the original URL. This hash is then truncated to create a short URL.

For instance, a simple Java implementation can be:

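import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Assumes CHARACTERS is the alphabet String and SHORT_URL_LENGTH <= 16 (MD5 yields 16 bytes)
MessageDigest md = MessageDigest.getInstance("MD5"); // may throw NoSuchAlgorithmException
byte[] digest = md.digest(longURL.getBytes(StandardCharsets.UTF_8));
StringBuilder sb = new StringBuilder();

for (int i = 0; i < SHORT_URL_LENGTH; i++) {
    // Treat the byte as unsigned (0..255) before taking the modulus
    int index = (digest[i] & 0xFF) % CHARACTERS.length();
    sb.append(CHARACTERS.charAt(index));
}

String shortURL = sb.toString();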
Pros:

Deterministic: The same URL will always produce the same short URL.
Cons:

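Potential for collisions: Different URLs can map to the same short code, and truncating the hash to a few characters makes this far more likely than a collision in the full MD5 digest.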
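To get a feel for how likely collisions are, the birthday approximation p ≈ 1 - exp(-n(n-1) / (2N)) gives a rough estimate, where n is the number of stored URLs and N is the number of possible codes. The sketch below is illustrative only: the class name `CollisionEstimate` and the figures (a 62-character alphabet and 7-character codes) are assumptions, and the formula assumes codes are spread uniformly, which a truncated hash only approximates.

```java
// Rough collision estimate for truncated hashes (birthday approximation).
class CollisionEstimate {
    // Approximate probability that at least two of n URLs share a code,
    // assuming codes are uniform over alphabetSize^length possible values.
    static double probability(long n, int alphabetSize, int length) {
        double keyspace = Math.pow(alphabetSize, length);
        double pairs = (double) n * (n - 1);
        return 1.0 - Math.exp(-pairs / (2.0 * keyspace));
    }

    public static void main(String[] args) {
        // Assumed figures: 62-character alphabet, 7-character codes.
        for (long n : new long[] {10_000L, 1_000_000L, 10_000_000L}) {
            System.out.printf("%,d URLs -> p(collision) ~ %.4f%n",
                    n, probability(n, 62, 7));
        }
    }
}
```

With these assumed figures, one million stored URLs already give roughly a 13% chance of at least one collision, which is why a collision-handling strategy matters for hash-based schemes.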
Counter-Based Encoding (Bijective)

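This method usually employs a database. When a URL is submitted, it's stored in the database, and the unique ID from the database is converted into a short, URL-friendly string (for example, with Base62 encoding). Because the conversion is bijective, the string can be decoded back to the ID to look up the original URL.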

Pros:

Collision-free: Each URL gets a unique database ID.
Cons:

Requires database maintenance, and sequential IDs make the short URLs predictable, so anyone can enumerate them.
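A common way to turn a numeric ID into a URL-friendly string is Base62 (digits plus upper- and lower-case letters). The following is a minimal sketch, assuming the ID comes from something like an auto-increment column; the class name `Base62` and the alphabet ordering are choices made for this example, not a standard.

```java
// Hypothetical Base62 codec for counter-based short codes.
class Base62 {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final int BASE = ALPHABET.length(); // 62

    // Converts a non-negative ID into its Base62 representation.
    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        if (id == 0) {
            return String.valueOf(ALPHABET.charAt(0));
        }
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % BASE)));
            id /= BASE;
        }
        // Digits were produced least-significant first, so reverse them.
        return sb.reverse().toString();
    }

    // Converts a Base62 string back into the numeric ID.
    static long decode(String code) {
        long id = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + c);
            }
            id = id * BASE + digit;
        }
        return id;
    }
}
```

For example, `Base62.encode(125)` yields `21` (2 × 62 + 1), and `Base62.decode("21")` returns 125. Since every ID maps to exactly one string and back, no collision check is needed.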
Character and timestamp Encoding

A hybrid approach combines random characters with the current timestamp. Each request gets a different short URL, even for the same long URL, and a collision requires the same random characters to be drawn within the same millisecond, which is unlikely but not impossible.

For instance, a simple Java implementation can look like:

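// Assumes CHARACTERS is the alphabet String and SHORT_URL_LENGTH is the number of random characters
long currentTimestamp = System.currentTimeMillis();
StringBuilder sb = new StringBuilder();
Random random = new Random(); // java.util.Random

while (sb.length() < SHORT_URL_LENGTH) {
    int index = random.nextInt(CHARACTERS.length());
    sb.append(CHARACTERS.charAt(index)); // a String is indexed with charAt, not []
}

// Appending the raw millisecond timestamp adds 13 digits to the short URL
String shortURL = sb.toString() + currentTimestamp;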
Pros:

A high degree of uniqueness.
Cons:

Longer URLs compared to purely hash-based methods, although the length can be reduced at the cost of weaker uniqueness.
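One way to keep the length in check without dropping the timestamp is to write it in a higher radix. The sketch below is a variation on the snippet above, not a replacement for it: the class name `TimestampCode` and its parameters are hypothetical, it uses a lower-case base-36 alphabet so the whole code shares one character set, and it swaps `Random` for `SecureRandom` so the random part is harder to guess.

```java
import java.security.SecureRandom;

// Hypothetical generator: random prefix + base-36 timestamp.
class TimestampCode {
    private static final String CHARACTERS = "0123456789abcdefghijklmnopqrstuvwxyz";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds randomLength random characters followed by the timestamp in base 36.
    static String generate(long timestampMillis, int randomLength) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < randomLength; i++) {
            sb.append(CHARACTERS.charAt(RANDOM.nextInt(CHARACTERS.length())));
        }
        // Radix 36 is the largest radix Long.toString supports.
        sb.append(Long.toString(timestampMillis, Character.MAX_RADIX));
        return sb.toString();
    }
}
```

A present-day millisecond timestamp takes 13 decimal digits but only 8 characters in base 36, so `TimestampCode.generate(System.currentTimeMillis(), 4)` produces a 12-character code instead of 17.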
Real-world Examples

You have probably used at least one URL shortener already. Some of the common ones available on the internet today are:

Bitly: Perhaps the most recognized URL shortener, Bitly isn't just about making URLs concise. It's also a powerful marketing tool. Beyond just link shortening, Bitly provides detailed analytics about the audience clicking on their links. Businesses can view the geographical distribution of their audience, understand referrers, and even integrate with other marketing tools.
TinyURL: A venerable veteran in the URL shortening arena, TinyURL made link compression mainstream. It's straightforward to use, and its browser extension means users can instantly shorten a link without needing to visit the TinyURL site. Moreover, TinyURL has ventured into QR codes, allowing users to convert URLs into scannable QR codes.
Google (Goo.gl): Google's foray into URL shortening was widely embraced due to its straightforward nature and integration with other Google products. Though it's now deprecated, Goo.gl was also renowned for its analytics capabilities, providing users with data on click rates, referrer sites, and more.
T2M: T2M goes beyond just shortening URLs. It offers a QR Code service. This feature lets users convert short URLs into QR codes which can be printed on physical products or advertisements, connecting offline consumers to online content.
Ow.ly: Operated by Hootsuite, Ow.ly is particularly popular among social media marketers. It doesn't just truncate URLs; it provides analytics on social media performance, allowing users to track how their links perform across different platforms.
Rebrandly: As the name suggests, Rebrandly lets users create short URLs that can be branded. This means businesses can use their domain names to create trust with their audience. For instance, a pizza chain named "PizzaHot" could use "PizzaHot.link/offer" as a short URL, making the link recognizable and memorable.
YOURLS: "Your Own URL Shortener" or YOURLS is an open-source and self-hosted URL shortener, allowing tech-savvy users and businesses to run their URL shortening service. This provides them with complete control over the shortening process, data, and analytics.
These real-world examples underscore the versatility and utility of URL shorteners. From simple link reduction for ease of sharing to powerful marketing tools integrated with analytics, URL shorteners have evolved to serve diverse needs in the digital landscape.

Implementing a URL Shortener

Creating your own URL shortener can be an interesting project, offering insights into both web development and the mechanics of URL shortening. Here’s a step-by-step guide:

Select an Algorithm: Your choice of algorithm will influence the shortener's efficiency, the likelihood of collisions, and the storage needs. You might opt for a hash-based, counter-based, or character & timestamp encoding approach.
Set Up a Web Service: Choose a framework that lets you receive and respond to web requests. For Python enthusiasts, Flask is a minimalistic choice, while Node.js users might prefer Express. Java developers can look towards Spring Boot, and Rubyists have Sinatra or Rails.
Storage Mechanism:
In-memory storage like dictionaries or HashMaps offers fast access times but lacks persistence. Once the server restarts, the data is lost.
Relational databases such as PostgreSQL, MySQL, or SQLite provide persistence and can store additional data, like click counts or creation dates.
NoSQL databases like MongoDB are schema-less and can handle vast amounts of data, offering horizontal scalability.
Handle Collisions: If using hash-based methods, devise a strategy to handle hash collisions. You could:
Check the storage for existing hashes and re-hash until a unique value is found.
Append or prepend a random character or timestamp to the original URL and then hash again.
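Implement Redirection: When a user accesses the shortened URL, your server should redirect them to the original URL. This typically involves a 301 Moved Permanently HTTP response. Because browsers may cache a 301 and skip your server on later visits, a 302 Found or 307 Temporary Redirect is the usual choice when you want to count every click.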
Analytics (Optional):
Basic: Count the number of clicks for each short URL.
Advanced: Track user agents to identify device types, capture referrers to understand the source of the traffic, and log IP addresses to determine geographical distribution.
Customization & Branding (Optional):
Allow users to customize the path of their shortened URL. Instead of short.ly/abcd1234, they might prefer short.ly/SummerSale.
If your audience includes businesses, consider letting them use custom domains for branding. For instance, Coca-Cola could use coke.link/SummerSale.
Implement Safety Measures:
Rate limiting: To prevent abuse, limit how frequently a user or IP can create short URLs.
Blacklisting: Maintain a list of URLs or domains that are not allowed to be shortened due to malicious content.
Preview Feature: Some shorteners let users preview the destination before redirection, ensuring they aren’t led to potentially harmful sites.
Optimize for Scalability and Performance:
Implement caching mechanisms like Redis to store recently accessed URLs.
If anticipating heavy traffic, consider deploying your application across multiple servers or using a cloud service with auto-scaling.
Extend with APIs: Allow developers to programmatically create or retrieve short URLs. Offering an API can expand the user base and provide integration options with other services.
Frontend Development (Optional):
A user-friendly interface will encourage more users. Implement a clean design with responsive elements for mobile users.
Consider adding features like copying the short URL to the clipboard or displaying QR codes.
Building a URL shortener isn't just about truncating links but also considering user experience, security, scalability, and performance. Whether for personal use, a specific project, or launching a new service, a well-implemented URL shortener can be a valuable digital asset.
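As an illustration of the "Handle Collisions" step, here is a minimal sketch that re-hashes with a suffix appended to the URL until it finds a free code. Everything in it is an assumption made for the example: the class name `CollisionAwareShortener`, the `#attempt` suffix, the cap of 10 attempts, and the in-memory map standing in for real storage.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical hash-based shortener that retries on collisions.
class CollisionAwareShortener {
    private static final String CHARACTERS =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    private static final int MAX_ATTEMPTS = 10; // arbitrary cap for this sketch

    private final int codeLength;
    private final Map<String, String> codeToUrl = new HashMap<>();

    CollisionAwareShortener(int codeLength) {
        // MD5 produces 16 bytes, and each code character consumes one byte.
        if (codeLength < 1 || codeLength > 16) {
            throw new IllegalArgumentException("codeLength must be between 1 and 16");
        }
        this.codeLength = codeLength;
    }

    String shorten(String longUrl) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            // First try the plain URL, then append a suffix and hash again.
            String input = attempt == 0 ? longUrl : longUrl + "#" + attempt;
            String code = hashToCode(input);
            String existing = codeToUrl.putIfAbsent(code, longUrl);
            // The code is usable if it was free or already maps to this URL.
            if (existing == null || existing.equals(longUrl)) {
                return code;
            }
        }
        throw new IllegalStateException("no free code after " + MAX_ATTEMPTS + " attempts");
    }

    // Returns the original URL, or null if the code is unknown.
    String resolve(String code) {
        return codeToUrl.get(code);
    }

    private String hashToCode(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < codeLength; i++) {
                sb.append(CHARACTERS.charAt((digest[i] & 0xFF) % CHARACTERS.length()));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

Because the suffixes are tried in a fixed order, shortening the same URL twice still returns the same code, so the scheme stays deterministic. With a database, the check-and-insert would need to be atomic, for example through a unique constraint on the code column.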

Show Me Some Code

I have created a simple implementation inspired by the above algorithms, which serves as a starting point for those who want to grasp the mechanics without delving deep into the complexities. This code only generates a short URL; it does not involve other components such as a server or a database.

Features & Highlights

Open-Source Code: The codebase is freely accessible, allowing enthusiasts and developers to study, modify, and even contribute.
Simplified Approach: For learners or those initiating their journey into distributed systems, this is an excellent place to start. The implementation offers clarity without the clutter of large-scale system intricacies.
Hands-On Learning: Instead of just reading about URL shorteners, you can actively engage, experiment, and even test this system to get a real feel for how URL shortening works in distributed scenarios.
Access & Contribution

The project is hosted on GitHub and can be accessed here.

Developers and enthusiasts are encouraged to explore the repository, star it for reference, and even fork it for their experiments.

Conclusion

URL shorteners might seem simple on the surface, but as we’ve seen, a lot is going on behind the scenes. Whether you're looking to create your own shortener for a project or simply to understand the magic behind your favourite shortening service, we hope this deep dive has been enlightening.

Thank you for staying with me so far. Hope you liked the article. You can connect with me on LinkedIn where I regularly discuss technology and life. Also, take a look at some of my other articles and my YouTube channel. Happy reading. 🙂
