DEV Community

loading...
Cover image for Yet another URL shortener

Yet another URL shortener

antemarin_53 profile image amarin ・6 min read

Introduction

Url shortener is a service that is used to create short links from very long urls. Usually, short links have the size of one third or even one-fourth of original url which makes them easier to type, present or tweet. Clicking on a short link user will be automatically redirected to the original url.

There are many url shortening services available online like tiny.cc, bitly.com, cutt.ly, etc.

Implementing url shortening service is not a complex task and it is often part of system design interviews. In this post, I will try to explain the process of implementing the service.

Theory

Before implementation, it is always a good idea to write down what it is needed to be done in the form of functional and non-functional requirements.

Functional requirements:

  • User needs to be able to enter long url. Our service should save that url and generate a short link.

  • User should have the option to enter the expiration date. After that date passed, short link should be invalid.

  • Clicking on the short link our service should redirect the user to original long url

  • Users should create an account to use service. Service can have usage limit per user*

  • User is allowed to create his own short link*

  • Service should have metrics, for example, most visited links*

Non-functional requirements:

  • Service should be up and running 100% of time

  • Redirecting should not last longer than 2 seconds

*Requirements are optional

Url conversion:

Let's say that we want to have a short link with a maximum length of 7. The most important thing in url shortener is the conversion algorithm. Url conversion can be implemented in several different ways and each way has its pros and cons.

One way of generating short links would be hashing original url with some hash function (for example MD5 or SHA-2). When using hash function it is sure that different inputs will result in different outputs. The result of the hash is longer than 7 characters so we would need to take the first 7 characters. But in this case, there could be a collision because the first 7 characters could be already used as short link. Then we are taking the next 7 characters until we find a short link that is not used.

The second way of generating a short link is by using UUIDs. The probability that a UUID will be duplicated is not zero, it is close enough to zero to be negligible. Since UUID has 36 characters that means that we have the same problem as above, we should take the first 7 characters and check is that combination already used.

Third way would be converting numbers from base 10 to base 62. A base is a number of digits or characters that can be used to represent a particular number. Base 10 are digits [0-9] which we use in everyday life and base 62 are [0-9][a-z][A-Z]. This means that for example number in base 10 with 4 digits would be the same number in base 62 but with 2 characters.

Using base 62 in url conversion with a maximum length of 7 characters allows us to have 62^7 unique values for short links.

So how base 62 conversion works?

We have base 10 number that we want to convert it to base 62. We are going to use the next algorithm:

while(number > 0)
remainder = number % 62
number = number / 62
attach remainder to start of result collection
Enter fullscreen mode Exit fullscreen mode

After that we just need to map numbers from result collection to base62Alphabet = [0,1,2,...,a,b,c...,A,B,C,...].

Let's see how this works with a real example. In this example lets convert number 1000 from base 10 to base 62.

1st iteration:
    number = 1000
    remainder = 1000 % 62 = 8
    number = 1000 / 62 = 16
    result list = [8]
2nd iteration:
    number = 16
    remainder = 16 % 62 = 16
    number = 16 / 62 = 0
    result list = [16,8]
    There is no more iterations since number = 0 after 2nd iteration
Enter fullscreen mode Exit fullscreen mode

Mapping [16,8] to base 62 would be g8. This means that 1000 base10 = g8 base62.

Converting from base 62 to base 10 is also simple:

i = 0
while(i < inputString lenght)
    counter = i + 1
    mapped = base62alphabet.indexOf(inputString[i]) // map character to number based on its index in alphabet
    result = result + mapped * 62^(inputString lenght - counter)
    i++
Enter fullscreen mode Exit fullscreen mode

Real example:

inputString = g8
inputString length = 2
i = 0
result = 0
1st iteration
    counter = 1
    mapped = 16 // index of g in base62alphabet is 16
    result = 0 + 16 * 62^1 = 992
2nd iteration
    counter = 2
    mapped = 8 // index of 8 in base62alphabet is 8
    result = 992 + 8 * 62^1 = 1000
Enter fullscreen mode Exit fullscreen mode

Implementation

Note: Whole solution is on my Github. I implemented this service using Spring boot and MySql.

We are going to use a relational database for the auto-increment feature. The auto-incrementing number is going to be used for base 62 conversion. You can use any other database that has an auto-increment feature.

First, visit Spring initializr and there select Spring Web and MySql Driver. After that click on Generate button and download zip file. Unzip the file and open the project in your favorite IDE.

Every time I start a new project I like to create some folders to logically divide my code. Folders are controller, entity, service, repository, dto and config.

Inside entity folder, let's create Url.java class with 4 attributes - id, longUrl, createdDate, expiresDate.

Notice that there is no short link attribute. We won't save short links, we are going to convert id attribute from base 10 to base 62 every time there is a GET request. This way we are saving space in database.

LongUrl attribute is url we should redirect to once user access short link. Created date is just to see when longUrl is saved (it is not important) and expiresDate is there if a user wants to make a short link unavailable after some time.

Next, let's created BaseService.java in the service folder. BaseService contains methods to convert from base 10 to base 62 and vice versa.

private static final String allowedString = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
private char[] allowedCharacters = allowedString.toCharArray();
private int base = allowedCharacters.length;
Enter fullscreen mode Exit fullscreen mode

Like I mentioned before if we want to use base 62 conversion we need to have base 62 alphabet which in this case is called allowedCharacters. Also, the value of the 'base' variable is calculated from the length of allowed characters in case we want to change allowed characters.

Encode method takes number as input and returns short link. Decode method takes string (short link) as input and returns number. Algorithms should be implemented as explained above.

After that, inside the repository folder, let's create UrlRepository.java which is just an extension of JpaRepository and it gives us a lot of methods like 'findById', 'save', etc. We don't need to add anything else to this.

Then let's create UrlController.java in the controller folder. The controller should have one POST method for creating short links and one GET method for redirecting to original url.

@PostMapping("create-short")
public String convertToShortUrl(@RequestBody UrlLongRequest request) {
    return urlService.convertToShortUrl(request);
}



@GetMapping(value = "{shortUrl}")
public ResponseEntity<Void> getAndRedirect(@PathVariable String shortUrl) {
    var url = urlService.getOriginalUrl(shortUrl);
    return ResponseEntity.status(HttpStatus.FOUND)
    .location(URI.create(url))
    .build();
}
Enter fullscreen mode Exit fullscreen mode

POST method has UrlLongRequest as its request body. It is just class with longUrl and expiresDate attributes.

GET method takes short url as path variable and then gets and redirect to original url.

At the top of the controller is UrlService injected as dependency which will be explained next.

UrlService.java is where most logic is and service which is used by controller. ConvertToShortUrl is used by POST method from the controller. It just creates a new record in the database and gets id. That id is then converted to base 62 short link and returned to controller.

GetOriginalUrl is a method used by GET method from controller. It first converts string to base 10 and the result of that is id. Then it gets record from database by that id and throws an exception if it does not exist. After that it returns original url to the controller.

And that is it. We have implemented everything to have a working url shortening service!

Conclusion

Url shortening service is a simple service that takes long url and converts it to a short link. Once that link is visited, the user is redirected to the original url.

In this article, I explained the theory behind shortening service and showed how to implement one. In the next article, I will explain some 'advanced' things like docker, cache, data partitioning and job to automatically deletes expired links.

Discussion (1)

pic
Editor guide