DEV Community

Kacper Włodarczyk



[Python] How to get the full URL from a shortened URL

When you scrape a website, you may find that it returns shortened URLs that point to sources on other websites.

For example, the link https://upflix.pl/r/Qb64Ar consists of a domain and some random characters. A shortened link works by redirecting you to another page, so the status_code that our request returns is 302.
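As a small illustration of what that status code means, here is a sketch using only the standard library. The `REDIRECT_CODES` set and the `describe` helper are names made up for this example; the article's link answers with 302, but other shorteners may use a different redirect code such as 301, 307, or 308.

```python
from http import HTTPStatus

# Status codes that normally come with a Location header pointing elsewhere.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def describe(status_code: int) -> str:
    """Return the code, its standard phrase, and whether it is a redirect."""
    kind = "redirect" if status_code in REDIRECT_CODES else "not a redirect"
    return f"{status_code} {HTTPStatus(status_code).phrase}: {kind}"


print(describe(302))  # 302 Found: redirect
print(describe(200))  # 200 OK: not a redirect
```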

Sometimes we need the full URL. We can get it with a few lines of Python code and the requests library.

pip install requests

We will use the head method for this.
It is similar to get, except that it returns only the headers and no content.

response = requests.head(short_url)

After executing the request, we can check the headers that were returned.
They contain information such as:

  • date
  • type of website content
  • character encoding
  • the full link (under the location key), plus a lot of other information, as you can see below.
{'Date': 'Thu, 16 Nov 2023 00:43:13 GMT', 'Content-Type': 'text/html; charset=UTF-8', 'Connection': 'keep-alive', 'location': 'https://www.imdb.com/title/tt14060708/', 'vary': 'Origin', 'x-powered-by': 'PHP/7.3.33', 'x-frame-options': 'SAMEORIGIN', 'CF-Cache-Status': 'DYNAMIC', 'Report-To': '{"endpoints":[{"url":"https:\\/\\/a.nel.cloudflare.com\\/report\\/v3?s=bgvCMcMQg1ZkjanlgqzemKUHHthalhb%2FAT72Q58O8a22eFmkeb%2FyeeIMfKkGFwt8WmkMB6dv28F1G2CdH134Kilk%2BcdQNweIZ3O%2FN9KlQf1A2VF%2Bm3yYT89rvjU%3D"}],"group":"cf-nel","max_age":604800}', 'NEL': '{"success_fraction":0,"report_to":"cf-nel","max_age":604800}', 'Strict-Transport-Security': 'max-age=15552000; includeSubDomains; preload', 'X-Content-Type-Options': 'nosniff', 'Server': 'cloudflare', 'CF-RAY': '826bb2ce597ebfda-WAW', 'alt-svc': 'h3=":443"; ma=86400'}
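To show how the full link can be pulled out of headers like these, here is a minimal sketch that works on a plain dict, so it runs without any network access. The helper name `location_from_headers` is hypothetical. It also assumes the location value might be a relative path, which `urljoin` resolves against the short URL; in the output above the value is already absolute.

```python
from typing import Mapping, Optional
from urllib.parse import urljoin


def location_from_headers(short_url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the redirect target found in headers, or None if there is none."""
    # Plain dicts are case-sensitive, so compare lowercased names here;
    # the headers object that requests returns already ignores case.
    for name, value in headers.items():
        if name.lower() == "location":
            # urljoin leaves absolute URLs untouched and resolves relative ones.
            return urljoin(short_url, value)
    return None


headers = {
    "Content-Type": "text/html; charset=UTF-8",
    "location": "https://www.imdb.com/title/tt14060708/",
}
print(location_from_headers("https://upflix.pl/r/Qb64Ar", headers))
# https://www.imdb.com/title/tt14060708/
```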

Full code

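import requests
from typing import Optional


def get_full_url(short_url: str) -> Optional[str]:
    """Return the redirect target of short_url, or None if it does not redirect."""
    # HEAD fetches only the headers; requests does not follow redirects for HEAD by default.
    response = requests.head(short_url)
    if response.status_code == 302:
        # The location header holds the full URL.
        return response.headers["location"]

    return None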

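The function above resolves a single 302 hop. If a short link passes through several redirects, the same idea can be repeated until a non-redirect response arrives. The sketch below is one possible way to do that, not part of the original code: `resolve_chain`, the `fetch` callable, and the `short.example` URLs are all assumptions made for illustration, and the network is replaced by a small lookup table so the example runs offline.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical callable: given a URL, return (status_code, location header or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def resolve_chain(url: str, fetch: Fetch, max_hops: int = 10) -> str:
    """Follow redirects one hop at a time and return the last URL reached."""
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in (301, 302, 303, 307, 308) or not location:
            return url  # not a redirect, so this is the final URL
        url = urljoin(url, location)  # the location value may be relative
    raise RuntimeError(f"more than {max_hops} redirects")


# Stand-in for the network so the sketch runs without any requests being sent.
fake_responses = {
    "https://short.example/a": (302, "https://short.example/b"),
    "https://short.example/b": (301, "/final"),
    "https://short.example/final": (200, None),
}


def fake_fetch(url: str) -> Tuple[int, Optional[str]]:
    return fake_responses[url]


print(resolve_chain("https://short.example/a", fake_fetch))
# https://short.example/final
```

With requests, `fetch` could be a small wrapper that calls `requests.head(url)` and returns `(response.status_code, response.headers.get("location"))`. The hop limit is there so that a redirect loop cannot run forever.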


Top comments (2)

Ivan Zakutnii
import requests

full_url = requests.get("https://shorturl.at/cikuPxx").url

It will return either the full URL or an alternative string, depending on the shortener provider. Typically, this alternative string is the shortener service URL if the link is broken.

Recluse • Edited

The Optional import from typing is missing, and so is the colon after Optional[str]:

from typing import Optional
import requests
def get_full_url(short_url: str) -> Optional[str]:
