DEV Community

Kieran


Getting website meta tags with node.js!

Recently I needed a way to get a website's meta tags for a service I was building, so I searched GitHub for a solution. Unfortunately, everything I found either didn't work or was very slow. So here we are.


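  • First, install node-fetch and cheerio with npm: npm install node-fetch@2 cheerio (version 2 of node-fetch is the one that works with require; version 3 is ESM-only).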

  • Next, we need to fetch the HTML of the website we are getting the meta tags from.

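// node-fetch v2 works with require(); v3 is ESM-only and needs import instead
const fetch = require("node-fetch");

fetch("https://discord.com")
    .then(result => result.text())
    .then(html => {
        console.log(html);
    }).catch(error => {
        console.log(error);
    });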
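One thing the snippet above doesn't do is check the HTTP status, so a 404 or 500 error page would be treated like the real site. Here is a minimal sketch of a wrapper that rejects on non-2xx responses. Note that fetchHtml is a hypothetical helper name of my own, and fetchImpl is assumed to be whatever fetch function you use (node-fetch, or the global fetch in recent Node versions).

```javascript
// Hypothetical helper, not part of node-fetch: fetch a page and resolve with
// its HTML, rejecting when the server answers with a non-2xx status.
// `fetchImpl` is assumed to be a fetch-compatible function.
async function fetchHtml(url, fetchImpl) {
    const result = await fetchImpl(url);
    if (!result.ok) {
        throw new Error(`Request to ${url} failed with status ${result.status}`);
    }
    return result.text();
}
```

You would call it as fetchHtml("https://discord.com", fetch) and chain .then and .catch on it just like before.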
  • Now we need to pass this HTML into Cheerio, which lets us find meta tags by their attributes (lines marked + are new).

 const fetch = require("node-fetch");
+const cheerio = require("cheerio");

 fetch("https://discord.com")
    .then(result => result.text())
    .then(html => {
        console.log(html);
+       const $ = cheerio.load(html);
    }).catch(error => {
        console.log(error);
    });
  • We do this with code like the following. The selector finds meta elements whose property attribute is "og:title", and .attr('content') returns the chosen element's content attribute.

$('meta[property="og:title"]').attr('content')
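Since a site may expose the same information under several different tags, it can help to try a list of selectors in order and take the first one that has a value. This is only a small sketch: firstAttr is a hypothetical helper name, and $ is assumed to be the function returned by cheerio.load(html).

```javascript
// Hypothetical helper: return the first non-empty attribute value found
// among several selectors, or undefined when none of them match.
// `$` is assumed to be the result of cheerio.load(html).
function firstAttr($, selectors, attribute) {
    for (const selector of selectors) {
        const value = $(selector).attr(attribute);
        if (value) {
            return value;
        }
    }
    return undefined;
}
```

For example, firstAttr($, ['meta[property="og:description"]', 'meta[name="description"]'], 'content') prefers the Open Graph description and only then falls back to the plain description tag.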

  • After doing this for all the meta tags I needed, I ended up with this:

 const fetch = require("node-fetch");
 const cheerio = require("cheerio");

 fetch("https://discord.com")
    .then(result => result.text())
    .then(html => {
        console.log(html);
        const $ = cheerio.load(html);
+       const title = $('meta[property="og:title"]').attr('content') || $('title').text() || $('meta[name="title"]').attr('content');
+       const description = $('meta[property="og:description"]').attr('content') || $('meta[name="description"]').attr('content');
+       const url = $('meta[property="og:url"]').attr('content');
+       const site_name = $('meta[property="og:site_name"]').attr('content');
+       const image = $('meta[property="og:image"]').attr('content') || $('meta[property="og:image:url"]').attr('content');
+       const icon = $('link[rel="icon"]').attr('href') || $('link[rel="shortcut icon"]').attr('href');
+       const keywords = $('meta[property="og:keywords"]').attr('content') || $('meta[name="keywords"]').attr('content');
+       // do something with the variables
    }).catch(error => {
        console.log(error);
    });
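One thing to watch out for when using these values: icon and image are often relative paths such as /favicon.ico, which are only useful once resolved against the address of the page. Here is a minimal sketch using Node's built-in URL class. toAbsolute is a hypothetical helper name, and the example values are made up for illustration rather than taken from a real page.

```javascript
// Hypothetical helper: resolve a possibly relative href against the page URL.
// Returns undefined when the value is missing or cannot be parsed.
function toAbsolute(href, pageUrl) {
    if (!href) {
        return undefined;
    }
    try {
        return new URL(href, pageUrl).href;
    } catch (error) {
        return undefined;
    }
}

// Illustrative values only, not taken from a real page.
console.log(toAbsolute("/favicon.ico", "https://example.com/blog/post"));
// prints "https://example.com/favicon.ico"
console.log(toAbsolute("https://cdn.example.com/cover.png", "https://example.com"));
// prints "https://cdn.example.com/cover.png"
```

Values that are already absolute pass through unchanged, so it is safe to run every extracted link through the same helper.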

You can see the finished product here and view the source on GitHub. A node module is also available here!


Sorry if this article sucked, it was my first time writing on this blog.
