Jameer Khan

Posted on Jul 31, 2022 • Originally published at stackblogger.com

Extract Metadata Information From URL | JavaScript

#javascript #webdev #typescript #programming

This article was originally published on my blog: Extract metadata information from URL using TypeScript/JavaScript.

A website's metadata describes the site to search engines and social platforms: its title, description, site name, theme color, and other Open Graph properties. If you are building an SEO-related application, you will likely need to read these properties from websites. This article introduces an easy-to-use package for extracting metadata from a URL in a TypeScript or JavaScript project.
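To make the idea of metadata concrete, here is a minimal sketch that reads meta tags from an HTML string. It only illustrates what these properties look like in a page; it is not how link-meta-extractor is implemented, and `readMetaTags` and `sampleHtml` are hypothetical names used for this example only.

```javascript
// Simplified sketch: collect <meta name|property="..." content="..."> pairs
// from an HTML string. Assumes double-quoted attributes with name/property
// placed before content; a real HTML parser handles many more cases.
function readMetaTags(html) {
  const result = {};
  const pattern = /<meta\s+(?:name|property)="([^"]+)"\s+content="([^"]*)"\s*\/?>/gi;
  for (const match of html.matchAll(pattern)) {
    result[match[1]] = match[2]; // key: meta name, value: its content
  }
  return result;
}

// Hypothetical sample markup, not taken from any real site.
const sampleHtml = `
  <head>
    <meta name="description" content="Example description">
    <meta property="og:site_name" content="Example Site">
  </head>`;

console.log(readMetaTags(sampleHtml));
```

A regular expression is enough for a demonstration, but real pages vary too much in attribute order and quoting for this approach to be reliable, which is one reason to reach for a dedicated package.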

Extract Metadata Information from URL

Extract metadata information from any HTTP or HTTPS URL.
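Since only HTTP and HTTPS URLs are supported, it can help to validate input before handing it to the package. The sketch below uses the standard `URL` constructor; `isHttpUrl` is a hypothetical helper written for this example and is not part of link-meta-extractor.

```javascript
// Hypothetical helper (not part of link-meta-extractor): returns true only
// for strings that parse as absolute http: or https: URLs.
function isHttpUrl(value) {
  try {
    const parsed = new URL(value); // throws a TypeError on invalid input
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch {
    return false;
  }
}

console.log(isHttpUrl('https://stackblogger.com'));
console.log(isHttpUrl('ftp://example.com/file'));
console.log(isHttpUrl('not a url'));
```

A check like this lets you reject bad input early with your own error message instead of relying on whatever the extraction call does with it.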

Install link-meta-extractor package

Run the following command to install the package in your existing TypeScript/JavaScript application. You can find the complete package here, and its full source code here.

npm install link-meta-extractor

TypeScript Usage

Work with async/await

To extract metadata information from a website using async/await, use the following code:

import { extractMetadata } from 'link-meta-extractor';

async function extractMeta() {
  const url = 'https://stackblogger.com';
  const metaInformation = await extractMetadata(url);

  console.log(metaInformation);
}

extractMeta();

/*
    {
        title: 'StackBlogger - A blog by programmer for programmers',
        description: 'StackBlogger provide programming Tutorials, Tips, Tricks and HowTo Guides.',
        banner: 'https://stackblogger.com/wp-content/uploads/2021/10/Untitled-7-1.png',
        isItWordpress: true,
        wordpressVersion: 'WordPress 5.8.1'
    }
*/

Work with promise/callback

To extract metadata information from a website using a promise with a .then() callback, use the following code:

import { extractMetadata } from 'link-meta-extractor';

function extractMeta() {
  const url = 'https://stackblogger.com';
  extractMetadata(url).then((metaInformation) => {
    console.log(metaInformation);
  });
}

extractMeta();

/*
    {
        title: 'StackBlogger - A blog by programmer for programmers',
        description: 'StackBlogger provide programming Tutorials, Tips, Tricks and HowTo Guides.',
        banner: 'https://stackblogger.com/wp-content/uploads/2021/10/Untitled-7-1.png',
        isItWordpress: true,
        wordpressVersion: 'WordPress 5.8.1'
    }
*/

JavaScript Usage

Use the following code to extract metadata information from a URL in JavaScript.

Work with async/await

To extract metadata information from a website using async/await, use the following code:

const metaExtractor = require('link-meta-extractor');

async function extractMeta() {
  const url = 'https://stackblogger.com';
  const metaInformation = await metaExtractor.extractMetadata(url);
  console.log(metaInformation);
}

extractMeta();

/*
    {
        title: 'StackBlogger - A blog by programmer for programmers',
        description: 'StackBlogger provide programming Tutorials, Tips, Tricks and HowTo Guides.',
        banner: 'https://stackblogger.com/wp-content/uploads/2021/10/Untitled-7-1.png',
        isItWordpress: true,
        wordpressVersion: 'WordPress 5.8.1'
    }
*/

Work with promise/callback

To extract metadata information from a website using a promise with a .then() callback, use the following code:

const metaExtractor = require('link-meta-extractor');

function extractMeta() {
  const url = 'https://stackblogger.com';
  metaExtractor.extractMetadata(url).then((metaInformation) => {
    console.log(metaInformation);
  });
}

extractMeta();

/*
    {
        title: 'StackBlogger - A blog by programmer for programmers',
        description: 'StackBlogger provide programming Tutorials, Tips, Tricks and HowTo Guides.',
        banner: 'https://stackblogger.com/wp-content/uploads/2021/10/Untitled-7-1.png',
        isItWordpress: true,
        wordpressVersion: 'WordPress 5.8.1'
    }
*/
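The promise examples above only handle the success case. The sketch below adds a `.catch()` handler, assuming the returned promise rejects when a request or parse fails (the package's actual failure behavior is not covered in this article). `extractOrNull` and `fakeExtract` are hypothetical names; the extractor is passed in as a parameter so the example runs without network access, and in a real project you would pass `metaExtractor.extractMetadata` instead.

```javascript
// `extract` is any function that takes a URL and returns a promise,
// for example extractMetadata from link-meta-extractor.
// Resolves to the metadata object, or to null if extraction fails.
function extractOrNull(extract, url) {
  return extract(url).catch((error) => {
    console.error(`Could not extract metadata from ${url}:`, error.message);
    return null;
  });
}

// Stand-in extractor used only for this demonstration (an assumption,
// not the real package): succeeds for https URLs and rejects otherwise.
const fakeExtract = (url) =>
  url.startsWith('https://')
    ? Promise.resolve({ title: 'Example title' })
    : Promise.reject(new Error('unsupported URL'));

extractOrNull(fakeExtract, 'https://example.com').then((meta) => console.log(meta));
extractOrNull(fakeExtract, 'nope').then((meta) => console.log(meta));
```

Returning null keeps the calling code simple when a missing preview is acceptable; rethrow the error instead if a failure should stop the caller.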

Additional Metadata Fields Extraction

The package accepts additional meta field names as optional arguments, so you can extract fields beyond the default ones.

Pass the meta field keys as strings through the function's rest parameter, as in the following code:

import { extractMetadata } from 'link-meta-extractor';

async function extractMeta() {
  const url = 'https://stackblogger.com';
  const metaInformation = await extractMetadata(
    url,
    'og:site_name', // additional field
    'og:image', // additional field
    'robots' // additional field
  );

  console.log(metaInformation);
}

extractMeta();

/* 
    {
        title: 'StackBlogger - A blog by programmer for programmers',
        description: 'StackBlogger provide programming Tutorials, Tips, Tricks and HowTo Guides.',
        banner: 'https://stackblogger.com/wp-content/uploads/2021/10/Untitled-7-1.png',
        isItWordpress: true,
        wordpressVersion: 'WordPress 5.8.1',
        additional: {
            siteName: 'StackBlogger',
            image: 'https://stackblogger.com/wp-content/uploads/2021/10/Untitled-7-1.png',
            robots: 'index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1'
        }
    }
*/
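Judging only from the sample output above, the keys under `additional` look like camelCased versions of the requested names with any prefix such as `og:` removed. The sketch below reproduces that mapping for the three keys shown. `toAdditionalKey` is a hypothetical helper, and the rule is an inference from this one example, not the package's documented behavior.

```javascript
// Hypothetical helper: guesses the key a requested meta field would appear
// under in `additional`, based on the sample output in this article.
function toAdditionalKey(metaName) {
  // Drop everything up to and including the last colon: "og:site_name" -> "site_name".
  const withoutPrefix = metaName.slice(metaName.lastIndexOf(':') + 1);
  // Convert snake_case or kebab-case to camelCase: "site_name" -> "siteName".
  return withoutPrefix.replace(/[_-]([a-z])/g, (_, letter) => letter.toUpperCase());
}

for (const name of ['og:site_name', 'og:image', 'robots']) {
  console.log(`${name} -> ${toAdditionalKey(name)}`);
}
```

If you request other fields, log the returned object once and confirm the actual key names before relying on them.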

Final Say

With this simple package, you can extract metadata information in just one line of code.
Check out my other JavaScript-related articles here.
