How to implement a robots.txt file in a Nuxt project — Nuxt 2.10

Alan Mac Cormack ・ 2 min read

A robots.txt file lets us control the way in which Google and other search engines explore and index our content.

The first thing a robot does when it reaches your site is check whether a robots.txt file exists; if it does, the robot examines it to understand how to crawl the site.

It’s just a simple public text file in which we tell crawlers which parts of our site should or shouldn’t be crawled and indexed, by allowing and disallowing paths.
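For context, a typical robots.txt file (the paths and sitemap URL here are purely illustrative) looks like this:

```
User-agent: *
Disallow: /admin
Allow: /

Sitemap: https://example.com/sitemap.xml
```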

@nuxtjs/robots is a Nuxt module that injects a middleware to automatically generate a robots.txt file.

Requirements:

  • Nuxt
  • npm or yarn
  • Node

To begin with, we’ll install the module:

npm install @nuxtjs/robots

Once it finishes, we are ready to go.

We’ll add it to our nuxt.config.js file in the modules array and give some options to it. As an example, I’ll use top-level options disallowing only the /user URL for all user-agents:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: {
    UserAgent: '*',
    Disallow: '/user',
  }
}
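With that configuration, the file served at /robots.txt should look roughly like:

```
User-agent: *
Disallow: /user
```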

But what if we want to disallow different URLs for different user-agents?
We can transform our options object into an array and add as many user-agents and URLs as we want:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: [
    {
      UserAgent: 'Googlebot',
      Disallow: '/user',
    },
    {
      UserAgent: '*',
      Disallow: '/admin',
    },
  ]
}

Now let’s say that we don’t want the DuckDuckBot user agent to crawl any URL under /admin.

We can achieve this by adding /* to the URL, like this:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: [
    {
      UserAgent: 'Googlebot',
      Disallow: '/user',
    },
    {
      UserAgent: 'DuckDuckBot',
      Disallow: '/admin/*',
    },
    {
      UserAgent: '*',
      Disallow: '/admin',
    },
  ]
}

And finally, to disallow several URLs for the same user-agent, we pass an array to the Disallow property.
The final code will look like the following:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: [
    {
      UserAgent: 'Googlebot',
      Disallow: ['/user', '/admin'],
    },
    {
      UserAgent: 'DuckDuckBot',
      Disallow: '/admin/*',
    },
    {
      UserAgent: '*',
      Disallow: '/admin',
    },
  ]
}
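To make the mapping from option objects to the generated file concrete, here is a small hypothetical helper (not part of @nuxtjs/robots, just a sketch of the same serialization) that turns an array of rules into robots.txt text:

```javascript
// Sketch only: serializes rule objects into robots.txt directives,
// one "User-agent" group per rule, separated by blank lines.
function toRobotsTxt (rules) {
  return rules
    .map(({ UserAgent, Disallow }) => {
      // Disallow may be a single path or an array of paths
      const paths = Array.isArray(Disallow) ? Disallow : [Disallow]
      return [
        `User-agent: ${UserAgent}`,
        ...paths.map(path => `Disallow: ${path}`)
      ].join('\n')
    })
    .join('\n\n')
}

console.log(toRobotsTxt([
  { UserAgent: 'Googlebot', Disallow: ['/user', '/admin'] },
  { UserAgent: '*', Disallow: '/admin' }
]))
// User-agent: Googlebot
// Disallow: /user
// Disallow: /admin
//
// User-agent: *
// Disallow: /admin
```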

That’s all! You can build your application and check the file at the /robots.txt URL.

Discussion

KP

@siliconmachine what is the advantage of using this module over simply adding a robots.txt file in static/robots.txt ?

Saul Hardman

@kp in my experience, the biggest benefit of using a module such as @nuxtjs/robots presents itself when you pass it a function rather than a static object. This means that you can decide based on your data whether or not a path should be allowed or disallowed.

I parse an `index: true | false` value from the front matter of my markdown articles and use it to dynamically generate my robots.txt on nuxt generate.
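A minimal sketch of that pattern, assuming a hypothetical `getArticles()` helper that reads front-matter data, might look like:

```javascript
// nuxt.config.js — sketch only; getArticles() is a hypothetical helper
// returning entries such as [{ path: '/posts/draft', index: false }]
export default {
  modules: [
    '@nuxtjs/robots'
  ],
  // @nuxtjs/robots also accepts a function that returns the options,
  // so the rules can be computed from your data at build time
  robots: async () => {
    const articles = await getArticles() // hypothetical data source
    return {
      UserAgent: '*',
      Disallow: articles.filter(a => !a.index).map(a => a.path)
    }
  }
}
```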

I hope that provides a good example?

KP

@saul just seeing your reply as I’ve not been on here. Thanks for the explanation! Would this make a difference for universal-mode sites? Does the module provide the same benefits?

Saul Hardman

I'd imagine so, but only during the nuxt build step. I don't think it would be able to update based on data changes alone whilst the application is running via nuxt start. Looking into the codebase itself or posting an issue with a question should get you a clearer answer though 👍

KP

Sounds good thanks @saul