How to implement a robots.txt file in a Nuxt project — Nuxt 2.10

Alan Mac Cormack ・ 2 min read

A robots.txt file lets us control the way in which Google and other search engines explore and index our content.

The first thing a robot does when it reaches your site is check whether a robots.txt file exists; if it does, the robot examines it to understand how to crawl the site.

It’s just a simple public text file in which we tell crawlers which parts of our site should or shouldn’t be crawled and indexed, by allowing and disallowing paths.
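For context, a typical robots.txt file (the paths and sitemap URL here are purely illustrative) looks like this:

```
User-agent: *
Disallow: /admin
Allow: /

Sitemap: https://example.com/sitemap.xml
```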

@nuxtjs/robots is a Nuxt module that injects a middleware to automatically generate a robots.txt file.

Requirements:

  • Nuxt
  • npm or yarn
  • Node

To begin with, we’ll install the module:

npm install @nuxtjs/robots

Once it finishes, we are ready to go.

We’ll add it to our nuxt.config.js file in the modules array and give some options to it. As an example, I’ll use top-level options disallowing only the /user URL for all user-agents:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: {
    UserAgent: '*',
    Disallow: '/user',
  }
}
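With that configuration, the file served at /robots.txt should look roughly like:

```
User-agent: *
Disallow: /user
```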

But what if we want to disallow different URLs for different user-agents?
We can transform our options object into an array and add as many user-agents and URLs as we want:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: [
    {
      UserAgent: 'Googlebot',
      Disallow: '/user',
    },
    {
      UserAgent: '*',
      Disallow: '/admin',
    },
  ]
}

Now let’s say that we don’t want the DuckDuckBot user agent to crawl any URL under /admin.

We can achieve this by adding /* to the URL, like this:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: [
    {
      UserAgent: 'Googlebot',
      Disallow: '/user',
    },
    {
      UserAgent: 'DuckDuckBot',
      Disallow: '/admin/*',
    },
    {
      UserAgent: '*',
      Disallow: '/admin',
    },
  ]
}

And finally, to disallow several URLs for the same user-agent, we pass an array to the Disallow property.
The final code will look like the following:

export default {
  modules: [
    '@nuxtjs/robots'
  ],
  robots: [
    {
      UserAgent: 'Googlebot',
      Disallow: ['/user', '/admin'],
    },
    {
      UserAgent: 'DuckDuckBot',
      Disallow: '/admin/*',
    },
    {
      UserAgent: '*',
      Disallow: '/admin',
    },
  ]
}
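To make the mapping from option objects to the generated file concrete, here is a small hypothetical helper (not part of @nuxtjs/robots, just a sketch of the same serialization) that turns an array of rules into robots.txt text:

```javascript
// Sketch only: serializes rule objects into robots.txt directives,
// one "User-agent" group per rule, separated by blank lines.
function toRobotsTxt (rules) {
  return rules
    .map(({ UserAgent, Disallow }) => {
      // Disallow may be a single path or an array of paths
      const paths = Array.isArray(Disallow) ? Disallow : [Disallow]
      return [
        `User-agent: ${UserAgent}`,
        ...paths.map(path => `Disallow: ${path}`)
      ].join('\n')
    })
    .join('\n\n')
}

console.log(toRobotsTxt([
  { UserAgent: 'Googlebot', Disallow: ['/user', '/admin'] },
  { UserAgent: '*', Disallow: '/admin' }
]))
// User-agent: Googlebot
// Disallow: /user
// Disallow: /admin
//
// User-agent: *
// Disallow: /admin
```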

That’s all! You can build your application and check the file at the /robots.txt URL.

Discussion

KP

@siliconmachine what is the advantage of using this module over simply adding a robots.txt file in static/robots.txt ?

Saul Hardman

@kp in my experience, the biggest benefit of using a module such as @nuxtjs/robots presents itself when you pass it a function rather than a static object. This means that you can decide based on your data whether or not a path should be allowed or disallowed.

I parse an `index: true | false` value from the front matter of my markdown articles and use it to dynamically generate my robots.txt on nuxt generate.
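A minimal sketch of that pattern, assuming a hypothetical `getArticles()` helper that reads front-matter data, might look like:

```javascript
// nuxt.config.js — sketch only; getArticles() is a hypothetical helper
// returning entries such as [{ path: '/posts/draft', index: false }]
export default {
  modules: [
    '@nuxtjs/robots'
  ],
  // @nuxtjs/robots also accepts a function that returns the options,
  // so the rules can be computed from your data at build time
  robots: async () => {
    const articles = await getArticles() // hypothetical data source
    return {
      UserAgent: '*',
      Disallow: articles.filter(a => !a.index).map(a => a.path)
    }
  }
}
```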

I hope that provides a good example?

KP

@saul just seeing your reply as I’ve not been on here. Thanks for the explanation! Would this make a difference for universal-mode sites? Does the module provide the same benefits?

Saul Hardman

I'd imagine so, but only during the nuxt build step. I don't think it would be able to update based on data changes alone whilst the application is running via nuxt start. Looking into the codebase itself or posting an issue with a question should get you a clearer answer though 👍

KP

Sounds good thanks @saul