Martin Beierling-Mutz

Auto-Generate sitemap.xml in Next.js

Hurray! You created all the components and styling for your beautiful and performant Next.js website. What now?

There are some key files you want to serve from the root of your exported package, but out of the box Next.js only supports copying files from the /static folder. So how do you add, for example, a sitemap.xml, ideally generated automatically so that it is always up to date?

Let me show you how you could set this up for a Next.js project that is exported with yarn export.

Basic sitemap.xml structure

First, we'll need to take a look at the info a basic sitemap needs to have.

A list of...

  • URLs to each available page,
  • and an accompanying date, to let the search engine bot know where to find a page and when it was last changed.

That's it! If you want more info, check out Google's "Build and submit a sitemap" guide.

Gathering needed info

Before we can write the file into our exported /out folder, we first have to get the info we need: page URLs and last-modified dates.

To do this, I've built this function, which returns all files' paths inside the /pages folder:

const fs = require("fs");

module.exports = () => {
  const fileObj = {};

  const walkSync = dir => {
    // Get all files of the current directory & iterate over them
    const files = fs.readdirSync(dir);
    files.forEach(file => {
      // Construct whole file-path & retrieve file's stats
      const filePath = `${dir}${file}`;
      const fileStat = fs.statSync(filePath);

      if (fileStat.isDirectory()) {
        // Recurse one folder deeper
        walkSync(`${filePath}/`);
      } else {
        // Construct this file's pathname excluding the "pages" folder & its extension
        const cleanFileName = filePath
          .substr(0, filePath.lastIndexOf("."))
          .replace("pages/", "");

        // Add this file to `fileObj`
        fileObj[`/${cleanFileName}`] = {
          page: `/${cleanFileName}`,
          lastModified: fileStat.mtime
        };
      }
    });
  };

  // Start recursion to fill `fileObj`
  walkSync("pages/");

  return fileObj;
};

This will return an object, which looks like this for my website at the moment of writing:

{
  "/blog/auto-generate-sitemap-in-next-js": {
    "page": "/blog/auto-generate-sitemap-in-next-js",
    "lastModified": "2018-10-03T00:25:30.806Z"
  },
  "/blog/website-and-blog-with-next-js": {
    "page": "/blog/website-and-blog-with-next-js",
    "lastModified": "2018-10-01T17:04:52.150Z"
  },
  "/blog": {
    "page": "/blog",
    "lastModified": "2018-10-03T00:26:02.134Z"
  },
  "/index": {
    "page": "/index",
    "lastModified": "2018-10-01T17:04:52.153Z"
  }
}

As you can see, we have all the info we need to build our sitemap!
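One detail worth noting in the object above is the "/index" key: Next.js serves pages/index.js at "/", so a sitemap would usually list the home page as "/" rather than "/index". The following is a minimal sketch of how a file path could be mapped to its route. The helper name toRoute and its pagesDir parameter are assumptions made for illustration, not part of the original function.

```javascript
const path = require("path");

// Hypothetical helper (not from the original post): turn a file path inside
// the pages folder into the route it is served under.
function toRoute(filePath, pagesDir = "pages/") {
  // Drop the pages folder prefix, if present
  const relative = filePath.startsWith(pagesDir)
    ? filePath.slice(pagesDir.length)
    : filePath;

  // Split into folder part and file name without extension
  const { dir, name } = path.posix.parse(relative);
  const segments = dir ? dir.split("/") : [];

  // "index" files map to their parent folder: pages/index.js -> "/"
  if (name !== "index") segments.push(name);

  return `/${segments.join("/")}`;
}

console.log(toRoute("pages/index.js")); // prints /
console.log(toRoute("pages/blog.js")); // prints /blog
```

Inside walkSync, a helper like this could replace the substr/replace chain that builds cleanFileName, assuming forward slashes in the paths as in the original code.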

Creating the file when exporting

In Next.js, when you create your static files package, you'll typically run yarn build && yarn export. We want to hook in after the export to create the sitemap.xml file in the /out folder.

To hook into any script defined in package.json, we can add another script with the same name, prefixed with "post".

The new package.json scripts section will look like this:

...
  "scripts": {
    "dev": "next",
    "build": "next build",
    "start": "next start",
    "export": "next export",
    "postexport": "node scripts/postExport.js"
  },
...

I chose to create a new folder "scripts" and create the "postExport.js" file in there. This script will now run after every "yarn export" call.

Generate the sitemap.xml contents

This scripts/postExport.js file will utilize the function we created previously to get all needed info:

const pathsObj = getPathsObject();

Then, we'll create the sitemap.xml content & file:

const sitemapXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> 
  ${Object.keys(pathsObj).map(
    path => `<url>
    <loc>https://embiem.me${path}</loc>
    <lastmod>${
      formatDate(new Date(pathsObj[path].lastModified))
    }</lastmod>
  </url>`
  ).join("")}
</urlset>`;

fs.writeFileSync("out/sitemap.xml", sitemapXml);

That's it! Well, almost. I used a formatDate function to get the desired string format for our date.

You could instead call .toISOString() on the date and keep only the first ten characters (note that this yields the UTC date), or use a library like date-fns. I decided to copy a working solution from the web:

module.exports = function formatDate(date) {
  var d = new Date(date),
    month = "" + (d.getMonth() + 1),
    day = "" + d.getDate(),
    year = d.getFullYear();

  if (month.length < 2) month = "0" + month;
  if (day.length < 2) day = "0" + day;

  return [year, month, day].join("-");
};
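As an alternative to the helper above, here is a minimal sketch of the ISO-string approach. The name formatDateUtc is an assumption. Unlike formatDate, which uses the machine's local time zone, this version always reports the UTC date, so the two can differ by a day for files modified around midnight.

```javascript
// Sketch: format a Date (such as the mtime from fs.statSync) or an ISO string
// as YYYY-MM-DD. toISOString() always returns UTC in the form
// YYYY-MM-DDTHH:mm:ss.sssZ, so the first ten characters are the date.
function formatDateUtc(date) {
  return new Date(date).toISOString().slice(0, 10);
}

console.log(formatDateUtc("2018-10-03T00:25:30.806Z")); // prints 2018-10-03
```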

Now run yarn export and a file at out/sitemap.xml appears!
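To see the pieces in one place, here is a self-contained sketch of the sitemap generation step. The function names buildSitemapXml and escapeXml, the baseUrl parameter, and the example.com domain are assumptions for illustration. It joins the entries explicitly so no stray separators end up in the output, and it entity-escapes the URL, which the sitemap protocol requires for characters such as & and <.

```javascript
// Escape the five characters the sitemap protocol requires to be entity-escaped
function escapeXml(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: builds the sitemap from an object shaped like the one
// returned by the path-gathering function, plus an assumed `baseUrl`
function buildSitemapXml(pathsObj, baseUrl) {
  const urls = Object.keys(pathsObj)
    .map(route => {
      // YYYY-MM-DD in UTC
      const lastmod = new Date(pathsObj[route].lastModified)
        .toISOString()
        .slice(0, 10);
      return `  <url>
    <loc>${escapeXml(baseUrl + route)}</loc>
    <lastmod>${lastmod}</lastmod>
  </url>`;
    })
    .join("\n"); // join explicitly; a bare array would be joined with commas

  return `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
${urls}
</urlset>`;
}

const xml = buildSitemapXml(
  { "/blog": { page: "/blog", lastModified: "2018-10-03T00:26:02.134Z" } },
  "https://example.com"
);
console.log(xml);
```

In a post-export script, the returned string could then be written with fs.writeFileSync("out/sitemap.xml", xml), as in the original snippet.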

Challenge! Create robots.txt

Based on this, it should be easy for you to create a robots.txt with your desired contents now.

If you want to know how I did it, check out the scripts folder in my website's repo.
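If you just want a starting point, here is a minimal sketch (not necessarily how it is done in that repo). The helper names buildRobotsTxt and writeRobotsTxt, the outDir default, and the permissive "allow everything" rules are all assumptions; adjust the rules to your own needs.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: a permissive robots.txt that also points crawlers
// to the sitemap via the Sitemap directive
function buildRobotsTxt(sitemapUrl) {
  return ["User-agent: *", "Allow: /", `Sitemap: ${sitemapUrl}`, ""].join("\n");
}

// Hypothetical helper: writes the file into the export folder (assumed "out")
function writeRobotsTxt(sitemapUrl, outDir = "out") {
  fs.mkdirSync(outDir, { recursive: true });
  fs.writeFileSync(path.join(outDir, "robots.txt"), buildRobotsTxt(sitemapUrl));
}

console.log(buildRobotsTxt("https://example.com/sitemap.xml"));
```

Calling writeRobotsTxt from the same post-export script keeps both files in sync with every export.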

Let me know whether you'd solve this differently.

Afterword

This was my first post in this community 👋. I plan to post much more in the future. You can find the original post on my blog.

Top comments (5)

Remi Vledder

Another example for a Next.js site. I've adjusted it slightly to also handle sub-folders:

const fs = require('fs');
const path = require('path');

const removeFileExt = (dirname) => {
  const parsedPath = path.parse(dirname);
  if (parsedPath.dir) {
    return `${parsedPath.dir}/${parsedPath.name}`;
  }
  return parsedPath.name;
};

module.exports = () => {
  const fileObj = {};

  const walkSync = (dir) => {
    const files = fs.readdirSync(dir);
    files.forEach((file) => {

      // specific for Next, skip files starting with '_' or between brackets '[...]'
      const partialNextFile = /^_\w*/;
      const dynamicNextFile = /^\[(.*?)\]/;
      if (partialNextFile.test(file) || dynamicNextFile.test(file)) {
        return;
      }

      const fullPath = `${dir}${file}`;
      const fileStat = fs.statSync(fullPath);

      if (fileStat.isDirectory()) {
        walkSync(`${fullPath}/`);
      } else {
        const parsedPath = path.parse(fullPath);
        if (!parsedPath.ext) {
          // skip system files
          return;
        }

        const filePathAry = path.format(parsedPath).split(path.sep);
        filePathAry.splice(0, 3);

        const fileDir = filePathAry.join('/');
        const cleanPath = removeFileExt(fileDir);

        fileObj[`/${cleanPath}`] = {
          page: `/${cleanPath}`,
          lastModified: fileStat.mtime,
        };
      }
    });
  };

  walkSync('./src/pages/');
  walkSync('./src/posts/');

  return fileObj;
};
 
Remi Vledder

Very nice! I've made some changes in my project; perhaps this is of use to someone.

In the walkSync method I used the 'dir' parameter to replace pages/. This makes the method reusable for any other paths such as posts.

// ... not shown for brevity
const cleanFileName = filePath.substr(0, filePath.lastIndexOf('.')).replace(dir, '');
// ... not shown for brevity

Also, by using an early return I filtered out any files that do not compile to an actual file:

// ... not shown for brevity
files.forEach((file) => {
      const partialNextFile = /^_\w*/; // any filenames starting with _*
      const dynamicNextFile = /^\[(.*?)\]/; // any filenames between brackets [*]
      if (partialNextFile.test(file) || dynamicNextFile.test(file)) {
        // skip
        return;
      }
// ... not shown for brevity
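To check what these two patterns actually match, here is a small standalone sketch. The shouldSkip wrapper and the sample file names are assumptions added for illustration; the regular expressions are the ones from the snippet above.

```javascript
const partialNextFile = /^_\w*/; // names starting with "_", e.g. _app.js
const dynamicNextFile = /^\[(.*?)\]/; // names starting with a [bracketed] segment

// Hypothetical wrapper around the two tests
const shouldSkip = file =>
  partialNextFile.test(file) || dynamicNextFile.test(file);

["_app.js", "_document.js", "[slug].js", "index.js", "blog"].forEach(file => {
  console.log(file, shouldSkip(file) ? "skipped" : "kept");
});
```

Because the check runs on every directory entry, a folder whose name starts with "_" or "[" is skipped together with everything inside it.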
 
Guillermo Gonzalez

Thanks! I think there's a bug in the code:

${Object.keys(pathsObj).map(
  path => `<url>
    <loc>https://embiem.me${path}</loc>
    <lastmod>${
      formatDate(new Date(pathsObj[path].lastModified))
    }</lastmod>
  </url>`
)}

Should be:

${Object.keys(pathsObj).map(
  path => `<url>
    <loc>https://embiem.me${path}</loc>
    <lastmod>${
      formatDate(new Date(pathsObj[path].lastModified))
    }</lastmod>
  </url>`
).join("")}

The original version is outputting commas between each XML element.
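The commas come from how JavaScript converts an array to a string inside a template literal: the array's toString() is used, which joins the elements with ",". A minimal sketch with made-up sample entries:

```javascript
const items = ["<url>a</url>", "<url>b</url>"];

// Interpolating the array directly coerces it to a string, joined by commas
const withCommas = `${items}`;

// Joining explicitly with an empty separator removes them
const withoutCommas = `${items.join("")}`;

console.log(withCommas); // prints <url>a</url>,<url>b</url>
console.log(withoutCommas); // prints <url>a</url><url>b</url>
```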

chadarri

Hello Martin,

I am trying to add the sitemap and robots.txt files to my Next.js web app. I followed your post and generated the two files without any problems, but what is the next step to make the files accessible via URL in production?

The two files are in the out directory, but they are not accessible via their URLs online.

Sorry for my bad english, I hope you will understand my problem.

Charlotte

Martin Beierling-Mutz

Hey. This really depends on your deployment setup. As long as all files in the out folder are pushed to your static file host, you should be good.