How To Add Search Functionality to a NextJS Markdown Blog

#nextjs #typescript #react

My current blogging goal is to write a blog post a week on what I've learnt as I navigate through my software engineering career. As the weeks have gone by, my list of blog posts has grown, and it's starting to make sense for me to think about adding basic search functionality to my NextJS blog.

I started messing around in NextJS to try to figure out how I might go about this, and came across some quirks that I thought would be useful to blog about. Having now figured out an MVP of how search could work for my markdown blog, I figured I'd split this post into two, as it would get a bit long otherwise.

Part 1 will focus on how to set up an internal API within NextJS, in this case, a search endpoint. I'll also describe how to generate the blog posts data cache automatically, which is what the search endpoint will query to return results.

Part 2 will focus on the frontend, and how I'll build out the UI for the React component. I haven't actually figured this part out yet, so it might be a few weeks before I bang this blog post out. 😅

So kicking off with Part 1, let me first describe the basic concept of what I decided to do.

1. Set up an API endpoint (NextJS has this inbuilt, so it's fairly easy to do).
2. Write a script that generates a cache of frontmatter data from all my markdown blog posts.
3. Make this a node script that's accessible through the NextJS server by configuring Webpack.
4. Use the husky package to add a pre-commit hook to run this script automatically whenever we add a new commit (to ensure our cache is always up-to-date).
5. Hook up our API endpoint to filter through the data cache to return relevant results, depending on the user query.

I don't think this is necessarily the best way of doing this, but it is a simple way to do so. This will not scale well with increasing numbers of blog posts, but will serve its purpose for now. I also don't love the fact that a new cache is generated every time I commit to git, considering I might be working on other parts of my site that are completely unrelated to adding a new blog post, but I'll stick with this for now, then optimise later.

Step 1: Set up an API endpoint in NextJS

NextJS has this as an in-built feature so it's relatively straightforward to set this up. In your pages directory, create a new folder called api. Within that, create a new file - I called it search.ts. NextJS treats any file within the pages/api directory as an API endpoint, rather than a page.

This file is basically where you define the request and response for your endpoint. NextJS provides a number of HTTP handlers and middleware to help you structure your endpoint. The documentation has more information on what's available, but what I have below is pretty standard and serves our purpose for now as a dummy endpoint (written in TypeScript).

// pages/api/search.ts

import { NextApiRequest, NextApiResponse } from 'next'

type Data = {
  results: string[],
}

export default (req: NextApiRequest, res: NextApiResponse<Data>) => {
  res.statusCode = 200
  res.setHeader('Content-Type', 'application/json')
  res.end(JSON.stringify({ results: ['post1', 'post2'] }))
}
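
One way to sanity-check a handler like this without starting the NextJS server is to call it with a stub response object. The sketch below is only an illustration: StubResponse and createStubResponse are hypothetical names, and the stub implements just the three members the dummy endpoint touches.

```typescript
// Hypothetical stand-in for the parts of NextApiResponse used by the dummy endpoint.
type StubResponse = {
  statusCode: number
  headers: Record<string, string>
  body: string
  setHeader(name: string, value: string): void
  end(payload: string): void
}

function createStubResponse(): StubResponse {
  const res: StubResponse = {
    statusCode: 0,
    headers: {},
    body: '',
    setHeader(name, value) {
      res.headers[name] = value
    },
    end(payload) {
      res.body = payload
    },
  }
  return res
}

// Same logic as the dummy endpoint, typed against the stub instead of NextJS's types.
function dummySearchHandler(_req: unknown, res: StubResponse): void {
  res.statusCode = 200
  res.setHeader('Content-Type', 'application/json')
  res.end(JSON.stringify({ results: ['post1', 'post2'] }))
}

const stub = createStubResponse()
dummySearchHandler({}, stub)
console.log(stub.statusCode, stub.body)
```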

Step 2: Generate your blog posts cache

Generating a cache of blog post data, which is then used as the basis for your search, is just one way of implementing a search function. What I like about this as a starting point is that it allows me to decide exactly what it is I want to be running my search on.

This is how I thought about generating my cache.

First, figure out what you actually want to pull out from each of your individual markdown blog posts to add to the overall cache. To do this, create a function that maps through all your markdown files, then spits out a JSON string.
Second, write this JSON string to a static file. I'm saving it to the cache directory, which sits in the root directory and is also where I've saved this script.

// cache/cache.js

import fs from 'fs'
import { cachedPostData } from '@/lib/utils'

// First step
const blogContent = await cachedPostData('blog')

// Second step
function createBlogCache(filename) {
  fs.writeFile(`./cache/${filename}.js`, blogContent, function (err) {
    if (err) {
      console.error(err)
      // Bail out so we don't report success after a failed write
      return
    }
    console.log('Blog cache file written')
  })
}

createBlogCache('blog')

You can write your cachedPostData function however you think works best for your purpose, but if you're curious, this is what I've done for now. I already use the getAllPostsWithFrontMatter() function elsewhere in the setup of my NextJS blog (check out this blog post for more info), so I reused this in my newly created cachedPostData() function.

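// lib/utils.ts

import fs from 'fs'
import path from 'path'
import matter from 'gray-matter'

// `root` (the project root path) is defined elsewhere in this file.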

export async function getAllPostsWithFrontMatter(dataType: string) {
  const files = fs.readdirSync(path.join(root, 'data', dataType))
  // @ts-ignore
  return files.reduce((allPosts, postSlug) => {
    const source = fs.readFileSync(path.join(root, 'data', dataType, postSlug), 'utf8')
    const { data } = matter(source)
    return [
      {
        frontMatter: data,
        slug: postSlug.replace('.md', ''),
      },
      ...allPosts,
    ]
  }, [])
}

export async function cachedPostData(dataType: string) {
  const posts = await getAllPostsWithFrontMatter(dataType)
  return `export const cachedPosts = ${JSON.stringify(posts)}`
}
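
To make the output of cachedPostData() concrete, here is a small self-contained sketch of the string it produces. The sample post is made up for illustration, and buildCacheModule is a hypothetical stand-in for the template string in cachedPostData(), so the snippet can run without reading any files.

```typescript
type PostSummary = {
  frontMatter: Record<string, unknown>
  slug: string
}

// Mirrors the template string in cachedPostData(): JS module source, as text.
function buildCacheModule(posts: PostSummary[]): string {
  return `export const cachedPosts = ${JSON.stringify(posts)}`
}

// Made-up sample data, not a real post.
const samplePosts: PostSummary[] = [
  { frontMatter: { title: 'Hello World' }, slug: 'hello-world' },
]

console.log(buildCacheModule(samplePosts))
// export const cachedPosts = [{"frontMatter":{"title":"Hello World"},"slug":"hello-world"}]
```

Because the file is written with a .js extension and contains an export statement, the API route can import cachedPosts directly rather than parsing JSON at request time.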

Step 3: Make your caching script accessible as a node module through NextJS's server

This part was a bit tricky. What I wanted was to be able to run this caching script as a node module, the idea being that I'd then hook it up to run automatically, every time I make a new git commit. To get it to play nicely with NextJS's architecture, I needed to run it through NextJS's compilation process i.e. going through Webpack.

To do this, I needed to make some custom amends to NextJS's Webpack config which you can find in next.config.js. The changes I made were:

Enabling topLevelAwait, which lets a module use await at the top level (as cache.js does). This is still an experimental feature in Webpack at the time of writing.
Adding an extra entry point, so that the script is compiled on next build and output to .next/server/cache.js. This allows us to run the caching script with node .next/server/cache.js.

module.exports = {
  // ...

  webpack: (config, { isServer }) => {
    // Needed if your cache script is asynchronous.
    // Spread the existing experiments so other flags aren't overwritten.
    config.experiments = {
      ...config.experiments,
      topLevelAwait: true,
    }

    if (isServer) {
      return {
        ...config,
        // This is what allows us to add a node script via NextJS's server
        entry() {
          return config.entry().then((entry) => {
            return Object.assign({}, entry, {
              cache: './cache/cache.js',
            })
          })
        },
      }
    }
    return config
  },

  // ...
}
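
The entry() override is easier to reason about in isolation. The sketch below pulls the same merging logic into a standalone function. withExtraEntry and the EntryMap type are hypothetical names used for illustration, and the 'main.js' entry is a placeholder rather than what NextJS actually generates.

```typescript
// Simplified shape of a Webpack entry object: entry name -> file path(s).
type EntryMap = Record<string, string | string[]>

// Wraps an existing async entry function and adds extra entries to its result.
function withExtraEntry(
  originalEntry: () => Promise<EntryMap>,
  extra: EntryMap
): () => Promise<EntryMap> {
  return () => originalEntry().then((entry) => ({ ...entry, ...extra }))
}

// Placeholder for config.entry; the real entries come from NextJS.
const originalEntry = () => Promise.resolve<EntryMap>({ 'main.js': './placeholder.js' })

const entry = withExtraEntry(originalEntry, { cache: './cache/cache.js' })

entry().then((entries) => console.log(Object.keys(entries)))
// [ 'main.js', 'cache' ]
```

The existing entries are preserved and the new one is added alongside them. The key (cache) appears to be what names the compiled file, which is consistent with the script ending up at .next/server/cache.js.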

Step 4: Run the script automatically whenever you commit locally

I'd say this step is optional. I've included it in my workflow for now, but I'm not entirely sure as yet whether I'll keep it. If you're interested in generating the cache automatically, every single time you add a git commit, read on.

A nice, easy to use package that allows you to define pre-commit hooks is husky. Note that they've recently changed the way in which pre-commit hooks are defined, so you might also want to read about the changes here. To set husky up, just follow the installation instructions on the README.

What I then did was to amend my package.json file to actually define the script I want to run on pre-commit (rather than having it hidden away in the .husky directory). What's then needed is to ensure the husky pre-commit file calls this newly defined pre-commit command.

// package.json

"scripts": {
    // ...
    "cache-posts": "node .next/server/cache.js",
    "pre-commit": "yarn cache-posts && git add cache/blog.js"
  },

// Also amend .husky/pre-commit to call pre-commit

npm run pre-commit
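
If regenerating the cache on every commit feels wasteful, one possible refinement is to regenerate only when a staged file lives in the blog posts directory. This is a rough sketch rather than something wired into the setup above: shouldRegenerateCache is a hypothetical helper, the list of staged paths is assumed to come from something like git diff --cached --name-only, and 'data/blog/' is assumed from the directory that getAllPostsWithFrontMatter('blog') reads.

```typescript
// Assumed location of the markdown posts, relative to the repository root.
const POSTS_DIR = 'data/blog/'

// Returns true when at least one staged path is a markdown file under POSTS_DIR.
function shouldRegenerateCache(stagedFiles: string[]): boolean {
  return stagedFiles.some((file) => file.startsWith(POSTS_DIR) && file.endsWith('.md'))
}

console.log(shouldRegenerateCache(['components/Header.tsx'])) // false
console.log(shouldRegenerateCache(['data/blog/new-post.md', 'package.json'])) // true
```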

Step 5: Hook up our API endpoint to read the cache

Alright, final stretch now! Back on pages/api/search.ts, we now need to amend our API to actually read our cache, and filter out the relevant blog post(s) that match a user's search query.

I first defined my blogPosts variable, calling it from the saved cache.
Assuming I'd be passing the user's search query as a param called q, I defined my results by saying, "If a query is present, filter through my blogPosts and check whether there's any word(s) in the post title that matches the query. If no user query is present, just give me back all the blog posts".

import { NextApiRequest, NextApiResponse } from 'next'
import { cachedPosts } from '../../cache/blog'
import { CachedPost } from 'types'

type Data = {
  results: CachedPost[]
}

const blogPosts = cachedPosts as CachedPost[]

export default (req: NextApiRequest, res: NextApiResponse<Data>) => {
  // Lowercase both sides so the match is case-insensitive
  const results = req.query.q
    ? blogPosts.filter((post) =>
        post.frontMatter.title.toLowerCase().includes(req.query.q.toString().toLowerCase())
      )
    : blogPosts
  res.statusCode = 200
  res.setHeader('Content-Type', 'application/json')
  res.end(JSON.stringify({ results }))
}
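
One detail worth handling explicitly is that req.query.q can be a string, an array of strings (when the parameter is repeated, e.g. ?q=a&q=b), or undefined. The sketch below shows one way to normalise it before filtering. normaliseQuery is a hypothetical helper, and taking only the first value of a repeated parameter is an assumption about the desired behaviour.

```typescript
// Collapses the possible shapes of a query parameter into a trimmed, lowercase string.
function normaliseQuery(q: string | string[] | undefined): string {
  const first = Array.isArray(q) ? q[0] : q
  return (first ?? '').trim().toLowerCase()
}

console.log(normaliseQuery('NextJS'))       // nextjs
console.log(normaliseQuery(['React', 'x'])) // react
console.log(normaliseQuery(undefined))      // (empty string)
```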

This is a very basic filtering mechanic for now, but illustrates the point. My cache as defined above, also includes other frontmatter data like tags and blog post descriptions, so I'll likely change how I define my filter going forward, but this works as a proof of concept for now.
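
As an illustration of where the filter could go next, the sketch below matches the query against the title, description and tags. The description and tags field names and the SearchablePost type are assumptions for the example; the actual frontmatter fields may be named differently.

```typescript
type SearchablePost = {
  frontMatter: { title: string; description?: string; tags?: string[] }
  slug: string
}

// Returns posts where the query appears in the title, description, or any tag.
function searchPosts(posts: SearchablePost[], query: string): SearchablePost[] {
  const needle = query.trim().toLowerCase()
  if (!needle) return posts // no query: return everything, as in the endpoint above

  return posts.filter(({ frontMatter }) => {
    const haystack = [
      frontMatter.title,
      frontMatter.description ?? '',
      ...(frontMatter.tags ?? []),
    ]
    return haystack.some((text) => text.toLowerCase().includes(needle))
  })
}

// Made-up sample data for illustration.
const posts: SearchablePost[] = [
  { frontMatter: { title: 'Adding search', tags: ['nextjs'] }, slug: 'adding-search' },
  { frontMatter: { title: 'Hooks', description: 'React basics' }, slug: 'hooks' },
]

console.log(searchPosts(posts, 'NextJS').map((p) => p.slug)) // [ 'adding-search' ]
console.log(searchPosts(posts, 'react').map((p) => p.slug))  // [ 'hooks' ]
```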

If you're interested, this is what my CachedPost type looks like. You can refer to my previous blog post on how I set up my NextJS blog to get deeper into the weeds on the rest of my types.

export type CachedPost = {
  frontMatter: BlogFrontMatter
  slug: string
}

End of Part 1

The next step from here is to then define the frontend component that the user will actually interact with i.e. some kind of input field that allows them to type in their search query. This component should then call our newly defined endpoint e.g. /api/search?q=${query}.
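
When building that request URL on the client, the query should be URL-encoded so that spaces and characters like & don't break the query string. A minimal sketch, where buildSearchUrl is a hypothetical helper name:

```typescript
// Builds the request path for the search endpoint, encoding the user's input.
function buildSearchUrl(query: string): string {
  return `/api/search?q=${encodeURIComponent(query)}`
}

console.log(buildSearchUrl('next js & react'))
// /api/search?q=next%20js%20%26%20react
```

The component could then pass this path to fetch() and read the results array from the JSON response.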

I'm still undecided on how to actually implement the UI/UX on my blog, so I'll leave this post here for now. Any comments / improvement suggestions for me? Let's chat on Twitter or Instagram @bionicjulia