
Vincent A. Cicirello


generate-sitemap action

My Workflow

As someone who is not fond of web development, I find automating any part of the maintenance of my personal website a huge plus. That is especially true of something like a sitemap, which is not critical to the content but can help search engines determine which parts of a site should be crawled more often. I had been using a script that I ran locally (when I remembered) whenever I added or updated content, before pushing to GitHub Pages.

I decided to reimplement my sitemap generator in Python as a GitHub Action, generate-sitemap, which I've also shared in the GitHub Marketplace in case it is useful to others. You can find complete details of the available inputs and outputs, along with sample workflows, in the README of the repo linked above.

Submission Category:

Maintainer Must-Haves

YAML File or Link to Code

Here is a basic example workflow template that combines my generate-sitemap action with other actions. When content is pushed to the repository, generate-sitemap walks the directory structure, uses last commit dates to determine when each page was last modified, skips any HTML files that have noindex directives, and outputs an XML sitemap. Another action then generates a pull request if the sitemap changed.

name: Generate xml sitemap

on:
  push:
    branches:
      - master

jobs:
  sitemap_job:
    runs-on: ubuntu-latest
    name: Generate a sitemap
    steps:
    - name: Checkout the repo
      uses: actions/checkout@v2
      with:
        fetch-depth: 0 
    - name: Generate the sitemap
      id: sitemap
      uses: cicirello/generate-sitemap@v1.5.0
      with:
        base-url-path: https://THE.URL.TO.YOUR.PAGE/
    - name: Create Pull Request
      uses: peter-evans/create-pull-request@v3
      with:
        title: "Automated sitemap update"
        body: > 
          Sitemap updated by the [generate-sitemap](https://github.com/cicirello/generate-sitemap) 
          GitHub action. Automated pull-request generated by the 
          [create-pull-request](https://github.com/peter-evans/create-pull-request) GitHub action.
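To make the steps described above more concrete, here is a minimal Python sketch of the same idea: walk a directory, skip HTML files with a noindex directive, look up each page's last commit date, and emit an XML sitemap. It is an illustration under stated assumptions, not the action's actual source code. The function names, the noindex regular expression, and the returned tuple are all hypothetical. The `git log` lookup would only see the most recent commit in a shallow clone, which is presumably why the workflow checks out with fetch-depth: 0.

```python
import os
import re
import subprocess
from xml.sax.saxutils import escape

# Assumption: a simplified pattern for <meta name="robots" content="...noindex...">.
# Real pages can express noindex in other ways that this sketch does not catch.
NOINDEX = re.compile(
    r'<meta\s+name\s*=\s*["\']robots["\']\s+'
    r'content\s*=\s*["\'][^"\']*noindex',
    re.IGNORECASE,
)


def has_noindex(path):
    """Return True if the file contains a robots noindex meta tag."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return NOINDEX.search(f.read()) is not None


def last_commit_date(path):
    """Return the ISO 8601 date of the last commit touching path, or ''."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%cI", "--", path],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def collect_pages(root):
    """Walk root and return (sorted list of indexable pages, excluded count)."""
    included, excluded = [], 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".html", ".htm")):
                continue
            path = os.path.join(dirpath, name)
            if has_noindex(path):
                excluded += 1
            else:
                included.append(path)
    return sorted(included), excluded


def build_sitemap(root, base_url, lastmod=last_commit_date):
    """Return (sitemap XML text, number of URLs, number of excluded pages)."""
    pages, excluded = collect_pages(root)
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for path in pages:
        rel = os.path.relpath(path, root).replace(os.sep, "/")
        url = base_url.rstrip("/") + "/" + rel
        lines.append("<url>")
        lines.append("<loc>" + escape(url) + "</loc>")
        date = lastmod(path)
        if date:  # omit <lastmod> when no commit date is available
            lines.append("<lastmod>" + escape(date) + "</lastmod>")
        lines.append("</url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n", len(pages), excluded
```

Calling `build_sitemap(".", "https://THE.URL.TO.YOUR.PAGE/")` from the root of a full clone would return the sitemap text along with the two counts. Passing a different `lastmod` function makes the sketch usable outside a git repository.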

Here are a couple of specific example workflows where I am using the generate-sitemap action.

The documentation for the Chips-n-Salsa library is maintained in the docs directory of its repository. This workflow shows the inputs to use in a case like this, where the website is not at the root of the repo:

name: Generate API sitemap

on:
  push:
    branches:
      - development

jobs:
  sitemap_job:
    runs-on: ubuntu-latest
    name: Generate a sitemap
    steps:
    - name: Checkout the repo
      uses: actions/checkout@v2
      with:
        fetch-depth: 0 
    - name: Generate the sitemap
      id: sitemap
      uses: cicirello/generate-sitemap@v1.5.0
      with:
        base-url-path: https://chips-n-salsa.cicirello.org/
        path-to-root: docs
    - name: Output stats
      run: |
        echo "sitemap-path = ${{ steps.sitemap.outputs.sitemap-path }}"
        echo "url-count = ${{ steps.sitemap.outputs.url-count }}"
        echo "excluded-count = ${{ steps.sitemap.outputs.excluded-count }}"
    - name: Create Pull Request
      uses: peter-evans/create-pull-request@v3.3.0
      with:
        title: "Automated sitemap update"
        body: > 
          Sitemap was updated by [generate-sitemap](https://github.com/cicirello/generate-sitemap) 
          GitHub action. Automated pull-request generated by 
          [create-pull-request](https://github.com/peter-evans/create-pull-request) GitHub action.
        commit-message: "[generate-sitemap] [create-pull-request] automated change."
        delete-branch: true
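The effect of path-to-root in the workflow above can be pictured as stripping the site root from each file path before appending it to base-url-path. The short Python sketch below shows that mapping. The function name `page_url` is hypothetical, and the action may treat some files (index pages, for instance) differently from this simple rule.

```python
from pathlib import PurePosixPath


def page_url(file_path, path_to_root, base_url):
    """Map a repo-relative file path to its public URL.

    Raises ValueError if file_path is not located under path_to_root.
    """
    # relative_to drops the site root (e.g. "docs") from the front of the path.
    rel = PurePosixPath(file_path).relative_to(path_to_root)
    return base_url.rstrip("/") + "/" + rel.as_posix()


# A page stored under docs/ is served from the root of the site.
print(page_url("docs/api/index.html", "docs",
               "https://chips-n-salsa.cicirello.org/"))
```

When the website is at the root of the repo, passing "." as the root leaves the relative path unchanged.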

I'm also using it for my personal website, which was my original intention, with this workflow:

name: Generate sitemap 

on:
  push:
    branches:
      - development

jobs:
  sitemap_job:
    runs-on: ubuntu-latest
    name: Generate a sitemap
    steps:
    - name: Checkout the repo
      uses: actions/checkout@v2
      with:
        fetch-depth: 0 
    - name: Generate the sitemap
      id: sitemap
      uses: cicirello/generate-sitemap@v1.5.0
      with:
        base-url-path: https://www.cicirello.org/
    - name: Output stats
      run: |
        echo "sitemap-path = ${{ steps.sitemap.outputs.sitemap-path }}"
        echo "url-count = ${{ steps.sitemap.outputs.url-count }}"
        echo "excluded-count = ${{ steps.sitemap.outputs.excluded-count }}"
    - name: Create Pull Request
      uses: peter-evans/create-pull-request@v3.3.0
      with:
        title: "Automated sitemap update"
        body: > 
          Sitemap was updated by [generate-sitemap](https://github.com/cicirello/generate-sitemap) 
          GitHub action. Automated pull-request generated by 
          [create-pull-request](https://github.com/peter-evans/create-pull-request) GitHub action.
        commit-message: "[generate-sitemap] [create-pull-request] automated change."
        delete-branch: true

Additional Resources / Info

The only open source projects that I'm aware of currently using this workflow and the generate-sitemap action are two of my projects (for maintaining the documentation websites), in addition to my personal website. Those two projects are:
