Executive Summary
TL;DR: Migrating Medium articles to a static Gatsby site provides full ownership and significant performance boosts, addressing the risk of content being on "rented land". The process involves exporting HTML, converting it to Markdown with a Python script, and configuring Gatsby to programmatically generate blog post pages.
Key Takeaways
- Medium articles can be exported as HTML files via account settings, providing the raw content for migration.
- A custom Python script leveraging beautifulsoup4 and markdownify is crucial for converting exported HTML into Gatsby-compatible Markdown files with YAML frontmatter.
- Gatsby's gatsby-source-filesystem and gatsby-transformer-remark plugins, combined with programmatic page creation in gatsby-node.js, enable dynamic rendering of migrated Markdown content.
Migrate Medium Articles to a Static Gatsby Site
Hey there, Darian here. A few years back, I had a realization while staring at my Medium analytics. I was getting decent traffic, but I was building my content library on rented land. If Medium changed its algorithm or paywall, my work was at their mercy. That's when I decided to migrate everything to my own static Gatsby site. The performance boost was immediate, but the real win was a sense of ownership. I was back in control.
This guide is for busy engineers who want that same control. I'll cut through the noise and give you the exact, repeatable workflow I use to pull content from Medium and get it into a blazing-fast Gatsby site.
Prerequisites
Before we dive in, make sure you have the following ready. We're aiming for efficiency, so having this squared away first is key.
- A Medium account with articles you want to export.
- Node.js, npm, and the Gatsby CLI installed on your machine.
- A basic "hello world" Gatsby project. The official gatsby-starter-blog is a perfect starting point.
- Python 3 installed. We'll use it for a small but powerful conversion script.
The Step-by-Step Guide
Step 1: Export Your Content from Medium
First things first, we need to get our data out of Medium. Thankfully, they make this pretty straightforward.
- Log in to your Medium account, go to **Settings > Account**.
- Look for the "Download your information" section and click the "Download .zip" button.
- You'll get an email with a link to download your archive. Grab it, and unzip it on your local machine.
Inside, you'll find a posts directory containing a collection of .html files. These are your articles, but we need them in Markdown format for Gatsby to understand them.
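If you would rather script the unzip step, here is a minimal sketch using only the standard library. The function name extract_posts and both paths in the usage note are placeholders of mine, and it assumes the archive has the posts directory at its top level, as described above.

```python
import zipfile
from pathlib import Path


def extract_posts(zip_path, dest_dir):
    """Unzip a Medium export and return the HTML files under posts/, sorted by name."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Only the exported articles are of interest here, not the rest of the archive.
    return sorted((dest / "posts").glob("*.html"))
```

Calling extract_posts("medium-export.zip", "medium-export") would leave the articles in medium-export/posts, which matches the source_dir used by the conversion script in Step 2.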
Step 2: Convert HTML to Markdown with a Python Script
This is where the magic happens. We're going to use a Python script to chew through those HTML files and spit out clean, frontmatter-equipped Markdown files.
First, you'll need a couple of Python libraries. I'll skip the standard virtualenv setup since you likely have your own workflow for that. Just make sure you install beautifulsoup4 and markdownify using your package manager.
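One thing that trips people up is that the name you install is not always the name you import: beautifulsoup4 is imported as bs4. A small, hedged sketch for checking that both libraries are visible to the interpreter you plan to run the script with (missing_packages is a helper name I made up):

```python
import importlib.util

# Maps the name you install to the name you import.
REQUIRED = {"beautifulsoup4": "bs4", "markdownify": "markdownify"}


def missing_packages(required=REQUIRED):
    """Return the install names whose import module cannot be found."""
    return [
        install_name
        for install_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]
```

If missing_packages() returns a non-empty list, install those names (for example with pip install beautifulsoup4 markdownify) before moving on.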
Now, create a Python script in your project's root directory. Let's call it convert.py. This script will:
- Read all .html files from your unzipped Medium posts directory.
- Extract the title and publication date using BeautifulSoup.
- Convert the main article content to Markdown.
- Write a new .md file in your Gatsby src/content/blog directory (or wherever you store content), complete with YAML frontmatter.
Here's the script I use:
import os
from datetime import datetime

from bs4 import BeautifulSoup
from markdownify import markdownify as md

# --- Configuration ---
# Path to the 'posts' directory from your Medium export
source_dir = 'medium-export/posts'
# Path where your Gatsby blog posts will live
target_dir = 'my-gatsby-site/src/content/blog'

# --- Main Logic ---
# Create the target directory if it does not exist yet
os.makedirs(target_dir, exist_ok=True)

for filename in sorted(os.listdir(source_dir)):
    if not filename.endswith('.html'):
        continue

    filepath = os.path.join(source_dir, filename)
    print(f"Processing {filename}...")

    with open(filepath, 'r', encoding='utf-8') as f:
        soup = BeautifulSoup(f, 'html.parser')

    # Get the main content body
    article_body = soup.find('article')
    if not article_body:
        continue  # Skip files without an article tag

    # Extract metadata
    h1 = soup.find('h1')
    title = h1.get_text(strip=True) if h1 else 'Untitled'

    # Medium often uses a 'time' tag for the publication date;
    # fall back to the current time when it is missing
    time_tag = soup.find('time')
    if time_tag and time_tag.get('datetime'):
        pub_date_str = time_tag['datetime']
    else:
        pub_date_str = datetime.now().isoformat()
    pub_date = datetime.fromisoformat(pub_date_str.replace('Z', '+00:00'))
    date_str = pub_date.strftime('%Y-%m-%d')

    # Convert article body HTML to Markdown
    markdown_content = md(str(article_body))

    # Create frontmatter (double quotes in the title become single quotes)
    safe_title = title.replace('"', "'")
    frontmatter = (
        '---\n'
        f'title: "{safe_title}"\n'
        f'date: "{date_str}"\n'
        'description: ""\n'
        '---\n'
    )

    # Create a URL-friendly slug from the title
    slug = title.lower().replace(' ', '-').replace(':', '').replace('?', '').replace('/', '-')[:50]
    output_filename = f"{date_str}---{slug}.md"
    output_path = os.path.join(target_dir, output_filename)

    with open(output_path, 'w', encoding='utf-8') as f:
        f.write(frontmatter + markdown_content)
    print(f" -> Created {output_path}")

print("Conversion complete.")
Run this script from your terminal: python3 convert.py. It will populate your Gatsby content directory with Markdown files that carry the frontmatter Gatsby expects.
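The frontmatter in the script is built with plain string formatting, so a title containing a backslash or other awkward characters can still produce invalid YAML. One possible hardening, sketched below, is to let json.dumps do the quoting, on the assumption that your YAML parser accepts JSON-style double-quoted strings (YAML 1.2 is designed as a superset of JSON). The helper name build_frontmatter is hypothetical and not part of the original script.

```python
import json


def build_frontmatter(title, date_str, description=""):
    """Return a YAML frontmatter block with safely quoted string values."""
    lines = [
        "---",
        # json.dumps escapes embedded quotes and backslashes for us.
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"date: {json.dumps(date_str)}",
        f"description: {json.dumps(description, ensure_ascii=False)}",
        "---",
        "",
    ]
    return "\n".join(lines)
```

With this approach the original title survives intact, including its double quotes, instead of having them swapped for single quotes.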
Pro Tip: In my production setups, I make the slug generation more robust. I use a library like python-slugify to handle special characters and ensure every slug is unique. For this tutorial, the simple string replacement works fine.
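To give a feel for what "more robust" can mean, here is a standard-library sketch of the same idea. It is a stand-in for illustration, not python-slugify's API: the function names slugify and unique_slug and the -2, -3 suffix scheme are my own assumptions.

```python
import re
import unicodedata


def slugify(title, max_length=50):
    """Lowercase, strip accents, and collapse non-alphanumerics into hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return slug[:max_length].rstrip("-") or "untitled"


def unique_slug(title, seen):
    """Return a slug not yet in `seen`, adding -2, -3, ... on collisions."""
    base = slugify(title)
    candidate, counter = base, 2
    while candidate in seen:
        candidate = f"{base}-{counter}"
        counter += 1
    seen.add(candidate)
    return candidate
```

Keep one seen set for the whole conversion run and pass it to every unique_slug call, so two posts with the same title never overwrite each other's output file.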
Step 3: Configure Gatsby to Read Markdown
Now that we have the content, we need to tell Gatsby how to find and parse it. This involves tweaking two files: gatsby-config.js and gatsby-node.js.
First, make sure you have the necessary plugins installed via npm: gatsby-source-filesystem and gatsby-transformer-remark.
Next, open gatsby-config.js and configure them. You're telling Gatsby, "Hey, look in this directory for my content, and when you find Markdown files, use gatsby-transformer-remark to parse them."
module.exports = {
  plugins: [
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `blog`,
        path: `${__dirname}/src/content/blog`, // Point this to your content folder
      },
    },
    `gatsby-transformer-remark`,
    // ... other plugins
  ],
}
Step 4: Create Blog Post Pages Programmatically
We don't want to create a React component for every single blog post. That's not scalable. Instead, we'll tell Gatsby to do it for us in gatsby-node.js.
This file is the engine room. It uses GraphQL to query for all our Markdown files and then calls the createPage action for each one, using a template we'll build next.
const path = require(`path`)
const { createFilePath } = require(`gatsby-source-filesystem`)

exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions
  const blogPostTemplate = path.resolve(`./src/templates/blog-post.js`)

  const result = await graphql(`
    query {
      allMarkdownRemark {
        nodes {
          id
          fields {
            slug
          }
        }
      }
    }
  `)

  if (result.errors) {
    throw result.errors
  }

  const posts = result.data.allMarkdownRemark.nodes

  posts.forEach((post) => {
    createPage({
      path: post.fields.slug,
      component: blogPostTemplate,
      context: {
        id: post.id,
      },
    })
  })
}

exports.onCreateNode = ({ node, actions, getNode }) => {
  const { createNodeField } = actions

  if (node.internal.type === `MarkdownRemark`) {
    const value = createFilePath({ node, getNode })
    createNodeField({
      name: `slug`,
      node,
      value,
    })
  }
}
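If you want to know ahead of time which URL each migrated post will get, here is a rough Python approximation of what createFilePath produces for a file path relative to the content folder. The function name gatsby_slug is mine, and the logic reflects my understanding of the helper's default behavior (leading and trailing slash, index files collapsing to their directory), so treat it as a sketch and confirm against the routes your development server actually serves.

```python
from pathlib import PurePosixPath


def gatsby_slug(relative_path):
    """Approximate createFilePath: '/dir/name/', with index files mapped to their directory."""
    path = PurePosixPath(relative_path)
    parts = list(path.parent.parts)
    if path.stem != "index":
        parts.append(path.stem)
    return "/" + "".join(f"{part}/" for part in parts)
```

For a file written by the conversion script, such as 2021-03-04---hello-world.md, this gives /2021-03-04---hello-world/, which is handy if you plan to set up redirects from your old Medium URLs.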
Finally, create the template file at src/templates/blog-post.js. This is the React component that will render each post. Gatsby passes the Markdown data it queried into this component's props.
import React from "react"
import { graphql } from "gatsby"

export default function BlogPostTemplate({ data }) {
  const post = data.markdownRemark
  return (
    <div>
      <h1>{post.frontmatter.title}</h1>
      <h4>{post.frontmatter.date}</h4>
      <div dangerouslySetInnerHTML={{ __html: post.html }} />
    </div>
  )
}

export const pageQuery = graphql`
  query($id: String!) {
    markdownRemark(id: { eq: $id }) {
      html
      frontmatter {
        date(formatString: "MMMM DD, YYYY")
        title
      }
    }
  }
`
Restart your Gatsby development server, and you should see your Medium articles rendered beautifully on your new site.
Common Pitfalls (Where I Usually Mess Up)
- Image Paths: This is the big one. The converted Markdown will still point to Medium's CDN images (miro.medium.com/…). For true ownership, you need to download these images and host them yourself. I usually write a follow-up script that parses the Markdown files, downloads each image, saves it locally, and updates the path. The gatsby-remark-images plugin is a lifesaver here.
- Code Gists: Medium embeds GitHub Gists for code, and these do not convert well. They become simple links. You will have to go through your posts and manually replace them with standard Markdown triple-backtick code fences. It's tedious but necessary for clean code blocks.
- YAML Frontmatter Errors: A misplaced colon or an unquoted special character in the frontmatter can break the entire build. Validate your generated .md files if Gatsby throws a cryptic GraphQL error.
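For the image problem in particular, a follow-up script along these lines could be a starting point. It is a sketch under assumptions, not my exact production script: the function names, the ./images folder, and the regular expression (which only matches plain ![alt](url) images hosted on a medium.com domain, with no title attribute) are all illustrative choices.

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# Matches Markdown images whose URL is hosted on a medium.com domain.
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\((https?://[^\s)]*medium\.com/[^\s)]+)\)")


def local_name(url):
    """Derive a filesystem-safe file name from the last path segment of the URL."""
    name = Path(urlparse(url).path).name
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def rewrite_images(markdown, image_dir="./images"):
    """Return (new_markdown, urls) with matching image links pointed at image_dir."""
    urls = []

    def replace(match):
        alt, url = match.group(1), match.group(2)
        urls.append(url)
        return f"![{alt}]({image_dir}/{local_name(url)})"

    return IMAGE_PATTERN.sub(replace, markdown), urls


def download_images(urls, target_dir):
    """Fetch each URL into target_dir (requires network access)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for url in urls:
        urllib.request.urlretrieve(url, str(target / local_name(url)))
```

The idea is to run rewrite_images over each generated .md file, write the returned text back, and pass the collected URLs to download_images with a folder that sits next to the Markdown files, so the relative paths resolve.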
Conclusion
And there you have it. You've successfully liberated your content from a third-party platform and moved it to a performant, fully-owned static site. From here, the possibilities are endless. You can optimize images, improve SEO, and customize the design to your heart's content. It's a bit of up-front work, but the long-term payoff in control and performance is well worth it. Happy coding.
Read the original article on TechResolve.blog