Jesse Portnoy

Export Medium posts as Markdown

First of all, why? In my case, I write on Medium in the (admittedly unlikely) hope that one day I may be popular enough to generate a modest revenue through its partner programme. While we’re on that subject, if you like my posts, it would be nice if you followed me, but don’t feel obligated. However, I also have my own site, packman.io, which is based on Jekyll and has a blog section.

For those who don’t know Jekyll, it is a static site generator written in Ruby and distributed under the open source MIT license. If you need a portfolio, blog or documentation website, I strongly recommend you give it a go. I intend to write a post soon about how I use it to generate my own site. For now, suffice it to say that if a user lands on https://packman.io, I don’t want to direct them away by sending them to read my posts on Medium. Besides, my site supports both light and dark mode, which I think is very important because white backgrounds really hurt my eyes. (By the way, if you’re like me, I’d also recommend Dark Reader for all those inconsiderate sites that do not support dark mode natively.)

Jekyll takes Markdown (MD) files as input and, using a templating mechanism, produces HTML files from them. So I’ve written the small script below to fetch my Medium content and convert it to MD files that Jekyll can work its magic on. Without further ado, here it is, in the hope that it will be of use to you as well:

#!/usr/bin/env ruby
require 'feedjira'
require 'httparty'
require 'nokogiri'
require 'reverse_markdown'
require 'fileutils'

if ARGV.length < 2
    puts "Usage: " + __FILE__ + " <medium user without the '@'> </path/to/output>"
    exit 1
end

medium_user = ARGV[0]
output_dir = ARGV[1]

FileUtils.mkdir_p(output_dir)

xml = HTTParty.get("https://medium.com/feed/@#{medium_user}").body
feed = Feedjira.parse(xml)

feed.entries.each do |e|
    # normalise `title` to arrive at a reasonable filename
    published_date = e.published.strftime("%Y-%m-%d")
    filename = output_dir + '/' + published_date + '-' + e.title.gsub(/[^0-9a-z\s]/i, '').gsub(/\s+/,'-') + '.md'
    # File.exists? was removed in Ruby 3.2; File.exist? works on all versions
    if File.exist?(filename)
        puts "#{filename} already exists. Skipping.."
        next
    end

    content = e.content
    parseHTML = Nokogiri::HTML(content)
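    # `sub!` returns nil when nothing is replaced, and a post may have no image at all, so guard both cases
    first_img = parseHTML.at_xpath("//img")
    img = first_img ? first_img['src'].to_s.sub(/\Ahttps?:/, '') : ''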

    # Medium feed includes the hero image in the `content` field. Since Jekyll and other systems will probably render the hero image separately, remove it from the HTML before generating the Markdown
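    # `[^"]*` accepts any alt text and the non-greedy `.*?` stops at the first caption, so only the hero figure is removed
    content.sub!(/<figure><img\salt="[^"]*"\ssrc="https:\/\/cdn-images-1\.medium\.com\/max\/[0-9]+\/[0-9]\*[0-9a-zA-Z._-]+"\s\/>(<figcaption>.*?<\/figcaption>)?<\/figure>/, '')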

    result = ReverseMarkdown.convert(content).gsub(/\\n/,"\n")
    meta = <<-META
---
layout: post
author: #{e.author}
title: "#{e.title}"
date: #{e.published}
background: #{img}
---

    META

    File.write(filename, meta + result)
end

If you would rather download the script than copy and paste it, it’s also available from GitLab.
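To get a feel for the filenames the script produces, here is a minimal standalone sketch of just the normalisation step. The helper name `post_filename` and the `_posts` output directory are assumptions made for illustration; the two `gsub` calls and the date format mirror the ones in the script.

```ruby
# Hypothetical helper (not part of the original script): builds the
# date-prefixed filename using the same gsub calls as the script's loop.
def post_filename(output_dir, published, title)
  # Drop everything except letters, digits and whitespace,
  # then collapse each run of whitespace into a single hyphen.
  slug = title.gsub(/[^0-9a-z\s]/i, '').gsub(/\s+/, '-')
  File.join(output_dir, "#{published.strftime('%Y-%m-%d')}-#{slug}.md")
end

published = Time.utc(2023, 4, 23, 20, 23, 44)
puts post_filename('_posts', published, 'Capture your users attention with style')
# prints "_posts/2023-04-23-Capture-your-users-attention-with-style.md"
```

Because punctuation is removed rather than replaced, two titles that differ only in punctuation published on the same day would map to the same file, and the script would skip the second one.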

Invoke the script like so:

./medium_to_md.rb <medium user without the '@'> </path/to/output>

It will generate a clean Markdown file that includes the metadata (front matter, in Jekyll terminology) from the original Medium post, for example:

---
layout: post
author: Jesse Portnoy
title: "Capture your users attention with style"
date: 2023-04-23 20:23:44 UTC
background: //cdn-images-1.medium.com/max/1024/1*TlDFO_bhcRPJDMxEceyeyw.png
---
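One thing to watch for: the script interpolates the title directly into a double-quoted YAML string, so a title that itself contains a double quote would most likely produce front matter that Jekyll cannot parse. A possible way around that, sketched below, is to let Ruby's standard `yaml` library do the escaping. The `front_matter` helper and the sample title are made up for illustration and are not part of the original script.

```ruby
require 'yaml'

# Hypothetical helper (not in the original script): serialises the front
# matter with Ruby's standard YAML library so that quotes, colons and
# other special characters in a value are escaped automatically.
def front_matter(author:, title:, date:, background:)
  fields = {
    'layout' => 'post',
    'author' => author,
    'title' => title,
    'date' => date,
    'background' => background
  }
  # Hash#to_yaml already begins with "---\n"; append the closing
  # delimiter and a blank line before the post body.
  fields.to_yaml + "---\n\n"
end

puts front_matter(
  author: 'Jesse Portnoy',
  title: 'Why "dark mode" matters: a rant', # made-up title with awkward characters
  date: '2023-04-23 20:23:44 UTC',
  background: '//cdn-images-1.medium.com/max/1024/1*TlDFO_bhcRPJDMxEceyeyw.png'
)
```

In the script, the result of such a helper could take the place of the `meta` heredoc, with `e.author`, `e.title`, `e.published.to_s` and `img` passed in as arguments.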

May the source be with you,
