I’ve been playing around with Pelican lately, using it to build my new Nimble Autonomy, LLC site (more on that soon).
So far, I like Pelican as a static site generator. It seems to strike a reasonable balance between generality and power. I previously used Hugo to build the Unit Circle Rekkids site. I found it reasonably decent, but not life-changing. That site’s content doesn’t change that often, so since it was built, I have only had to make the occasional tweak. This new site will be changing a bit more often.
To get some content on the new site, I wanted to republish some posts from this blog. Using the instructions for WordPress Export and pelican-import, I was able to generate some Markdown from my WordPress posts, but the result was a bit underwhelming.
There was a lot of this kind of gunk in the markdown:
<!-- wp:paragraph -->`{=html}While I am an experienced video-conferencer and a reasonably experienced presenter, presenting to a remote audience is still something I am learning how to do. Having just given a talk this morning, I did want to share some things that are working well for me at the moment.`<!-- /wp:paragraph -->`{=html}`<!-- wp:heading -->`{=html}The Tools **---------** `<!-- /wp:heading -->`{=html}`<!-- wp:paragraph -->`{=html}
Only a few of my images were even referenced. I quickly realized that if I was going to try to move more than a handful of articles over, I was going to be spending a lot of time hand-editing the generated markdown.
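To make the problem concrete, here is a minimal sketch (not the full cleanup script) of how those leftover block comments could be stripped. It assumes the gunk always looks like the sample above: a `<!-- wp:... -->` or `<!-- /wp:... -->` comment, optionally wrapped in backticks and followed by `{=html}`. The names `WP_COMMENT` and `strip_wp_comments` are made up for this illustration, and it does nothing about mangled headings or missing images.

```python
import re

# Hypothetical pattern for this sketch: a WordPress block comment such as
# <!-- wp:paragraph --> or <!-- /wp:heading -->, plus the optional
# surrounding backticks and the {=html} marker seen in the converted output.
WP_COMMENT = re.compile(r"`?<!--\s*/?wp:.*?-->`?(?:\{=html\})?", re.DOTALL)


def strip_wp_comments(markdown: str) -> str:
    """Remove WordPress block comments, leaving a blank line in their place."""
    # Each comment marks a block boundary, so replace it with a paragraph break.
    cleaned = WP_COMMENT.sub("\n\n", markdown)
    # Adjacent comments produce runs of blank lines; collapse them to one.
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)
    return cleaned.strip()


if __name__ == "__main__":
    sample = (
        "`<!-- wp:paragraph -->`{=html}Hello world."
        "`<!-- /wp:paragraph -->`{=html}"
        "`<!-- wp:heading -->`{=html}The Tools"
        "`<!-- /wp:heading -->`{=html}"
    )
    print(strip_wp_comments(sample))
```

Because the backticks are optional in the pattern, a comment that sits directly next to real inline code could lose a backtick, so treat this as a starting point rather than a finished tool.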
This was an obvious problem that automation could fix. As I was using a Python-based static site generator, I decided to use Python to do my cleanup. I’m sharing the code below as it may help others who are trying to solve the same problem. At some point, I might try to create a pull request for it with Pelican, but right now I am just trying to move forward on other things.
It isn’t the best or cleanest Python I’ve written. It was done quickly, with a lot of iteration to catch all the corner cases, and it could be more Pythonic. It is also very opinionated about the Markdown that it creates.
At some point, I may clean it up, but really I’m supplying it here because I have to believe that other people have hit the same problem and I want to save those folks some time.
Feel free to fork and improve!