Here's a quick tool that landed in my queue recently: microsoft/markitdown
It's a Python CLI that converts PDFs, Word docs, PowerPoint, and Excel files to Markdown. Not groundbreaking, but if you've ever had to process a folder of legacy documentation for a static site, you know the value of not doing it manually.
Two things I found useful:
Batch conversion with piping
markitdown document.docx -o document.md
As far as I can tell, the CLI takes one input file per invocation (the filename is a positional argument, and -o names the output file), so batch conversion means looping over files with standard Unix tools. Create ./md first:
find ./legacy-docs -name '*.docx' -print0 | xargs -0 -I{} sh -c 'markitdown "$1" -o "./md/$(basename "$1" .docx).md"' _ {}
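If the shell loop gets unwieldy, the same batch job can be sketched in Python. This is a rough sketch under assumptions, not a built-in batch mode: `convert_tree` and `markitdown_to_text` are names I made up, and the default converter assumes the package's Python API (`MarkItDown().convert(path).text_content`) works as its README describes. The converter is passed in as a plain callable so you can swap it out or test the directory walk without markitdown installed.

```python
from pathlib import Path
from typing import Callable, Iterable


def markitdown_to_text(path: Path) -> str:
    """Convert one file via markitdown's Python API (imported lazily)."""
    # Assumption: the package is installed and exposes MarkItDown.
    from markitdown import MarkItDown

    return MarkItDown().convert(str(path)).text_content


def convert_tree(
    src: Path,
    dst: Path,
    convert: Callable[[Path], str] = markitdown_to_text,
    suffixes: Iterable[str] = (".docx", ".pdf", ".pptx", ".xlsx"),
) -> list[Path]:
    """Mirror src under dst, writing one .md file per matching document."""
    wanted = {s.lower() for s in suffixes}
    written = []
    for path in sorted(src.rglob("*")):
        # Skip directories and file types we did not ask for.
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        # Keep the relative layout, but swap the extension for .md.
        out = dst / path.relative_to(src).with_suffix(".md")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text(convert(path), encoding="utf-8")
        written.append(out)
    return written
```

Calling `convert_tree(Path("legacy-docs"), Path("md"))` would walk the tree and return the list of files it wrote. One caveat of this sketch: `report.docx` and `report.pdf` in the same folder both map to `report.md`, so the later one overwrites the earlier.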
stdout output for scripting
markitdown document.pdf
With no output file given, it writes the Markdown to stdout, which makes it easy to pipe into other text-processing tools or redirect to a filename derived from the input.
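Here's a small sketch of the "filename based on the input" idea in Python. `md_name`, `capture_to_file`, and `convert_one` are hypothetical helpers of mine. The only thing assumed about the tool is what's shown above: running `markitdown <file>` prints Markdown to stdout, and the executable is on your PATH.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def md_name(source: Path, out_dir: Path) -> Path:
    """Map e.g. reports/q3.pdf to <out_dir>/q3.md."""
    return out_dir / (source.stem + ".md")


def capture_to_file(command: Sequence[str], target: Path) -> Path:
    """Run a command and write whatever it prints on stdout to target."""
    # check=True raises CalledProcessError if the command exits non-zero.
    result = subprocess.run(
        command, capture_output=True, encoding="utf-8", check=True
    )
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(result.stdout, encoding="utf-8")
    return target


def convert_one(source: Path, out_dir: Path) -> Path:
    """Convert one document by capturing the CLI's stdout."""
    # Assumption: the markitdown executable is installed and on PATH.
    return capture_to_file(["markitdown", str(source)], md_name(source, out_dir))
```

Keeping the subprocess call separate from the naming logic means you can slot a post-processing step (stripping headers, rewriting links) between capturing stdout and writing the file.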
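It's on PyPI (pip install markitdown), so it'll drop into a CI pipeline without much friction. Recent releases appear to ship the format-specific converters as optional extras, so you may need pip install 'markitdown[all]' to handle PDF and Office files. If you've got a documentation migration on your plate and you're tired of manual conversions, it's worth a look.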