Introducing URL to Markdown API
Are you tired of dealing with cluttered web content when all you need is clean, structured Markdown? Meet URL to Markdown API, an open-source FastAPI service that transforms various types of content — including web pages, YouTube videos, PDFs, and documents — into Markdown, making it perfect for AI/ML pipelines, content aggregation, and data processing workflows.
🔗 Try the live demo: markdown.nimk.ir
Example for a YouTube video: https://markdown.nimk.ir/https://www.youtube.com/watch?v=fkZFCyp-iHs
Why Use URL to Markdown API?
Universal Content Support: Convert web articles, YouTube videos, PDFs, Office documents, and more
LLM-Optimized Output: Clean, structured Markdown perfect for AI/ML processing
Rich Media Handling: Extract metadata from images, audio files, and videos
Smart Processing: OCR for images, transcription for audio, and intelligent content extraction
Simple Integration: RESTful API with clear error handling and response codes
🔥 Live Demo
Want to see it in action? Append any URL to the demo address and get Markdown output:
Example 1: Convert a Web Article
https://markdown.nimk.ir/https://ask.library.arizona.edu/faq/407985
This API call converts the article into structured Markdown.
Example 2: Convert a YouTube Video
https://markdown.nimk.ir/https://www.youtube.com/watch?v=dQw4w9WgXcQ
This extracts the video title, description, and metadata in Markdown format.
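The same calls can be made from a script. Here is a minimal sketch using only the Python standard library. It assumes the service returns the Markdown as UTF-8 text in the response body; the names `build_request_url` and `fetch_markdown` are made up for this illustration and are not part of the project.

```python
from urllib.request import Request, urlopen

DEMO_BASE = "https://markdown.nimk.ir"  # hosted demo address from this post


def build_request_url(target_url: str, base: str = DEMO_BASE) -> str:
    """Append the target URL to the service address, as in the examples above."""
    return f"{base.rstrip('/')}/{target_url}"


def fetch_markdown(target_url: str, base: str = DEMO_BASE, timeout: float = 30.0) -> str:
    """Request a conversion and return the response body as text.

    Assumption: the body is UTF-8 encoded Markdown.
    """
    request = Request(
        build_request_url(target_url, base),
        headers={"User-Agent": "markdown-client-sketch"},  # arbitrary label
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (performs a real network request):
#   print(fetch_markdown("https://ask.library.arizona.edu/faq/407985")[:500])
```

The target URL is appended unchanged because that is how the example links above are written; whether the service also accepts a percent-encoded target is not covered here.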
Key Features
Convert various content types:
Web articles & HTML content
YouTube videos
PDF documents
PowerPoint, Word, and Excel files
Images (with OCR and metadata extraction)
Audio files (with transcription & metadata)
CSV, JSON, XML
ZIP files (processes contents)
Automatic URL handling
Clean error messages & proper HTTP status codes
Why Markdown for LLMs?
Markdown is ideal for Large Language Models (LLMs) because:
Structured Format: Clean hierarchical text structure
Consistent Representation: Normalizes different content types
Preserves Semantics: Headers, lists, and formatting are maintained
Reduced Noise: Removes unnecessary styling so the output stays focused on content
Enhanced Accessibility: Easier for NLP models to process
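To make the point about preserved structure concrete, here is a small sketch that splits Markdown into sections at its headings, which is one common way to prepare text for an LLM. This is not part of URL to Markdown API; `split_by_headings` is a hypothetical helper, and it only recognizes `#`-style headings outside fenced code blocks.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(markdown: str) -> list[dict]:
    """Split Markdown into sections, one per heading.

    Text before the first heading becomes a section with level 0.
    """
    sections = []
    current = {"level": 0, "title": "", "lines": []}
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore '#' lines inside code fences
        match = None if in_fence else HEADING.match(line)
        if match:
            sections.append(current)
            current = {
                "level": len(match.group(1)),
                "title": match.group(2).strip(),
                "lines": [],
            }
        else:
            current["lines"].append(line)
    sections.append(current)

    result = [
        {"level": s["level"], "title": s["title"], "body": "\n".join(s["lines"]).strip()}
        for s in sections
    ]
    # Drop empty sections, such as a blank preamble before the first heading.
    return [s for s in result if s["title"] or s["body"]]
```

Each section keeps its heading level, so a pipeline can chunk by top-level sections or rebuild the outline without parsing HTML.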
Installation (Self-Hosted Version)
Standard Installation
git clone https://github.com/iw4p/url-to-markdown.git
cd url-to-markdown
pip install -r requirements.txt
Docker Deployment
Using Docker Compose (Recommended):
docker-compose up -d
This starts the service in detached mode on http://localhost:8000
Using Docker Directly:
docker build -t url-to-markdown .
docker run -d -p 8000:8000 url-to-markdown
API Usage
Start the API server:
uvicorn main:app --reload
Access it at http://localhost:8000
API Endpoints
Convert a URL to Markdown
GET /{url}
Example:
http://localhost:8000/www.youtube.com/watch?v=dQw4w9WgXcQ
Returns the video title, description, and metadata in Markdown format.
Response Format
Article Title
Content
[Article content in Markdown format]
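A client for a self-hosted instance will usually want to tell a successful conversion apart from an error response. The sketch below shows one way to do that with the standard library. It makes no assumption about which status codes the service returns for which failures; it simply hands back whatever code and body arrive. The `convert` function and its `opener` parameter are hypothetical names chosen for this example.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

LOCAL_BASE = "http://localhost:8000"  # local address used in this post


def convert(target_url, base=LOCAL_BASE, opener=urlopen, timeout=30.0):
    """Return (status, text) for a conversion request.

    status is the HTTP status code, or None if no HTTP response was received.
    opener defaults to urllib's urlopen and can be swapped out in tests.
    """
    request_url = f"{base.rstrip('/')}/{target_url}"
    try:
        with opener(request_url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except HTTPError as err:
        # The server answered with an error code; its body may explain why.
        return err.code, err.read().decode("utf-8", errors="replace")
    except URLError as err:
        # No HTTP response at all (for example, the server is not running).
        return None, str(err.reason)


# Usage (requires the API server to be running locally):
#   status, text = convert("www.youtube.com/watch?v=dQw4w9WgXcQ")
#   print(status, text[:200])
```

`HTTPError` is caught before `URLError` because it is a subclass of it; reversing the order would hide the status code.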
Development
Built with:
FastAPI for the web framework
MarkItDown for content conversion
Python 3.12+
Join the Community
Check out the repo, contribute, and star it on GitHub! ⭐
🔗 GitHub Repo: https://Github.com/iw4p/url-to-markdown