Takahiro Sato

Markdown Processing in AI Applications with mq-mcp

AI assistants frequently work with Markdown content from documentation, blogs, and technical articles. Processing this content programmatically means extracting specific elements, transforming structure, and filtering by criteria such as heading level or code block language. The Model Context Protocol (MCP) provides a standardized way for AI systems to access external tools and data sources.

mq-mcp combines the Markdown processing capabilities of the mq tool with MCP, allowing AI assistants to perform content analysis through a standard interface.

Model Context Protocol Integration

MCP enables AI assistants to interact with external tools using JSON-RPC 2.0. mq-mcp implements an MCP server that exposes four primary tools:

  • html_to_markdown: Converts HTML to Markdown with optional query processing
  • extract_markdown: Processes Markdown content using mq queries
  • available_functions: Lists all available mq functions with descriptions
  • available_selectors: Returns available element selectors
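To make the JSON-RPC framing concrete, here is a minimal Python sketch of the message a client might send to invoke one of these tools. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; the `markdown` and `query` argument names are taken from the request examples in this article. The `build_tool_call` helper is hypothetical, and a real client would also need to perform the MCP initialization handshake before calling a tool.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC 2.0 uses the id to pair responses with requests.
_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message.

    Hypothetical helper for illustration; it only builds the message and
    does not talk to a server.
    """
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


message = build_tool_call(
    "extract_markdown",
    {"markdown": "# Title\n\nBody text", "query": ".h1"},
)
print(message)
```

In practice an MCP client library or the AI assistant itself produces these messages, so this is mainly useful for understanding what travels over the wire.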

Configuration Setup

// Claude Desktop
{
  "mcpServers": {
    "mq": {
      "command": "/path/to/mq",
      "args": ["mcp"]
    }
  }
}

# Claude Code
claude mcp add mq-mcp -- mq mcp
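If the Claude Desktop configuration already lists other servers, the mq entry has to be merged in rather than pasted over the file. The following Python sketch shows one way to do that on an in-memory dictionary. The `add_mq_server` helper and the `other` server entry are hypothetical; the location of the configuration file differs by platform and is deliberately left out.

```python
import json


def add_mq_server(config: dict, mq_path: str) -> dict:
    """Return a copy of `config` with an "mq" entry under "mcpServers".

    Hypothetical helper: existing server entries are preserved and the
    input dictionary is not modified.
    """
    servers = dict(config.get("mcpServers", {}))
    servers["mq"] = {"command": mq_path, "args": ["mcp"]}
    return {**config, "mcpServers": servers}


# Example config with a made-up existing server entry.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_mq_server(existing, "/path/to/mq")
print(json.dumps(updated, indent=2))
```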

Content Processing Examples

Extracting Code Blocks

{
  "markdown": "# API Reference\n\n```javascript\nconst api = require('./api');\n```\n\n```python\nimport api\n```",
  "query": ".code(\"javascript\")"
}

Filtering Headings

{
  "markdown": "# Main Title\n## Section A\n### Subsection\n## Section B",
  "query": "select(or(.h1, .h2))"
}
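To make the intent of this query concrete, here is a plain-Python approximation of the same filter. It does not use mq; it only shows which lines a "level 1 or level 2 heading" filter is meant to keep. The `select_headings` helper is hypothetical, handles ATX (`#`) headings only, and ignores edge cases such as `#` lines inside code blocks. The formatting of mq's actual output may differ.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def select_headings(markdown: str, levels: set[int]) -> list[str]:
    """Keep only heading lines whose level is in `levels`."""
    kept = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match and len(match.group(1)) in levels:
            kept.append(line)
    return kept


doc = "# Main Title\n## Section A\n### Subsection\n## Section B"
print(select_headings(doc, {1, 2}))
# ['# Main Title', '## Section A', '## Section B']
```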

Link Extraction

{
  "markdown": "See [API docs](./api.md) and [external guide](https://example.com/guide)",
  "query": ".link | select(startsWith(\"./\"))"
}
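The effect of this filter, keeping relative links and dropping external ones, can be approximated in plain Python. This sketch does not use mq, and the `relative_links` helper is hypothetical. The regular expression covers simple inline links only; it would also match image syntax and does not handle link titles or reference-style links.

```python
import re

# Simple inline link: [text](url), with no spaces or ')' inside the URL.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def relative_links(markdown: str) -> list[tuple[str, str]]:
    """Return (text, url) pairs for inline links whose URL starts with './'."""
    return [
        (text, url)
        for text, url in LINK.findall(markdown)
        if url.startswith("./")
    ]


doc = "See [API docs](./api.md) and [external guide](https://example.com/guide)"
print(relative_links(doc))
# [('API docs', './api.md')]
```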

Advanced Processing

// Transform text
{
  "markdown": "# getting started\n## basic usage",
  "query": ".h | upcase()"
}

// Filter with conditions
{
  "markdown": "```bash\nnpm install\n```\n\n```bash\nyarn add\n```",
  "query": ".code | select(contains(\"npm\"))"
}

HTML to Markdown Conversion

The html_to_markdown tool converts HTML to Markdown and can optionally apply an mq query to the result. This lets AI systems extract structured content from web pages, documentation sites, and other HTML sources.

Basic HTML Conversion

Convert HTML to clean Markdown format:

{
  "html": "<div><h1>API Documentation</h1><p>This describes the REST API endpoints.</p><ul><li>GET /users</li><li>POST /users</li></ul></div>"
}

Output:

# API Documentation

This describes the REST API endpoints.

- GET /users
- POST /users

Complex HTML Structure Processing

Handle nested HTML with tables and formatted content:

{
  "html": "<article><header><h1>Database Schema</h1></header><table><tr><th>Column</th><th>Type</th></tr><tr><td>id</td><td>INTEGER</td></tr><tr><td>name</td><td>VARCHAR</td></tr></table><footer><p>Updated: 2024</p></footer></article>",
  "query": "select(or(.h1, .table))"
}

Output:

# Database Schema

| Column | Type    |
| ------ | ------- |
| id     | INTEGER |
| name   | VARCHAR |

Web Scraping Integration

Extract content from HTML that has already been fetched from a web page:

{
  "html": "<div class='documentation'><h2>Installation</h2><pre><code>npm install package</code></pre><h2>Usage</h2><pre><code>const pkg = require('package');</code></pre></div>",
  "query": "nodes | .code | first()"
}

Output:

npm install package

This returns only the first code block (the installation command) and drops the usage example.

HTML Element Mapping

The conversion process maps HTML elements to Markdown equivalents:

| HTML Element | Markdown Output | Notes                               |
| ------------ | --------------- | ----------------------------------- |
| <h1> - <h6>  | # - ######      | Header levels preserved             |
| <p>          | Text blocks     | Paragraph spacing maintained        |
| <ul>, <ol>   | -, 1.           | List structure preserved            |
| <table>      | Pipe tables     | Column alignment detected           |
| <code>       | `code`          | Inline code formatting              |
| <pre><code>  | code blocks     | Code blocks with language detection |
| <blockquote> | >               | Quote block formatting              |
| <a>          | [text](url)     | Links with titles preserved         |
| <img>        | ![alt](src)     | Images with alt text                |
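To illustrate how an element mapping like this works, here is a small Python sketch built on the standard library's `html.parser`. It covers only headings, paragraphs, and unordered list items, and it is not how mq implements the conversion. The `MiniConverter` class and `to_markdown` function are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BLOCK_TAGS = HEADINGS | {"p", "li"}


class MiniConverter(HTMLParser):
    """Convert a small subset of HTML (h1-h6, p, li) to Markdown blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []    # finished (is_list_item, markdown_text) pairs
        self.prefix = None  # Markdown prefix of the currently open block
        self.text = []      # text fragments collected for the open block

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.prefix = "#" * int(tag[1]) + " "
        elif tag == "p":
            self.prefix = ""
        elif tag == "li":
            self.prefix = "- "

    def handle_data(self, data):
        if self.prefix is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self.prefix is not None:
            content = "".join(self.text).strip()
            self.blocks.append((self.prefix == "- ", self.prefix + content))
            self.prefix, self.text = None, []


def to_markdown(html: str) -> str:
    parser = MiniConverter()
    parser.feed(html)
    parser.close()
    out = []
    previous_was_item = False
    for is_item, block in parser.blocks:
        # Consecutive list items stay on adjacent lines; other blocks
        # are separated by a blank line.
        if out and is_item and previous_was_item:
            out[-1] += "\n" + block
        else:
            out.append(block)
        previous_was_item = is_item
    return "\n\n".join(out)


html = (
    "<div><h1>API Documentation</h1>"
    "<p>This describes the REST API endpoints.</p>"
    "<ul><li>GET /users</li><li>POST /users</li></ul></div>"
)
print(to_markdown(html))
```

Run on the HTML from the basic conversion example, this prints the same heading, paragraph, and two-item list shown in that example's output. Tables, links, images, and nested structures would each need their own handling.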

Processing Pipeline

The HTML to Markdown conversion follows this sequence:

  1. HTML Parsing: Build a DOM tree from the input HTML
  2. Structure Analysis: Identify semantic elements and their nesting
  3. Markdown Generation: Convert the elements to Markdown syntax
  4. Query Processing: Apply the mq query, if one was given, to the resulting Markdown
  5. Output Formatting: Return the filtered and formatted content

Installation

Install the mq CLI tool, which includes the MCP server:

# Using Cargo
cargo install --git https://github.com/harehare/mq.git mq-cli

# Using Homebrew
brew install harehare/tap/mq

# Direct binary download
curl -L https://github.com/harehare/mq/releases/latest/download/mq-linux-x86_64 -o mq
chmod +x mq

Integration Benefits

  • Structured Processing: jq-like queries enable precise content extraction
  • Format Flexibility: Handles both HTML and Markdown inputs
  • Discovery Features: Built-in function and selector documentation
