The Problem
Google recently released a Workspace CLI with MCP support — so technically, your AI assistant can read a Google Doc now. But what it gets back is raw API JSON: a 500-line nested tree of StructuralElement objects, ParagraphElement arrays, and TextRun objects with style metadata buried three levels deep.
Your LLM can parse it. But it shouldn't have to.
Why Markdown Matters for LLMs
Think about what markdown gives an LLM that raw JSON doesn't:
- `# Heading` tells it "this is a new section", with no need to infer structure from nested objects
- `**bold**` signals emphasis; in JSON, that's a `textStyle.bold: true` property buried inside a `TextRun` object
- `| tables |` stay readable inline; in JSON, that's a matrix of `TableCell` arrays inside `TableRow` arrays inside a `Table` object
- `- lists` just work; in JSON, you're parsing `bullet.nestingLevel` and `listProperties.nestingLevels[n].glyphType`
An LLM can read markdown the way a human reads a document. JSON forces it to reconstruct the document from parts.
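To make that reconstruction concrete, here is a minimal sketch of turning one Docs API paragraph into a line of markdown. The field names (`textRun`, `textStyle`, `namedStyleType`, `bullet.nestingLevel`) come from the public Docs API; the helper names `renderRun` and `paragraphToMarkdown` are hypothetical, and this is not necessarily how gdocs-to-md-mcp does it internally.

```typescript
// Minimal subset of the Google Docs API paragraph shape.
// The interfaces and function names are illustrative assumptions.
interface TextStyle { bold?: boolean; italic?: boolean; link?: { url?: string } }
interface ParagraphElement { textRun?: { content?: string; textStyle?: TextStyle } }
interface Paragraph {
  elements: ParagraphElement[];
  paragraphStyle?: { namedStyleType?: string };
  bullet?: { listId?: string; nestingLevel?: number };
}

// Render one text run, wrapping it in markdown emphasis and link syntax.
function renderRun(el: ParagraphElement): string {
  const raw = el.textRun?.content ?? "";
  const text = raw.replace(/\n$/, ""); // the API ends each paragraph with "\n"
  if (text.trim() === "") return text; // never wrap pure whitespace in markers
  const style = el.textRun?.textStyle ?? {};
  let out = text;
  if (style.italic) out = `*${out}*`;
  if (style.bold) out = `**${out}**`;
  if (style.link?.url) out = `[${out}](${style.link.url})`;
  return out;
}

// Pick the line prefix from the paragraph style: heading, list item, or plain text.
function paragraphToMarkdown(p: Paragraph): string {
  const body = p.elements.map(renderRun).join("");
  const heading = /^HEADING_([1-6])$/.exec(p.paragraphStyle?.namedStyleType ?? "");
  if (heading) return `${"#".repeat(Number(heading[1]))} ${body}`;
  if (p.bullet) return `${"  ".repeat(p.bullet.nestingLevel ?? 0)}- ${body}`;
  return body;
}
```

With this sketch, a paragraph styled `HEADING_2` whose single run is `"Decisions\n"` comes out as `## Decisions`. A production converter would also need to handle numbered lists, emphasis around runs with trailing spaces, and escaping of literal markdown characters, all of which are skipped here.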
And the research backs this up. The arXiv study "Does Prompt Formatting Have Any Impact on LLM Performance?" found up to 40% performance variance in LLM output depending on whether the input was plain text, Markdown, JSON, or YAML. The "Which Table Format Do LLMs Understand Best?" benchmark, which tested 11 table formats, showed Markdown-KV hitting 60.7% accuracy, 16 points ahead of CSV and JSON alternatives. Markdown is also 10-15% more token-efficient than JSON for the same content, which adds up fast at scale.
Even the vendors agree: Anthropic recommends XML tags with markdown/plain text for document input. OpenAI recommends markdown formatting for code-related tasks. Google Gemini recommends markdown headings for structuring prompts.
The Solution
gdocs-to-md-mcp is a local MCP server that fetches Google Docs and converts them to clean markdown. Headings, bold, italic, tables, lists, and links are all preserved.
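Tables are the element where the JSON-to-markdown gap is widest, so here is a rough sketch of that one conversion. It assumes the Docs API table shape (`tableRows` containing `tableCells`, each holding paragraph content) and treats the first row as the header, which is a simplification. The function names are hypothetical and not taken from the project's source.

```typescript
// Minimal subset of the Google Docs API table shape (illustrative only).
interface Run { textRun?: { content?: string } }
interface CellBlock { paragraph?: { elements: Run[] } }
interface Table { tableRows: { tableCells: { content: CellBlock[] }[] }[] }

// Flatten a cell's paragraphs into one line and escape literal pipes.
function cellText(blocks: CellBlock[]): string {
  return blocks
    .map((b) => (b.paragraph?.elements ?? []).map((e) => e.textRun?.content ?? "").join(""))
    .join(" ")
    .replace(/\s+/g, " ")
    .trim()
    .replace(/\|/g, "\\|");
}

// Emit a pipe table, assuming the first row is the header row.
function tableToMarkdown(table: Table): string {
  const rows = table.tableRows.map((r) => r.tableCells.map((c) => cellText(c.content)));
  const header = rows[0];
  if (!header) return "";
  const line = (cells: string[]): string => `| ${cells.join(" | ")} |`;
  const divider = line(header.map(() => "---"));
  return [line(header), divider, ...rows.slice(1).map(line)].join("\n");
}
```

Merged cells and nested tables are ignored in this sketch; pipe tables cannot express them directly, so a real converter has to pick a fallback.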
One command to set up:
npx gdocs-to-md-mcp setup
The interactive wizard walks you through everything — GCP project selection, OAuth credentials, API enablement, Google sign-in — with clickable links that take you to the exact right page. No hunting through Cloud Console. It even auto-configures Claude Code when it's done.
How It Looks
Once set up, just paste a Google Docs URL:
"Read this doc and summarize the key decisions: https://docs.google.com/document/d/1abc.../edit"
Claude calls read_google_doc, gets clean structured markdown, and actually understands the document — sections, emphasis, tables, all of it. Compare that to the raw \n\n-delimited text blob from the default Google Drive integration. Night and day.
Three tools, nothing more:
- `read_google_doc`: URL or doc ID in, markdown out
- `search_google_docs`: find docs by keyword
- `list_recent_docs`: see what's been updated recently
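Accepting "URL or doc ID" means the input has to be normalized before the Docs API is called. One way to do that, as a sketch: pull the ID out of the `/document/d/<id>` path segment, and otherwise accept the input only if it looks like a bare ID. The function name `extractDocId` and the 20-character minimum are assumptions made for illustration.

```typescript
// Hypothetical helper: accept either a full Google Docs URL or a bare document ID.
function extractDocId(input: string): string | null {
  const trimmed = input.trim();
  // Docs URLs carry the ID in the path: /document/d/<id>/edit
  const fromUrl = /\/document\/d\/([A-Za-z0-9_-]+)/.exec(trimmed);
  if (fromUrl) return fromUrl[1] ?? null;
  // Bare IDs use the same URL-safe alphabet; the length floor is a heuristic.
  return /^[A-Za-z0-9_-]{20,}$/.test(trimmed) ? trimmed : null;
}
```

Returning `null` instead of throwing lets the tool reply with a readable "that doesn't look like a Google Doc" message rather than a stack trace.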
No 80-tool mega-server. No Gmail. No Calendar. Just Google Docs to markdown, done right.
"But doesn't Google have a Workspace CLI?"
Yes — Google recently released an official Workspace CLI that covers Drive, Docs, Sheets, and more. Their philosophy is "every response is structured JSON," which makes sense for programmatic tasks like updating spreadsheet cells or creating calendar events.
But for reading a document — which is a comprehension task, not an API-calling task — two things set gdocs-to-md-mcp apart:
- Markdown, not JSON. Their CLI returns raw API JSON, a 500-line nested tree of `StructuralElement` objects. gdocs-to-md-mcp converts that into clean markdown with headings, tables, lists, and formatting preserved. LLMs can actually read it.
- Setup wizard. One command (`npx gdocs-to-md-mcp setup`) walks you through OAuth credentials, API enablement, and MCP client configuration with clickable links. No Cloud Console scavenger hunts.
Different tools for different jobs. JSON when the agent needs to act on structured data. Markdown when it needs to read and understand content.
Try It
npx gdocs-to-md-mcp setup
Works with Claude Code, Cursor, Windsurf, or any MCP client.
Install globally with npm install -g gdocs-to-md-mcp or just run it directly with npx — your call.
GitHub: github.com/D4G4/gdocs-to-md-mcp
npm: npmjs.com/package/gdocs-to-md-mcp
Sources: Why Markdown > JSON for LLM Document Comprehension
- Does Prompt Formatting Have Any Impact on LLM Performance? — Up to 40% performance variance across formats
- MDEval: Evaluating Markdown Awareness in LLMs — 20K-instance benchmark on LLM markdown comprehension
- Which Table Format Do LLMs Understand Best? — Markdown-KV at 60.7% accuracy, 16 points ahead of CSV/JSON
- Markdown is 15% more token efficient than JSON — Real-world token comparison on identical content
- LLMs are bad at returning code in JSON — All tested models performed significantly worse with JSON
- Anthropic — Claude Prompting Best Practices — Recommends XML + markdown/plain text for document input
- OpenAI — Prompt Engineering Guide — Recommends markdown formatting for code tasks
- Google Gemini — Prompt Design Strategies — Recommends markdown headings for structuring prompts
