New: Patents to Markdown for RAG — turn US/EP/WO patents into clean, chunked Markdown for RAG

What it does

Patents to Markdown for RAG pulls patents from Google Patents (US, EP, and WO families) and converts them into clean, chunked Markdown ready for retrieval-augmented generation and LLM pipelines. It extracts the abstract, claims, and full description, then splits the text into chunks sized by token count, so you can embed and index it without cleaning up HTML or PDFs yourself.
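To give a feel for what chunking by token count involves, here is a minimal sketch. It is not the tool's actual implementation: the function name `chunk_markdown`, the `max_tokens` parameter, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for illustration. The keys `chunk_id`, `token_count`, and `markdown` borrow names from the sample fields, but their exact meaning in the real output is not specified here.

```python
from typing import Dict, List


def chunk_markdown(text: str, max_tokens: int = 200) -> List[Dict[str, object]]:
    """Greedily pack paragraphs into chunks of roughly max_tokens words.

    Assumption: whitespace-separated words approximate tokens. A real
    pipeline would count tokens with its embedding model's tokenizer.
    A single paragraph longer than max_tokens becomes its own oversized chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")

    # Paragraphs are separated by blank lines; drop empty ones.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

    chunks: List[Dict[str, object]] = []
    current: List[str] = []
    current_tokens = 0

    def flush() -> None:
        # Emit the paragraphs collected so far as one chunk.
        chunks.append({
            "chunk_id": len(chunks),
            "token_count": current_tokens,
            "markdown": "\n\n".join(current),
        })

    for para in paragraphs:
        n = len(para.split())
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and current_tokens + n > max_tokens:
            flush()
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += n

    if current:
        flush()
    return chunks
```

Packing whole paragraphs rather than cutting at a fixed word offset keeps each claim or description paragraph intact, which tends to matter for retrieval quality; the trade-off is that chunk sizes vary.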

Who it's for

For AI engineers building patent search, IP analysts assembling prior-art corpora, and legal-tech teams that need patent text in a format an LLM can consume.

Sample fields / output

patent_number
title
abstract
claims
description
assignee
inventors
filing_date
publication_date
jurisdiction
markdown
chunk_id
token_count
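As a rough illustration of how these fields might be consumed downstream, the sketch below groups records by `patent_number` and sums `token_count` per patent. It assumes each output record is a dictionary representing one chunk and carrying the fields listed above; the actual record shape is not specified here. The helper names and the sample values (including the placeholder patent numbers) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Record = Dict[str, object]


def group_chunks_by_patent(records: Iterable[Record]) -> Dict[str, List[Record]]:
    """Group chunk records by patent_number, preserving input order."""
    grouped: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        grouped[str(record["patent_number"])].append(record)
    return dict(grouped)


def total_tokens(chunks: Iterable[Record]) -> int:
    """Sum token_count across the chunks of one patent."""
    return sum(int(c["token_count"]) for c in chunks)


# Placeholder records for illustration only; not real patents or real output.
sample_records: List[Record] = [
    {"patent_number": "US-0000001-A", "chunk_id": 0, "token_count": 120, "markdown": "# Claims ..."},
    {"patent_number": "US-0000001-A", "chunk_id": 1, "token_count": 95, "markdown": "# Description ..."},
    {"patent_number": "EP-0000002-A", "chunk_id": 0, "token_count": 80, "markdown": "# Abstract ..."},
]

by_patent = group_chunks_by_patent(sample_records)
tokens_per_patent = {number: total_tokens(chunks) for number, chunks in by_patent.items()}
```

Per-patent token totals like these can help estimate embedding cost before indexing a corpus.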

Example use cases

Build a prior-art RAG knowledge base for a patent-search assistant.
Feed chunked claims and descriptions into an embeddings index for semantic search.
Generate LLM-ready context for freedom-to-operate and invalidity analysis.

Try Patents to Markdown for RAG on Apify