What it does
SEC Filings to Markdown converts EDGAR filings (10-K, 10-Q, 8-K, 13F) into clean, chunked Markdown built for retrieval-augmented generation (RAG). It resolves a ticker to the issuer, pulls filings live from the official SEC EDGAR system, strips scripts and styles, converts the HTML to ATX-style Markdown, and splits each document into chunks of a configurable word count. Every chunk carries citation metadata (accession number and source URL) alongside issuer identity, form type, and filing date.
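To make the chunking step concrete, here is a minimal sketch of one way to split Markdown into word-budgeted chunks. It is not the actor's actual implementation: the function name `chunk_markdown`, the `max_words` parameter, the paragraph-aware packing, and the zero-based `chunkIndex` are all assumptions made for illustration.

```python
def chunk_markdown(markdown: str, max_words: int = 300) -> list[dict]:
    """Greedily pack blank-line-separated blocks into chunks of at most
    `max_words` words. A single block larger than the budget is kept whole
    as its own oversized chunk (an assumption, chosen to avoid cutting
    through a table or heading)."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")

    blocks = [b.strip() for b in markdown.split("\n\n") if b.strip()]
    chunks: list[str] = []
    current: list[str] = []
    count = 0

    for block in blocks:
        n = len(block.split())
        # Close the current chunk before it would exceed the word budget.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(block)
        count += n

    if current:
        chunks.append("\n\n".join(current))

    # chunkIndex is zero-based here; the real output may number differently.
    return [
        {"chunkIndex": i, "totalChunks": len(chunks), "markdown": text}
        for i, text in enumerate(chunks)
    ]


doc = "# Item 1\n\none two three\n\nfour five\n\nsix seven eight nine"
for chunk in chunk_markdown(doc, max_words=5):
    print(chunk["chunkIndex"], chunk["totalChunks"], repr(chunk["markdown"]))
```

Packing whole blocks rather than slicing a flat word list keeps headings and paragraphs intact, at the cost of chunks that vary in size.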
Who it's for
Teams building financial-research copilots and RAG systems over filings, quants and fundamental analysts loading 10-Ks into vector stores, compliance teams building searchable filing knowledge bases, and fintech products that need LLM-ready text with citations.
Sample fields / output
| Field | Description |
|---|---|
| company / cik / ticker | Issuer identity |
| form | Filing type (10-K, 10-Q, 8-K, 13F) |
| filingDate | Date filed |
| accessionNumber | SEC accession number (citation) |
| sourceUrl | Direct link to the source document |
| chunkIndex / totalChunks | Position within the filing |
| markdown | Clean Markdown chunk, ready for embedding |
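As a rough illustration of how these fields might be consumed downstream, the sketch below models one output record and turns it into a citation string. The field names come from the table above. The Python types, the `FilingChunk` and `format_citation` names, and every value in `sample` are hypothetical placeholders, not real filing data.

```python
from typing import TypedDict


class FilingChunk(TypedDict):
    # Field names follow the table; the types are assumptions.
    company: str
    cik: str
    ticker: str
    form: str
    filingDate: str
    accessionNumber: str
    sourceUrl: str
    chunkIndex: int
    totalChunks: int
    markdown: str


def format_citation(chunk: FilingChunk) -> str:
    """Build a human-readable citation from a chunk's metadata."""
    return (
        f"{chunk['company']} ({chunk['ticker']}), Form {chunk['form']}, "
        f"filed {chunk['filingDate']}, accession {chunk['accessionNumber']}, "
        f"chunk {chunk['chunkIndex']}/{chunk['totalChunks']}: {chunk['sourceUrl']}"
    )


# Placeholder record for illustration only; none of these values are real.
sample: FilingChunk = {
    "company": "Example Corp",
    "cik": "0000000000",
    "ticker": "EXMP",
    "form": "10-K",
    "filingDate": "2024-01-31",
    "accessionNumber": "0000000000-24-000001",
    "sourceUrl": "https://example.com/filing.htm",
    "chunkIndex": 3,
    "totalChunks": 120,
    "markdown": "## Item 1A. Risk Factors\n\nPlaceholder text.",
}

print(format_citation(sample))
```

Storing the citation fields next to each embedded chunk lets a retrieval step return both the text and the string to cite.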
Example use cases
- Build a financial-research copilot or RAG system over SEC filings
- Load 10-Ks and other filings into a vector store for fundamental analysis
- Stand up a searchable filing knowledge base for a compliance team
▶ Run SEC Filings to Markdown for RAG on Apify
FAQ
Is the source data official?
Yes. Filings are pulled live from the official SEC EDGAR system, and no login is required.
Why chunked Markdown instead of raw HTML?
Chunks are sized for embedding and each one is paired with citation metadata, so your LLM can cite the exact filing.
What does it cost?
Pay-per-event: $0.005 per run plus $0.04 per Markdown chunk.
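For budgeting, the pricing above works out to a simple linear formula. The sketch below assumes the two stated event prices are the only charges; the `estimate_cost` helper and its parameters are hypothetical names for illustration.

```python
from decimal import Decimal

# Prices as stated in the FAQ answer above.
PRICE_PER_RUN = Decimal("0.005")
PRICE_PER_CHUNK = Decimal("0.04")


def estimate_cost(chunks: int, runs: int = 1) -> Decimal:
    """Estimated USD cost: per-run fee plus per-chunk fee.

    Decimal avoids binary floating-point rounding on currency amounts.
    """
    if chunks < 0 or runs < 0:
        raise ValueError("chunks and runs must be non-negative")
    return PRICE_PER_RUN * runs + PRICE_PER_CHUNK * chunks


# One run that emits 250 chunks: 0.005 + 250 * 0.04
print(estimate_cost(250))  # 10.005
```

Because the per-chunk fee dominates, a larger chunk size (fewer chunks per filing) lowers the cost of a run.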