Monitor Insider Trading Without Parsing SEC XML — Form 4 Data as Clean JSON
SEC Form 4 filings are one of the most useful public datasets for tracking what company insiders (CEOs, directors, 10% owners) are doing with their stock. When a CEO buys 50,000 shares of their own company, that's a signal.
The problem: getting this data from SEC EDGAR is genuinely painful.
The EDGAR Problem
EDGAR serves filings as nested XML/SGML documents. There's no proper REST API for structured Form 4 data. Here's what you're dealing with:
- CIK-based lookups: You need to map ticker symbols to CIK numbers (EDGAR's internal ID system)
- Full-text search returns everything: Searching for "AAPL" returns all 100+ filing types — 10-Ks, 8-Ks, proxies — not just insider trades
- XML parsing: Form 4 filings use XML with nested schemas. The actual transaction data is buried inside <nonDerivativeTransaction> and <derivativeTransaction> elements
- No pagination or filtering: EDGAR's feeds dump everything. You build the filtering logic
Most developers spend 2-3 days just getting the XML parsing right before they can extract a single transaction.
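To give a feel for the parsing step, here is a minimal sketch using only the standard library. The sample document is hand-written and heavily trimmed, modeled on the Form 4 ownership XML layout and reusing the made-up numbers from this post; real filings carry many more elements (footnotes, derivative tables, owner details), and the helper names are my own.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed sample (an assumption, not a real filing).
SAMPLE_FORM4 = """<ownershipDocument>
  <issuer>
    <issuerCik>0000320193</issuerCik>
    <issuerName>APPLE INC</issuerName>
    <issuerTradingSymbol>AAPL</issuerTradingSymbol>
  </issuer>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <securityTitle><value>Common Stock</value></securityTitle>
      <transactionDate><value>2026-03-25</value></transactionDate>
      <transactionCoding>
        <transactionCode>P</transactionCode>
      </transactionCoding>
      <transactionAmounts>
        <transactionShares><value>50000</value></transactionShares>
        <transactionPricePerShare><value>178.50</value></transactionPricePerShare>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>
"""


def _text(node, path):
    """Return the stripped text at `path` under `node`, or None if missing."""
    found = node.find(path)
    if found is None or found.text is None:
        return None
    return found.text.strip()


def parse_non_derivative(xml_text):
    """Extract non-derivative transactions as a list of flat dicts."""
    root = ET.fromstring(xml_text)
    ticker = _text(root, "issuer/issuerTradingSymbol")
    rows = []
    for tx in root.iter("nonDerivativeTransaction"):
        shares = _text(tx, "transactionAmounts/transactionShares/value")
        price = _text(tx, "transactionAmounts/transactionPricePerShare/value")
        rows.append({
            "ticker": ticker,
            "security": _text(tx, "securityTitle/value"),
            "date": _text(tx, "transactionDate/value"),
            "code": _text(tx, "transactionCoding/transactionCode"),
            # Price is often absent on grants and gifts, so keep None rather than 0.
            "shares": float(shares) if shares else None,
            "pricePerShare": float(price) if price else None,
        })
    return rows


transactions = parse_non_derivative(SAMPLE_FORM4)
print(transactions)
```

Even this toy version needs a null-safe lookup helper for every field, which is where most of the time goes once real filings start arriving.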
Skip the XML
I built an API that does all the EDGAR parsing and returns clean JSON. Here's what a Form 4 query looks like:
curl "https://your-api/sec/insider-trades?ticker=AAPL&limit=5"
{
  "success": true,
  "total": 1542,
  "filings": [
    {
      "accession": "0000950123-26-001456",
      "filedDate": "2026-03-31",
      "periodOfReport": "2026-03-27",
      "insider": {
        "name": "Cook Timothy D",
        "cik": "0001234567"
      },
      "company": {
        "name": "APPLE INC",
        "cik": "0000320193"
      }
    }
  ]
}
Get full transaction details for any filing:
curl "https://your-api/sec/insider-trades/filing/0000950123-26-001456"
{
  "issuer": {
    "cik": "0000320193",
    "name": "APPLE INC",
    "ticker": "AAPL"
  },
  "owner": {
    "name": "Cook Timothy D",
    "isDirector": true,
    "isOfficer": true,
    "title": "Chief Executive Officer"
  },
  "transactions": [
    {
      "security": "Common Stock",
      "date": "2026-03-25",
      "code": "P",
      "codeLabel": "Purchase",
      "shares": 50000,
      "pricePerShare": 178.50,
      "sharesAfter": 3500000,
      "ownership": "direct"
    }
  ]
}
Ticker-based search (no CIK mapping needed), pre-filtered to Form 4 only, transactions parsed into flat JSON.
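Because the detail response is already flat, turning it into rows for a CSV or a DataFrame takes only a few lines. A rough sketch, assuming the response has exactly the shape shown above; `flatten_filing` and the computed `value` column are my own additions for illustration, not part of the API:

```python
def flatten_filing(detail):
    """Turn one filing-detail response into a list of flat row dicts."""
    issuer = detail.get("issuer", {})
    owner = detail.get("owner", {})
    rows = []
    for tx in detail.get("transactions", []):
        shares = tx.get("shares")
        price = tx.get("pricePerShare")
        # Only compute a dollar value when both inputs are present.
        value = shares * price if shares is not None and price is not None else None
        rows.append({
            "ticker": issuer.get("ticker"),
            "insider": owner.get("name"),
            "title": owner.get("title"),
            "date": tx.get("date"),
            "code": tx.get("code"),
            "shares": shares,
            "pricePerShare": price,
            "value": value,
        })
    return rows


# The sample detail response from above, as a Python dict.
SAMPLE_DETAIL = {
    "issuer": {"cik": "0000320193", "name": "APPLE INC", "ticker": "AAPL"},
    "owner": {
        "name": "Cook Timothy D",
        "isDirector": True,
        "isOfficer": True,
        "title": "Chief Executive Officer",
    },
    "transactions": [
        {
            "security": "Common Stock",
            "date": "2026-03-25",
            "code": "P",
            "codeLabel": "Purchase",
            "shares": 50000,
            "pricePerShare": 178.50,
            "sharesAfter": 3500000,
            "ownership": "direct",
        }
    ],
}

rows = flatten_filing(SAMPLE_DETAIL)
print(rows[0]["insider"], rows[0]["value"])
```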
Building an Insider Trading Monitor
Here's a Python script that checks for insider purchases above $100K:
import requests
from datetime import datetime, timedelta

API_URL = "https://your-api/sec/insider-trades"
API_KEY = "your-api-key"
WATCHLIST = ["AAPL", "MSFT", "GOOGL", "AMZN", "TSLA"]

headers = {"X-Api-Key": API_KEY}
# Only ask for filings from yesterday onward
yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")

for ticker in WATCHLIST:
    resp = requests.get(
        API_URL,
        params={"ticker": ticker, "startDate": yesterday},
        headers=headers,
        timeout=30,
    )
    resp.raise_for_status()  # stop on HTTP errors instead of parsing an error body
    data = resp.json()
    for filing in data.get("filings", []):
        # Get full transaction details
        detail_resp = requests.get(
            f"{API_URL}/filing/{filing['accession']}",
            headers=headers,
            timeout=30,
        )
        detail_resp.raise_for_status()
        detail = detail_resp.json()
        for tx in detail.get("transactions", []):
            if tx["code"] == "P":  # Purchase
                value = tx["shares"] * tx["pricePerShare"]
                if value > 100_000:
                    print(
                        f"🚨 {detail['owner']['name']} "
                        f"({detail['owner'].get('title', 'Insider')}) "
                        f"bought {tx['shares']:,} shares of {ticker} "
                        f"at ${tx['pricePerShare']:.2f} "
                        f"(${value:,.0f} total)"
                    )
Run this on a daily cron and you've got an insider trading alert system. The code is ~30 lines because all the hard work (XML parsing, CIK resolution, filing filtering) is handled by the API.
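One wrinkle with a daily run: since the script asks for everything from yesterday onward, the same filing may show up in two consecutive runs (assuming `startDate` is inclusive, which I haven't spelled out above). A small sketch of one way to avoid duplicate alerts is to remember accession numbers between runs. The file name and helper names here are hypothetical:

```python
import json
from pathlib import Path

SEEN_FILE = "seen_accessions.json"  # hypothetical location for the state file


def load_seen(path=SEEN_FILE):
    """Load the set of accession numbers already alerted on."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text()))


def save_seen(seen, path=SEEN_FILE):
    """Persist the set as a sorted JSON list."""
    Path(path).write_text(json.dumps(sorted(seen)))


def new_filings(filings, seen):
    """Return filings not yet in `seen`, adding their accessions to it."""
    fresh = []
    for filing in filings:
        accession = filing["accession"]
        if accession not in seen:
            seen.add(accession)
            fresh.append(filing)
    return fresh
```

In the monitor loop, this would mean calling `load_seen()` once at startup, iterating over `new_filings(data.get("filings", []), seen)` instead of the raw list, and calling `save_seen(seen)` at the end of the run.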
Transaction Codes Explained
Form 4 transactions use single-letter codes:
| Code | Meaning | Signal |
|---|---|---|
| P | Open market purchase | Bullish — insider buying with own money |
| S | Open market sale | Could be planned (10b5-1) or discretionary |
| A | Grant/award | Compensation, not a market signal |
| M | Option exercise | Converting options to shares |
| F | Tax withholding | Automatic, not discretionary |
| G | Gift | Estate planning, not a trading signal |
The most interesting transactions are P (purchases) and S (sales) — these represent discretionary decisions by insiders.
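In code, the table above boils down to a lookup plus a set of the codes worth alerting on. A minimal sketch; the names are mine, and the mapping covers only the six codes listed here, not the full SEC list:

```python
# Meanings copied from the table above (a subset of all Form 4 codes).
TRANSACTION_CODES = {
    "P": "Open market purchase",
    "S": "Open market sale",
    "A": "Grant/award",
    "M": "Option exercise",
    "F": "Tax withholding",
    "G": "Gift",
}

# Codes that reflect a discretionary decision by the insider.
DISCRETIONARY_CODES = {"P", "S"}


def describe(code):
    """Return the human-readable meaning, or 'Unknown' for unlisted codes."""
    return TRANSACTION_CODES.get(code.upper(), "Unknown")


def is_discretionary(code):
    """True for open market purchases and sales."""
    return code.upper() in DISCRETIONARY_CODES


for code in ["P", "F", "X"]:
    print(code, describe(code), is_discretionary(code))
```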
What You'd Build Without This
For context, here's what the DIY version looks like:
- Build a CIK-to-ticker mapping table (SEC provides a bulk file, ~13,000 companies)
- Write an EDGAR full-text search query parser
- Build XML/SGML parsers for Form 4 documents (two different schemas depending on filing date)
- Handle footnotes, amendments (Form 4/A), and derivative transactions
- Implement rate limiting (SEC throttles to 10 req/sec with a required User-Agent header)
- Build storage and deduplication logic
That's a week of work minimum, plus ongoing maintenance when EDGAR changes their schema.
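To make two of those steps concrete, here is a rough sketch of the ticker-to-CIK index and a simple request throttle. The sample entries mimic the shape of the SEC's bulk ticker file as I understand it (`cik_str`, `ticker`, `title`); the real file would be downloaded rather than inlined, and `build_ticker_index` and `Throttle` are illustrative names, not an existing library.

```python
import time

# Two inlined entries in the shape of the SEC bulk ticker file (an assumption).
SAMPLE_COMPANY_TICKERS = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"},
}


def build_ticker_index(company_tickers):
    """Map upper-cased ticker -> CIK zero-padded to 10 digits."""
    return {
        entry["ticker"].upper(): f"{int(entry['cik_str']):010d}"
        for entry in company_tickers.values()
    }


class Throttle:
    """Space out calls so they never exceed `max_per_second`."""

    def __init__(self, max_per_second=10):
        self.min_interval = 1.0 / max_per_second
        self._last = None

    def wait(self):
        """Sleep just long enough to respect the minimum interval."""
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


index = build_ticker_index(SAMPLE_COMPANY_TICKERS)
print(index["AAPL"])

throttle = Throttle(max_per_second=10)
for _ in range(3):
    throttle.wait()  # call this before each EDGAR request
```

A fixed minimum interval is the bluntest way to stay under the limit. It wastes a little throughput compared with a token bucket, but it is hard to get wrong.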
I vibe coded this for my own trading research. It's on RapidAPI — free tier if you want to poke around.