Every quarter, the largest institutional investors in the US are required by law to tell the SEC exactly what they're holding. These filings — called 13Fs — are public record. Warren Buffett's latest moves, Bridgewater's macro bets, Renaissance's quant positions: it's all there, 45 days after quarter-end, sitting in a government database.
The problem is that the database is genuinely painful to work with. This post explains what 13F filings contain, why developers care about them, and how to actually pull clean, structured data from them in Python.
What Is a 13F Filing?
Section 13(f) of the Securities Exchange Act of 1934 requires institutional investment managers that exercise discretion over $100 million or more in Section 13(f) securities (mostly US-listed equities) to disclose those holdings quarterly. The filing deadline is 45 days after each quarter ends (so Q4 data arrives by mid-February).
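As a rough sketch of the deadline arithmetic, the helper below adds 45 days to a quarter-end date. The name `filing_deadline` is hypothetical, and rolling a weekend deadline forward to Monday is an assumption about how the cutoff is handled in practice; federal holidays are ignored.

```python
from datetime import date, timedelta


def filing_deadline(quarter_end: date) -> date:
    """Return quarter_end + 45 days, rolled forward past a weekend.

    Assumption: a deadline landing on Saturday or Sunday moves to Monday.
    Federal holidays are not handled in this sketch.
    """
    deadline = quarter_end + timedelta(days=45)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6
    while deadline.weekday() >= 5:
        deadline += timedelta(days=1)
    return deadline


print(filing_deadline(date(2024, 12, 31)))  # 2025-02-14
```

Scheduling a refresh job a day or two after this date leaves room for late and holiday-shifted filings.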
What's included:
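- Positions in Section 13(f) securities (US-listed stocks, ETFs, listed call/put options); a manager may omit a holding only if it is both under 10,000 shares and under $200,000 in value
- Share count and market value at quarter-end
- For options, whether the position is a put or a call, plus investment discretion and voting authority for each holding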
What's not included:
- Bond positions, cash, foreign stocks listed only on non-US exchanges
- Positions added and closed within the same quarter
- Short positions (the SEC exempts these from disclosure)
So 13F data is incomplete by design — but it's still the most comprehensive public window into what large money managers are doing.
Why Developers Care About This Data
A few concrete use cases where developers end up reaching for 13F data:
Portfolio tracking tools. If you're building an app that lets retail investors "clone" a hedge fund's strategy, you need machine-readable positions updated every quarter.
Quant research. Tracking position changes across funds over time is a standard input to quantitative factor models. Several funds concentrating into the same sector is often treated as a possible leading indicator of price movements.
Fintech apps. Features like "see what the smart money is buying" require a reliable, normalized data source. Raw EDGAR is not that.
Competitive analysis for fund managers. Understanding what peers are holding — especially in concentrated positions — is standard practice.
The Raw EDGAR Problem
The SEC's EDGAR system is publicly accessible at https://www.sec.gov/cgi-bin/browse-edgar. In theory, you can just download 13F filings directly. In practice, it's a nightmare.
Here's what a raw EDGAR 13F XML document looks like:
<informationTable>
  <infoTable>
    <nameOfIssuer>APPLE INC</nameOfIssuer>
    <titleOfClass>COM</titleOfClass>
    <cusip>037833100</cusip>
    <value>174318768</value>
    <shrsOrPrnAmt>
      <sshPrnamt>905560000</sshPrnamt>
      <sshPrnamtType>SH</sshPrnamtType>
    </shrsOrPrnAmt>
    <investmentDiscretion>SOLE</investmentDiscretion>
    <votingAuthority>
      <Sole>905560000</Sole>
      <Shared>0</Shared>
      <None>0</None>
    </votingAuthority>
  </infoTable>
  ...
</informationTable>
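For a sense of what parsing one of these entries involves, here is a minimal sketch using only the standard library. The `SAMPLE` string is a trimmed copy of the fragment above, and `parse_info_table` and `local` are hypothetical helper names. Real filings may declare an XML namespace on these elements, so the sketch strips any namespace prefix before matching tag names.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<informationTable>
  <infoTable>
    <nameOfIssuer>APPLE INC</nameOfIssuer>
    <titleOfClass>COM</titleOfClass>
    <cusip>037833100</cusip>
    <value>174318768</value>
    <shrsOrPrnAmt>
      <sshPrnamt>905560000</sshPrnamt>
      <sshPrnamtType>SH</sshPrnamtType>
    </shrsOrPrnAmt>
    <investmentDiscretion>SOLE</investmentDiscretion>
  </infoTable>
</informationTable>"""


def local(tag: str) -> str:
    """Drop a '{namespace}' prefix so lookups work with or without one."""
    return tag.rsplit("}", 1)[-1]


def parse_info_table(xml_text: str) -> list[dict]:
    """Return one dict per <infoTable> entry in the document."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter():
        if local(entry.tag) != "infoTable":
            continue
        # Flatten every leaf element under this entry into tag -> text
        fields = {
            local(el.tag): (el.text or "").strip()
            for el in entry.iter()
            if len(el) == 0
        }
        rows.append({
            "issuer": fields["nameOfIssuer"],
            "cusip": fields["cusip"],
            "value_reported": int(fields["value"]),  # scale not applied here
            "shares": int(fields["sshPrnamt"]),
            "share_type": fields["sshPrnamtType"],
        })
    return rows


rows = parse_info_table(SAMPLE)
print(rows[0]["issuer"], rows[0]["shares"])  # APPLE INC 905560000
```

This handles the easy case only. Older text-format filings and malformed documents need separate code paths.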
Problems with working directly from EDGAR:
- Inconsistent formatting. Some filers use XML, some use older text formats, some use XBRL. The schema has changed over the years.
- No normalization. "APPLE INC", "Apple Inc.", and "APPLE COMPUTER INC" might appear as different issuers across different filings. CUSIP-based matching is required, but CUSIPs can change when companies restructure.
- Rate limits and robots.txt. EDGAR enforces strict rate limits (10 requests/second, with bans for abusive crawlers). You need exponential backoff, User-Agent headers, and careful throttling.
- No search. Finding the CIK (Central Index Key) for a given fund name is itself a lookup problem.
- Amendments. Funds can amend prior filings. You need to track which filing is the authoritative version.
Parsing this reliably across hundreds of filers, across years of filings, while keeping up with format changes, is a substantial engineering project.
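To make the throttling point concrete, here is a small sketch of the two pieces a polite crawler needs: a limiter that spaces requests out, and an exponential backoff schedule for retries. `Throttle` and `backoff_delays` are hypothetical names, the 8 requests/second setting is just a margin under the 10/second limit mentioned above, and the User-Agent value is a placeholder. No network calls are made.

```python
import time


class Throttle:
    """Space calls at least 1/max_per_second seconds apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Reserve the next slot relative to when this call actually proceeds
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


def backoff_delays(retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


# Placeholder identity; replace with your own app name and contact address
HEADERS = {"User-Agent": "ExampleApp admin@example.com"}

throttle = Throttle(max_per_second=8)
for _ in range(3):
    throttle.wait()  # call this immediately before each HTTP request

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

The clock and sleep functions are injectable so the limiter can be unit tested without real delays.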
Clean Data vs. Raw EDGAR
Here's the same Berkshire Hathaway Apple position from above, after going through the SEC EDGAR Financial Data API:
{
  "issuer": "Apple Inc.",
  "ticker": "AAPL",
  "cusip": "037833100",
  "shares": 905560000,
  "market_value_usd": 174318768000,
  "percentage_of_portfolio": 48.5,
  "investment_discretion": "SOLE",
  "change_from_prior_quarter": {
    "shares_delta": -10000000,
    "shares_delta_pct": -1.09,
    "action": "decreased"
  }
}
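Ticker symbols resolved. Values in whole US dollars (EDGAR filings made before 2023 report them in thousands, while later filings report dollars, so the scale has to be checked per filing). Quarter-over-quarter change calculated. That's what you actually need to build something with.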
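The arithmetic behind those derived fields is simple. The sketch below is one way it could be done and is not the API's actual implementation. `to_usd`, `quarter_change`, and the `Change` dataclass are hypothetical names, and the action labels other than "decreased" and "unchanged" are assumptions. The prior share count is back-derived from the -10,000,000 delta in the JSON above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    shares_delta: int
    shares_delta_pct: Optional[float]  # None when there is no prior position
    action: str


def to_usd(value_reported: int, in_thousands: bool) -> int:
    """Scale a reported 13F value to whole US dollars."""
    return value_reported * 1000 if in_thousands else value_reported


def quarter_change(prior_shares: int, current_shares: int) -> Change:
    """Compare share counts for the same security across two quarters."""
    delta = current_shares - prior_shares
    if prior_shares == 0:
        # No base to compute a percentage from
        action = "new" if current_shares > 0 else "unchanged"
        return Change(delta, None, action)
    pct = round(100 * delta / prior_shares, 2)
    if current_shares == 0:
        action = "exited"
    elif delta > 0:
        action = "increased"
    elif delta < 0:
        action = "decreased"
    else:
        action = "unchanged"
    return Change(delta, pct, action)


value_usd = to_usd(174_318_768, in_thousands=True)
change = quarter_change(prior_shares=915_560_000, current_shares=905_560_000)
print(value_usd)  # 174318768000
print(change)     # Change(shares_delta=-10000000, shares_delta_pct=-1.09, action='decreased')
```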
Getting Set Up
pip install edgar-client
The Python client wraps the SEC EDGAR Financial Data API and handles auth, rate limiting, and response parsing.
from edgar_client import EdgarClient
client = EdgarClient(api_key="your_rapidapi_key")
Example: What Is Berkshire Hathaway Holding Right Now?
Berkshire's CIK is 0001067983. You can look up any fund's CIK on the EDGAR company search page.
from edgar_client import EdgarClient
client = EdgarClient(api_key="your_rapidapi_key")
# Get the latest 13F filing for Berkshire Hathaway
holdings = client.get_13f_holdings(cik="0001067983")
print(f"Filing date: {holdings.filed_date}")
print(f"Period of report: {holdings.period_of_report}")
print(f"Total portfolio value: ${holdings.total_value_usd:,.0f}")
print(f"Number of positions: {len(holdings.positions)}\n")
# Print top 10 holdings by portfolio weight
top_10 = sorted(holdings.positions, key=lambda p: p.percentage_of_portfolio, reverse=True)[:10]
for pos in top_10:
    print(f"{pos.ticker or pos.issuer:<8} {pos.percentage_of_portfolio:>6.1f}% ${pos.market_value_usd/1e9:>8.2f}B")
Output (approximate, based on recent filings):
Filing date: 2025-02-14
Period of report: 2024-12-31
Total portfolio value: $296,000,000,000
Number of positions: 44
AAPL 48.5% $174.32B
BAC 9.8% $29.01B
AXP 8.1% $23.98B
KO 7.0% $20.72B
CVX 4.7% $13.91B
OXY 4.2% $12.43B
KHC 3.2% $9.47B
MCO 2.8% $8.29B
DVA 1.5% $4.44B
VRSN 0.8% $2.37B
Tracking Quarterly Changes
One of the most useful analyses is watching how positions change quarter to quarter. This is where you find the signal.
from edgar_client import EdgarClient
client = EdgarClient(api_key="your_rapidapi_key")
# Get the last two filings to compare
filings = client.list_13f_filings(cik="0001067983", limit=2)
current = client.get_13f_holdings_by_accession(filings[0].accession_number)
prior = client.get_13f_holdings_by_accession(filings[1].accession_number)
# Build a lookup from prior quarter
prior_lookup = {p.cusip: p for p in prior.positions}
print(f"Changes from {prior.period_of_report} to {current.period_of_report}\n")
print(f"{'Ticker':<8} {'Action':<12} {'Shares Change':>15} {'% Change':>10}")
print("-" * 50)
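def shares_moved(pos):
    """Absolute share change; 0 when no prior-quarter comparison exists."""
    change = pos.change_from_prior_quarter
    return abs(change.shares_delta or 0) if change else 0
# Show the 15 largest moves, skipping positions with no usable change data
for pos in sorted(current.positions, key=shares_moved, reverse=True)[:15]:
    change = pos.change_from_prior_quarter
    if change and change.shares_delta is not None and change.action != "unchanged":
        print(f"{pos.ticker or pos.issuer[:7]:<8} {change.action:<12} {change.shares_delta:>+15,.0f} {change.shares_delta_pct:>+10.1f}%")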
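The loop above only walks positions in the current filing, so a position the fund sold entirely never shows up. One way to catch new and exited positions is to compare the sets of CUSIPs directly. This is a sketch on plain dictionaries (CUSIP to share count), not a feature of the client; `diff_positions` is a hypothetical name, and apart from Apple's CUSIP from earlier in the post, the keys and share counts are made-up placeholders.

```python
def diff_positions(prior: dict[str, int], current: dict[str, int]) -> dict[str, list[str]]:
    """Classify CUSIPs as new, exited, or changed between two quarters."""
    prior_keys, current_keys = set(prior), set(current)
    return {
        "new": sorted(current_keys - prior_keys),
        "exited": sorted(prior_keys - current_keys),
        "changed": sorted(
            c for c in prior_keys & current_keys if prior[c] != current[c]
        ),
    }


# Placeholder data: CUSIP -> shares held at quarter-end
prior_q = {"037833100": 915_560_000, "000000001": 1_000, "000000002": 5_000}
current_q = {"037833100": 905_560_000, "000000002": 5_000, "000000003": 2_500}

print(diff_positions(prior_q, current_q))
# {'new': ['000000003'], 'exited': ['000000001'], 'changed': ['037833100']}
```

With the client objects, the same dictionaries could be built from `prior.positions` and `current.positions` keyed on `cusip`, along the lines of the `prior_lookup` shown earlier.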
Comparing Two Funds Side by Side
from edgar_client import EdgarClient
client = EdgarClient(api_key="your_rapidapi_key")
# CIK numbers for two funds
funds = {
    "Berkshire Hathaway": "0001067983",
    "Pershing Square": "0001336528",
}
fund_holdings = {}
for name, cik in funds.items():
    holdings = client.get_13f_holdings(cik=cik)
    fund_holdings[name] = {p.cusip: p for p in holdings.positions}
# Collect every CUSIP held by at least one of the funds
all_cusips = set()
for holdings in fund_holdings.values():
    all_cusips.update(holdings.keys())
print(f"{'Ticker':<8}", end="")
for name in funds:
    print(f" {name[:18]:>18}", end="")
print()
print("-" * 50)
for cusip in sorted(all_cusips):
    row = []
    ticker = None
    for name, holdings in fund_holdings.items():
        if cusip in holdings:
            pos = holdings[cusip]
            ticker = ticker or pos.ticker or pos.issuer[:8]
            row.append(f" {pos.percentage_of_portfolio:>17.1f}%")
        else:
            # This fund does not hold the security
            row.append(f" {'-':>18}")
    if ticker:
        print(f"{ticker:<8}", end="")
        for cell in row:
            print(cell, end="")
        print()
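The table above lists every security held by either fund. If you only want the names both funds hold, plus a single number for how similar the portfolios are, a set intersection does it. The sketch below works on plain dictionaries of portfolio weights (in percent) keyed by CUSIP; `overlap` is a hypothetical helper, the weights are placeholders, and summing the smaller of the two weights is just one common choice of overlap score.

```python
def overlap(weights_a: dict[str, float], weights_b: dict[str, float]) -> tuple[list[str], float]:
    """Return the shared CUSIPs and the sum of the smaller weight for each."""
    shared = sorted(set(weights_a) & set(weights_b))
    score = sum(min(weights_a[c], weights_b[c]) for c in shared)
    return shared, score


# Placeholder weights: CUSIP -> percentage of portfolio
fund_a = {"000000001": 40.0, "000000002": 10.0, "000000003": 5.0}
fund_b = {"000000002": 25.0, "000000003": 2.0, "000000004": 30.0}

shared, score = overlap(fund_a, fund_b)
print(shared)  # ['000000002', '000000003']
print(score)   # 12.0
```

A score of 0 means no common holdings, and a score of 100 means identical weights in every position.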
CIK Reference: Major Institutional Investors
Here are CIK numbers for commonly tracked funds to get you started:
| Fund | CIK |
|---|---|
| Berkshire Hathaway | 0001067983 |
| Bridgewater Associates | 0001350694 |
| Citadel Advisors | 0001423298 |
| Renaissance Technologies | 0001037389 |
| Two Sigma Investments | 0001450144 |
| D.E. Shaw | 0001009207 |
| Tiger Global Management | 0001167483 |
| Pershing Square | 0001336528 |
| Appaloosa Management | 0001070154 |
| Druckenmiller / Duquesne | 0001536411 |
| ARK Investment Management | 0001579982 |
Note: Some funds manage multiple entities and file separately for each. Bridgewater and Citadel in particular have several related CIKs, so confirm each CIK on the EDGAR company search page before relying on it.
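CIKs show up both as bare integers and as zero-padded 10-digit strings, so it helps to normalize them once at the edge of your code. The sketch below uses hypothetical helper names (`normalize_cik`, `submissions_url`). The URL pattern is the one SEC documents for its `data.sec.gov` submissions endpoint as far as I know, so treat it as an assumption and check it against current SEC documentation before depending on it.

```python
def normalize_cik(cik) -> str:
    """Return the CIK as a zero-padded 10-digit string."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


def submissions_url(cik) -> str:
    """Build the filing-index URL for a filer (assumed URL pattern)."""
    return f"https://data.sec.gov/submissions/CIK{normalize_cik(cik)}.json"


print(normalize_cik(1067983))         # 0001067983
print(submissions_url("0001067983"))  # https://data.sec.gov/submissions/CIK0001067983.json
```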
What to Build With This
A few directions worth exploring:
- "Clone a fund" portfolio tracker — Build a UI that lets users select a fund and see its current positions, updated each quarter automatically
- Concentration analysis — Alert when a fund makes a position its top 5 holding (historically correlated with conviction buys)
- Crowding detector — Flag when the same stock appears as a top-10 holding across 10+ funds (crowded trades tend to unwind badly)
- Sector rotation tracker — Watch how fund-level sector weights shift quarter to quarter
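As a starting point for the crowding detector, here is a sketch that counts how many funds list each ticker among their top holdings. `crowded` is a hypothetical function, the fund names and tickers are placeholders, and the threshold is a parameter so you can tune it to the number of funds you track.

```python
from collections import Counter


def crowded(top_holdings_by_fund: dict[str, list[str]], min_funds: int) -> list[tuple[str, int]]:
    """Return (ticker, fund_count) pairs held by at least min_funds funds."""
    counts = Counter()
    for tickers in top_holdings_by_fund.values():
        # set() so a ticker listed twice by one fund is counted once
        counts.update(set(tickers))
    hits = [(t, n) for t, n in counts.items() if n >= min_funds]
    # Most widely held first, ties broken alphabetically
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Placeholder data: fund name -> tickers in its top holdings
top_holdings = {
    "Fund A": ["AAA", "BBB", "CCC"],
    "Fund B": ["AAA", "CCC", "DDD"],
    "Fund C": ["AAA", "BBB", "EEE"],
}

print(crowded(top_holdings, min_funds=2))
# [('AAA', 3), ('BBB', 2), ('CCC', 2)]
```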
Getting Started
Install the client:
pip install edgar-client
Full source, including async support and examples for options positions and amended filings: github.com/dapdevsoftware/edgar-python
The API is available on RapidAPI: SEC EDGAR Financial Data API. There's a free tier you can use to pull a few filings and validate the data format before building anything serious.
If you're building a fintech tool or quant research pipeline on top of this and run into questions about the data structure — particularly around amended filings or options positions — drop a comment. These have some non-obvious edge cases worth discussing.