DEV Community

DAPDEV

How to Track What Hedge Funds Are Buying and Selling (Using SEC 13F Data)

Every quarter, the largest institutional investors in the US are required by law to tell the SEC exactly what they're holding. These filings — called 13Fs — are public record. Warren Buffett's latest moves, Bridgewater's macro bets, Renaissance's quant positions: it's all there, 45 days after quarter-end, sitting in a government database.

The problem is that database is genuinely painful to work with. This post explains what 13F filings contain, why developers care about them, and how to actually pull clean, structured data from them in Python.


What Is a 13F Filing?

Section 13(f) of the Securities Exchange Act of 1934 requires institutional investment managers who exercise discretion over at least $100 million in qualifying securities (mostly US-listed equities) to disclose their equity holdings quarterly. The filing deadline is 45 days after each quarter ends (so Q4 data arrives by mid-February).
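To make the 45-day rule concrete, here is a minimal sketch that computes the nominal deadline for a given quarter. The function names are made up for this post, and the calculation ignores the adjustment that applies when the date lands on a weekend or holiday.

```python
from datetime import date, timedelta


def quarter_end(year: int, quarter: int) -> date:
    """Return the last calendar day of the given quarter (1-4)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    if quarter == 4:
        return date(year, 12, 31)
    # First day of the month after the quarter, minus one day
    return date(year, quarter * 3 + 1, 1) - timedelta(days=1)


def nominal_13f_deadline(year: int, quarter: int) -> date:
    """Quarter-end plus 45 calendar days (no weekend/holiday adjustment)."""
    return quarter_end(year, quarter) + timedelta(days=45)


print(nominal_13f_deadline(2024, 4))  # 2025-02-14
```

A scheduler that polls for new filings could use this date as the point after which a missing filing is worth flagging.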

What's included:

  • Every equity position (stocks, ETFs, call/put options) of at least 10,000 shares or $200,000 in value
  • Share count and market value at quarter-end
  • Whether the position is held as shares or through put or call options, along with investment discretion and voting authority

What's not included:

  • Bond positions, cash, foreign stocks listed only on non-US exchanges
  • Positions added and closed within the same quarter
  • Short positions (the SEC exempts these from disclosure)

So 13F data is incomplete by design — but it's still the most comprehensive public window into what large money managers are doing.


Why Developers Care About This Data

A few concrete use cases where developers end up reaching for 13F data:

Portfolio tracking tools. If you're building an app that lets retail investors "clone" a hedge fund's strategy, you need machine-readable positions updated every quarter.

Quant research. Tracking position changes across funds over time is a standard signal in quantitative factor models. When several funds concentrate into the same sector, that shift can precede price movements.

Fintech apps. Features like "see what the smart money is buying" require a reliable, normalized data source. Raw EDGAR is not that.

Competitive analysis for fund managers. Understanding what peers are holding — especially in concentrated positions — is standard practice.


The Raw EDGAR Problem

The SEC's EDGAR system is publicly accessible at https://www.sec.gov/cgi-bin/browse-edgar. In theory, you can just download 13F filings directly. In practice, it's a nightmare.

Here's what a raw EDGAR 13F XML document looks like:

<informationTable>
  <infoTable>
    <nameOfIssuer>APPLE INC</nameOfIssuer>
    <titleOfClass>COM</titleOfClass>
    <cusip>037833100</cusip>
    <value>174318768</value>
    <shrsOrPrnAmt>
      <sshPrnamt>905560000</sshPrnamt>
      <sshPrnamtType>SH</sshPrnamtType>
    </shrsOrPrnAmt>
    <investmentDiscretion>SOLE</investmentDiscretion>
    <votingAuthority>
      <Sole>905560000</Sole>
      <Shared>0</Shared>
      <None>0</None>
    </votingAuthority>
  </infoTable>
  ...
</informationTable>
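Reading one of these tables with the standard library is straightforward once you know the element names. The sketch below parses the sample above with `xml.etree.ElementTree`. The helper names are made up for this post, and the namespace stripping is there on the assumption that real filings declare an XML namespace on these tags, which the simplified sample does not.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<informationTable>
  <infoTable>
    <nameOfIssuer>APPLE INC</nameOfIssuer>
    <titleOfClass>COM</titleOfClass>
    <cusip>037833100</cusip>
    <value>174318768</value>
    <shrsOrPrnAmt>
      <sshPrnamt>905560000</sshPrnamt>
      <sshPrnamtType>SH</sshPrnamtType>
    </shrsOrPrnAmt>
    <investmentDiscretion>SOLE</investmentDiscretion>
    <votingAuthority>
      <Sole>905560000</Sole>
      <Shared>0</Shared>
      <None>0</None>
    </votingAuthority>
  </infoTable>
</informationTable>"""


def local(tag: str) -> str:
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def parse_info_table(xml_text: str) -> list[dict]:
    """Return one dict per <infoTable> entry, keyed by friendlier names."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter():
        if local(entry.tag) != "infoTable":
            continue
        # Flatten the leaf elements (those without children) into tag -> text
        fields = {
            local(node.tag): (node.text or "").strip()
            for node in entry.iter()
            if len(node) == 0
        }
        rows.append({
            "issuer": fields["nameOfIssuer"],
            "cusip": fields["cusip"],
            "value": int(fields["value"]),
            "shares": int(fields["sshPrnamt"]),
            "discretion": fields["investmentDiscretion"],
        })
    return rows


rows = parse_info_table(SAMPLE)
print(rows[0]["issuer"], rows[0]["shares"])
```

This handles the happy path only. Older text-format filings and malformed entries need separate handling.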

Problems with working directly from EDGAR:

  1. Inconsistent formatting. Filings since 2013 carry an XML information table, while older ones embed free-form text tables, and the schema has changed over the years.
  2. No normalization. "APPLE INC", "Apple Inc.", and "APPLE COMPUTER INC" might appear as different issuers across different filings. CUSIP-based matching is required but CUSIPs change when companies restructure.
  3. Rate limits and robots.txt. EDGAR enforces strict rate limits (10 requests/second, with bans for abusive crawlers). You need exponential backoff, User-Agent headers, and careful throttling.
  4. No search. Finding the CIK (Central Index Key) for a given fund name is itself a lookup problem.
  5. Amendments. Funds can amend prior filings. You need to track which filing is the authoritative version.
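The throttling and backoff from point 3 can be sketched with the standard library alone. Everything here is illustrative: the `Throttle` class, the `fetch_with_backoff` helper, the contact string in `USER_AGENT`, and the choice of which HTTP status codes to retry are assumptions for this post, not something EDGAR prescribes.

```python
import time
import urllib.error
import urllib.request

# Hypothetical contact string; replace with your own name and email.
USER_AGENT = "example-research-bot admin@example.com"


class Throttle:
    """Enforce a minimum gap between calls (10 per second by default)."""

    def __init__(self, max_per_second=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        if self.last_call is not None:
            remaining = self.min_interval - (self.clock() - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
        self.last_call = self.clock()


def fetch(url):
    """Plain GET that identifies the client via the User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def fetch_with_backoff(url, throttle, fetcher=fetch, retries=5,
                       base_delay=1.0, sleep=time.sleep):
    """Throttle every attempt and back off exponentially on retryable errors."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        throttle.wait()
        try:
            return fetcher(url)
        except urllib.error.HTTPError as err:
            # Assumption: only these status codes are worth retrying
            retryable = err.code in (429, 500, 502, 503)
            if not retryable or attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing `fetcher` and `sleep` in as parameters keeps the retry logic testable without touching the network.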

Parsing this reliably across hundreds of filers, across years of filings, while keeping up with format changes, is a substantial engineering project.
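As one small piece of that project, here is a sketch of picking the authoritative filing per reporting period when amendments exist. The `FilingRef` type and the accession strings are placeholders invented for this example. It also assumes every amendment fully restates the original; amendments that only add holdings would need to be merged with the original instead.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FilingRef:
    accession_number: str  # placeholder strings below, not real accession numbers
    form_type: str         # "13F-HR" for originals, "13F-HR/A" for amendments
    period_of_report: date
    filed_date: date


def latest_per_period(filings):
    """Map each reporting period to the most recently filed document for it."""
    latest = {}
    for filing in filings:
        best = latest.get(filing.period_of_report)
        if best is None or filing.filed_date > best.filed_date:
            latest[filing.period_of_report] = filing
    return latest


# Made-up filing history for one fund
history = [
    FilingRef("original-q4", "13F-HR", date(2024, 12, 31), date(2025, 2, 14)),
    FilingRef("amended-q4", "13F-HR/A", date(2024, 12, 31), date(2025, 3, 3)),
    FilingRef("original-q3", "13F-HR", date(2024, 9, 30), date(2024, 11, 14)),
]

authoritative = latest_per_period(history)
print(authoritative[date(2024, 12, 31)].accession_number)  # amended-q4
```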


Clean Data vs. Raw EDGAR

Here's the same Berkshire Hathaway Apple position from above, after going through the SEC EDGAR Financial Data API:

{
  "issuer": "Apple Inc.",
  "ticker": "AAPL",
  "cusip": "037833100",
  "shares": 905560000,
  "market_value_usd": 174318768000,
  "percentage_of_portfolio": 48.5,
  "investment_discretion": "SOLE",
  "change_from_prior_quarter": {
    "shares_delta": -10000000,
    "shares_delta_pct": -1.09,
    "action": "decreased"
  }
}

Ticker symbols resolved. Values in whole US dollars (the raw sample above reports them in thousands, which was the EDGAR convention for filings before 2023). Quarter-over-quarter change calculated. That's what you actually need to build something with.
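The arithmetic behind those derived fields is simple, and it helps to see it spelled out. This is a sketch, not the API's actual implementation: the function names are made up, and the "new" label for first-time positions is an assumption (only "decreased" and "unchanged" appear in this post). It assumes the raw value is expressed in thousands, as in the XML sample.

```python
def to_usd(value_in_thousands: int) -> int:
    """Convert a value reported in thousands of dollars to whole dollars."""
    return value_in_thousands * 1000


def quarterly_change(current_shares: int, prior_shares: int) -> dict:
    """Compute the share delta, percentage change, and a coarse action label."""
    delta = current_shares - prior_shares
    if prior_shares == 0:
        # No prior position, so a percentage change is undefined
        pct = None
        action = "new" if current_shares > 0 else "unchanged"
    else:
        pct = round(delta / prior_shares * 100, 2)
        if delta > 0:
            action = "increased"
        elif delta < 0:
            action = "decreased"
        else:
            action = "unchanged"
    return {"shares_delta": delta, "shares_delta_pct": pct, "action": action}


print(to_usd(174318768))
# 174318768000
print(quarterly_change(905560000, 915560000))
# {'shares_delta': -10000000, 'shares_delta_pct': -1.09, 'action': 'decreased'}
```

The second call reproduces the `change_from_prior_quarter` block in the JSON above: a 10,000,000-share reduction from 915,560,000 shares is a 1.09% decrease.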


Getting Set Up

pip install edgar-client

The Python client wraps the SEC EDGAR Financial Data API and handles auth, rate limiting, and response parsing.

from edgar_client import EdgarClient

client = EdgarClient(api_key="your_rapidapi_key")

Example: What Is Berkshire Hathaway Holding Right Now?

Berkshire's CIK is 0001067983. You can look up any fund's CIK on the EDGAR company search page.

from edgar_client import EdgarClient

client = EdgarClient(api_key="your_rapidapi_key")

# Get the latest 13F filing for Berkshire Hathaway
holdings = client.get_13f_holdings(cik="0001067983")

print(f"Filing date: {holdings.filed_date}")
print(f"Period of report: {holdings.period_of_report}")
print(f"Total portfolio value: ${holdings.total_value_usd:,.0f}")
print(f"Number of positions: {len(holdings.positions)}\n")

# Print top 10 holdings by portfolio weight
top_10 = sorted(holdings.positions, key=lambda p: p.percentage_of_portfolio, reverse=True)[:10]

for pos in top_10:
    print(f"{pos.ticker or pos.issuer:<8} {pos.percentage_of_portfolio:>6.1f}%  ${pos.market_value_usd/1e9:>8.2f}B")

Output (illustrative only; the figures are approximate and not tied to a specific filing):

Filing date: 2025-02-14
Period of report: 2024-12-31
Total portfolio value: $296,000,000,000
Number of positions: 44

AAPL      48.5%  $174.32B
BAC        9.8%   $29.01B
AXP        8.1%   $23.98B
KO         7.0%   $20.72B
CVX        4.7%   $13.91B
OXY        4.2%   $12.43B
KHC        3.2%    $9.47B
MCO        2.8%    $8.29B
DVA        1.5%    $4.44B
VRSN       0.8%    $2.37B

Tracking Quarterly Changes

One of the most useful analyses is watching how positions change quarter to quarter. This is where you find the signal.

from edgar_client import EdgarClient

client = EdgarClient(api_key="your_rapidapi_key")

# Get the last two filings to compare
filings = client.list_13f_filings(cik="0001067983", limit=2)
current = client.get_13f_holdings_by_accession(filings[0].accession_number)
prior = client.get_13f_holdings_by_accession(filings[1].accession_number)

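# Index prior-quarter positions by CUSIP so fully exited positions can be found
prior_lookup = {p.cusip: p for p in prior.positions}

print(f"Changes from {prior.period_of_report} to {current.period_of_report}\n")
print(f"{'Ticker':<8} {'Action':<12} {'Shares Change':>15} {'% Change':>10}")
print("-" * 50)

# Keep only positions that have a prior-quarter comparison and actually changed
changed = [
    p for p in current.positions
    if p.change_from_prior_quarter and p.change_from_prior_quarter.action != "unchanged"
]

# Show the 15 largest moves by absolute share count
changed.sort(key=lambda p: abs(p.change_from_prior_quarter.shares_delta or 0), reverse=True)
for pos in changed[:15]:
    change = pos.change_from_prior_quarter
    label = pos.ticker or pos.issuer[:7]
    print(f"{label:<8} {change.action:<12} {change.shares_delta or 0:>+15,.0f} {change.shares_delta_pct or 0:>+10.1f}%")

# Positions held last quarter but missing now were sold out entirely.
# Assumes positions expose `shares`, as in the JSON above; skip this loop
# if the API already reports exits inside current.positions.
current_cusips = {p.cusip for p in current.positions}
for cusip, pos in prior_lookup.items():
    if cusip not in current_cusips:
        print(f"{pos.ticker or pos.issuer[:7]:<8} {'exited':<12} {-pos.shares:>+15,.0f}")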

Comparing Two Funds Side by Side

from edgar_client import EdgarClient

client = EdgarClient(api_key="your_rapidapi_key")

# CIK numbers for two funds
funds = {
    "Berkshire Hathaway": "0001067983",
    "Pershing Square": "0001336528",
}

fund_holdings = {}
for name, cik in funds.items():
    holdings = client.get_13f_holdings(cik=cik)
    fund_holdings[name] = {p.cusip: p for p in holdings.positions}

# Collect every CUSIP held by at least one fund (the union, not just the overlap)
all_cusips = set()
for positions in fund_holdings.values():
    all_cusips.update(positions.keys())

# Header: an 8-character ticker column, then one 20-character column per fund
print(f"{'Ticker':<8}", end="")
for name in funds:
    print(f"  {name[:18]:>18}", end="")
print()
print("-" * (8 + 20 * len(funds)))

for cusip in sorted(all_cusips):
    row = []
    ticker = None
    for name, positions in fund_holdings.items():
        if cusip in positions:
            pos = positions[cusip]
            ticker = ticker or pos.ticker or pos.issuer[:8]
            # 2 spaces + 17-wide number + "%" matches the 20-character header column
            row.append(f"  {pos.percentage_of_portfolio:>17.1f}%")
        else:
            row.append(" " * 20)

    if ticker:
        print(f"{ticker:<8}", end="")
        print("".join(row))
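If you want a single number rather than a table, one common way to summarize a comparison like this is weight overlap: for each shared position, take the smaller of the two portfolio weights and add them up. The sketch below works on plain dicts of CUSIP to portfolio percentage. The function name and the sample weights are invented for illustration and do not come from any filing.

```python
def weight_overlap(weights_a: dict, weights_b: dict) -> float:
    """Sum of min(weight_a, weight_b) over shared keys, in percentage points."""
    shared = weights_a.keys() & weights_b.keys()
    return sum(min(weights_a[key], weights_b[key]) for key in shared)


# Made-up portfolios keyed by placeholder identifiers (not real CUSIPs)
fund_a = {"ID_1": 40.0, "ID_2": 10.0, "ID_3": 50.0}
fund_b = {"ID_1": 10.0, "ID_2": 25.0, "ID_4": 65.0}

print(weight_overlap(fund_a, fund_b))  # 20.0
```

A result of 0 means the funds share nothing, and 100 means identical portfolios. With the client above, the inputs could be built as `{p.cusip: p.percentage_of_portfolio for p in holdings.positions}`.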

CIK Reference: Major Institutional Investors

Here are CIK numbers for commonly tracked funds to get you started:

| Fund | CIK |
| --- | --- |
| Berkshire Hathaway | 0001067983 |
| Bridgewater Associates | 0001350694 |
| Citadel Advisors | 0001423298 |
| Renaissance Technologies | 0001037389 |
| Two Sigma Investments | 0001450144 |
| D.E. Shaw | 0001009207 |
| Tiger Global Management | 0001167483 |
| Pershing Square | 0001336528 |
| Appaloosa Management | 0001070154 |
| Druckenmiller / Duquesne | 0001536411 |
| ARK Investment Management | 0001579982 |

Note: Some funds manage multiple entities and file separately for each. Bridgewater and Citadel in particular have several related CIKs.
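CIKs often show up without their leading zeros (as integers in spreadsheets, for example), so it is worth normalizing them before using them as keys or in URLs. Here is a small sketch. The helper names are made up, and the URL pattern is the SEC's public submissions endpoint on data.sec.gov as I understand it, so check it against the current EDGAR documentation before depending on it.

```python
def normalize_cik(cik) -> str:
    """Return the CIK as a zero-padded 10-digit string."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


def submissions_url(cik) -> str:
    """Build the filing-history URL for a filer (assumed URL pattern)."""
    return f"https://data.sec.gov/submissions/CIK{normalize_cik(cik)}.json"


print(normalize_cik(1067983))         # 0001067983
print(submissions_url("0001067983"))  # https://data.sec.gov/submissions/CIK0001067983.json
```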


What to Build With This

A few directions worth exploring:

  • "Clone a fund" portfolio tracker: build a UI that lets users select a fund and see its current positions, updated automatically each quarter
  • Concentration analysis: alert when a fund moves a position into its top five holdings (often read as a conviction buy)
  • Crowding detector: flag when the same stock appears as a top-10 holding across 10+ funds (crowded trades tend to unwind badly)
  • Sector rotation tracker: watch how fund-level sector weights shift quarter to quarter
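As a starting point for the crowding detector, here is a minimal sketch that counts how many funds hold each name among their largest positions. It works on plain `(ticker, weight)` tuples rather than API objects. The function name, the tickers, and the weights are all placeholders.

```python
from collections import Counter


def crowded_names(fund_positions: dict, top_n: int = 10, min_funds: int = 10) -> dict:
    """Return {ticker: fund_count} for names in the top_n of at least min_funds funds."""
    counts = Counter()
    for positions in fund_positions.values():
        # Rank this fund's positions by portfolio weight, largest first
        top = sorted(positions, key=lambda item: item[1], reverse=True)[:top_n]
        # Use a set so each fund counts a ticker at most once
        counts.update({ticker for ticker, _weight in top})
    return {ticker: n for ticker, n in counts.items() if n >= min_funds}


# Made-up data: three funds, looking at each fund's top 2 positions
sample = {
    "Fund A": [("AAA", 30.0), ("BBB", 20.0), ("CCC", 5.0)],
    "Fund B": [("AAA", 25.0), ("DDD", 15.0), ("BBB", 1.0)],
    "Fund C": [("BBB", 40.0), ("AAA", 10.0)],
}

print(crowded_names(sample, top_n=2, min_funds=2))
```

With real data you would feed in one list per fund built from `holdings.positions`, and leave the defaults at 10 and 10 to match the threshold described above.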

Getting Started

Install the client:

pip install edgar-client

Full source, including async support and examples for options positions and amended filings: github.com/dapdevsoftware/edgar-python

The API is available on RapidAPI: SEC EDGAR Financial Data API. There's a free tier you can use to pull a few filings and validate the data format before building anything serious.

If you're building a fintech tool or quant research pipeline on top of this and run into questions about the data structure — particularly around amended filings or options positions — drop a comment. These have some non-obvious edge cases worth discussing.
