DAPDEV

Posted on Feb 27

Build a Hedge Fund Portfolio Tracker with Python and SEC EDGAR Data

#python #api #finance #tutorial

Want to know what Warren Buffett is buying? Or what Citadel just dumped from their portfolio?

Every institutional investment manager holding at least $100 million in qualifying US-listed securities must file Form 13F with the SEC each quarter, disclosing its equity holdings. This data is completely public, but working with EDGAR's raw filings is a nightmare.

In this tutorial, we'll build a Python tool that tracks institutional holdings programmatically.

The Problem with Raw EDGAR Data

If you've ever tried to pull data from SEC EDGAR, you know the pain:

CIK numbers instead of company names
Inconsistent XML/SGML formats across different filings
Rate limiting (10 requests per second)
No clean API — just raw filing documents

We'll skip all of that by using the SEC EDGAR Financial Data API, which handles the parsing and returns clean JSON responses.
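For comparison, going to EDGAR directly starts with turning a CIK into a URL. A minimal sketch of that one step, assuming the public `data.sec.gov` submissions endpoint, which expects the CIK zero-padded to 10 digits; `edgar_submissions_url` is a hypothetical helper name, and a real client would also need a descriptive User-Agent header and throttling to stay under the rate limit:

```python
def edgar_submissions_url(cik):
    """Build the data.sec.gov submissions URL for a filer.

    The CIK is zero-padded to 10 digits, which is the form this
    endpoint expects in its path.
    """
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"


print(edgar_submissions_url("1067983"))
```

Everything after this step (locating the 13F filing, downloading the information table, parsing the XML) is what the API used in this tutorial takes care of.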

Setup

Get your API key from RapidAPI (free tier: 100 requests/month), then:

pip install requests tabulate

Step 1: Search for an Institutional Investor

import requests

RAPIDAPI_KEY = "YOUR_RAPIDAPI_KEY"
BASE_URL = "https://sec-edgar-financial-data-api.p.rapidapi.com"
HEADERS = {
    "x-rapidapi-host": "sec-edgar-financial-data-api.p.rapidapi.com",
    "x-rapidapi-key": RAPIDAPI_KEY,
}

def search_company(query):
    resp = requests.get(
        f"{BASE_URL}/companies/search",
        params={"query": query},
        headers=HEADERS,
        timeout=10,  # fail fast instead of hanging on a stalled connection
    )
    resp.raise_for_status()
    results = resp.json()
    for company in results[:5]:
        print(f"{company['name']} (CIK: {company['cik']})")
    return results

companies = search_company("Berkshire Hathaway")
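A name search can return several similar filers, and the one you want is not always first. Here is a minimal sketch of picking a CIK from the results, assuming each result is a dict with the `name` and `cik` keys used above. `pick_cik` is a hypothetical helper, and the second sample record is made up for illustration:

```python
def pick_cik(results, wanted_name):
    """Return the CIK of the best name match, or None if nothing matches.

    An exact (case-insensitive) match wins; otherwise the first result
    whose name contains the query string is used.
    """
    wanted = wanted_name.strip().lower()
    partial = None
    for company in results:
        name = str(company.get("name", "")).strip().lower()
        if name == wanted:
            return str(company["cik"])
        if partial is None and wanted in name:
            partial = str(company["cik"])
    return partial


# Sample data shaped like the search response; the first record is made up
sample = [
    {"name": "Berkshire Hills Example Corp", "cik": "9999999"},
    {"name": "BERKSHIRE HATHAWAY INC", "cik": "1067983"},
]
print(pick_cik(sample, "Berkshire Hathaway Inc"))
```

Returning `None` instead of raising leaves it to the caller to decide whether a missing fund should stop the whole run.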

Step 2: Get 13F Holdings

from tabulate import tabulate

def get_holdings(cik):
    resp = requests.get(
        f"{BASE_URL}/companies/{cik}/holdings",
        headers=HEADERS,
        timeout=10,  # fail fast instead of hanging on a stalled connection
    )
    resp.raise_for_status()
    data = resp.json()

    holdings = data.get("holdings", [])
    # Assumes `value` is in whole dollars; check the API's units before trusting the totals
    total_value = sum(h.get("value", 0) for h in holdings)

    print(f"\nTotal Portfolio Value: ${total_value / 1e9:.1f}B")
    print(f"Number of Positions: {len(holdings)}\n")

    # Top 10 by value
    top = sorted(holdings, key=lambda h: h.get("value", 0), reverse=True)[:10]
    table = []
    for h in top:
        name = h.get("nameOfIssuer", "Unknown")
        value = h.get("value", 0)
        pct = (value / total_value * 100) if total_value else 0
        shares = h.get("sharesOrPrincipalAmount", 0)
        table.append([name, f"${value / 1e9:.1f}B", f"{pct:.1f}%", f"{shares:,}"])

    print(tabulate(table, headers=["Company", "Value", "% Portfolio", "Shares"]))
    return holdings

# Berkshire Hathaway's CIK
holdings = get_holdings("1067983")

Output (first five rows shown):

Total Portfolio Value: $267.4B
Number of Positions: 42

Company              Value     % Portfolio    Shares
-------------------  --------  -------------  -----------
Apple Inc.           $91.2B    34.1%          400,000,000
Bank of America      $29.5B    11.0%          680,233,587
American Express     $26.8B    10.0%          151,610,700
Coca-Cola            $23.6B    8.8%           400,000,000
Chevron              $17.4B    6.5%           118,610,534
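A 13F information table can list the same issuer on more than one row, for example when shares are reported with different co-managers, so a ranking built from raw rows may split one position across several lines. Here is a sketch of merging rows before ranking, assuming the `nameOfIssuer`, `value`, and `sharesOrPrincipalAmount` keys used above. `merge_by_issuer` is a hypothetical helper, and the sample rows are made up:

```python
from collections import defaultdict


def merge_by_issuer(holdings):
    """Sum value and share counts for rows that share an issuer name."""
    totals = defaultdict(lambda: {"value": 0, "sharesOrPrincipalAmount": 0})
    for h in holdings:
        issuer = h.get("nameOfIssuer", "Unknown")
        totals[issuer]["value"] += h.get("value", 0)
        totals[issuer]["sharesOrPrincipalAmount"] += h.get("sharesOrPrincipalAmount", 0)
    merged = [{"nameOfIssuer": issuer, **fields} for issuer, fields in totals.items()]
    # Largest position first
    return sorted(merged, key=lambda h: h["value"], reverse=True)


# Made-up rows for illustration only
rows = [
    {"nameOfIssuer": "ALPHA CORP", "value": 300, "sharesOrPrincipalAmount": 30},
    {"nameOfIssuer": "BETA INC", "value": 500, "sharesOrPrincipalAmount": 10},
    {"nameOfIssuer": "ALPHA CORP", "value": 400, "sharesOrPrincipalAmount": 40},
]
merged_rows = merge_by_issuer(rows)
for row in merged_rows:
    print(row["nameOfIssuer"], row["value"], row["sharesOrPrincipalAmount"])
```

Whether the API already merges such rows is not something this tutorial confirms, so it is worth checking the position count against the raw filing once.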

Step 3: Track Multiple Funds

# Well-known institutional investors
FUNDS = {
    "Berkshire Hathaway": "1067983",
    "Bridgewater Associates": "1350694",
    "Renaissance Technologies": "1037389",
    "Citadel Advisors": "1423053",
    "Two Sigma": "1179392",
}

def compare_funds():
    print("=" * 60)
    print("  INSTITUTIONAL HOLDINGS COMPARISON")
    print("=" * 60)

    for name, cik in FUNDS.items():
        print(f"\n{chr(9472) * 40}")
        print(f"  {name}")
        print(f"{chr(9472) * 40}")
        try:
            get_holdings(cik)
        except Exception as e:
            print(f"  Error: {e}")

compare_funds()
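With five funds, every run of `compare_funds()` spends five of the free tier's 100 monthly requests, and 13F data only changes once a quarter. One way to stretch the quota is to cache each response on disk. A minimal sketch, assuming a caller-supplied `fetch` function that returns JSON-serializable data; `cached_fetch`, the file naming, and the one-day lifetime are illustrative choices, not part of the API:

```python
import json
import tempfile
import time
from pathlib import Path


def cached_fetch(cik, fetch, cache_dir, max_age_seconds=86400):
    """Return fetch(cik), reusing a JSON file on disk while it is fresh."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"holdings_{cik}.json"
    if path.exists() and time.time() - path.stat().st_mtime < max_age_seconds:
        return json.loads(path.read_text())
    data = fetch(cik)  # only hit the API on a cache miss
    path.write_text(json.dumps(data))
    return data


# Stand-in for the real API call so the sketch runs offline
calls = []


def fake_fetch(cik):
    calls.append(cik)
    return {"holdings": [{"nameOfIssuer": "ALPHA CORP", "value": 100}]}


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("1067983", fake_fetch, tmp)
    second = cached_fetch("1067983", fake_fetch, tmp)

print(len(calls))  # the second call is served from disk
```

In the tracker, `fetch` would be a small function that performs the `requests.get` call from Step 2 and returns `resp.json()`.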

Step 4: Find Consensus Picks

One of the most interesting analyses: which stocks are held by several top funds at once?

# Reuses requests, BASE_URL, HEADERS, and FUNDS from the earlier steps

def find_consensus_picks(fund_ciks, min_funds=3):
    all_holdings = {}

    for name, cik in fund_ciks.items():
        try:
            resp = requests.get(
                f"{BASE_URL}/companies/{cik}/holdings",
                headers=HEADERS,
                timeout=10,  # fail fast instead of hanging on a stalled connection
            )
            resp.raise_for_status()
            data = resp.json()
            for h in data.get("holdings", []):
                # nameOfIssuer is the issuer's name as filed, not a ticker symbol
                issuer = h.get("nameOfIssuer", "Unknown")
                if issuer not in all_holdings:
                    all_holdings[issuer] = []
                all_holdings[issuer].append({
                    "fund": name,
                    "value": h.get("value", 0),
                    "shares": h.get("sharesOrPrincipalAmount", 0),
                })
        except Exception as e:
            print(f"Error fetching {name}: {e}")

    # Filter for stocks held by multiple funds
    consensus = {
        k: v for k, v in all_holdings.items()
        if len(v) >= min_funds
    }

    print(f"\n{'=' * 60}")
    print(f"  CONSENSUS PICKS (held by {min_funds}+ funds)")
    print(f"{'=' * 60}\n")

    for stock, funds in sorted(
        consensus.items(),
        key=lambda x: len(x[1]),
        reverse=True,
    ):
        total = sum(f["value"] for f in funds)
        fund_names = ", ".join(f["fund"] for f in funds)
        print(f"  {stock}")
        print(f"    Held by {len(funds)} funds | Total value: ${total / 1e9:.1f}B")
        print(f"    Funds: {fund_names}\n")

    return consensus

consensus = find_consensus_picks(FUNDS, min_funds=2)
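Grouping by `nameOfIssuer` only works when every filer spells the issuer the same way, and filers often differ in case, punctuation, and corporate suffixes. Here is a rough sketch of a normalizing key. The suffix list is hand-picked and will miss cases, and `issuer_key` is a hypothetical helper. If the API exposes a CUSIP for each holding (not confirmed here), that would be a more reliable key:

```python
import re

# Hand-picked suffixes to drop; extend as needed
SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "plc", "del", "new"}


def issuer_key(name):
    """Lowercase, strip punctuation, and drop trailing corporate suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


print(issuer_key("APPLE INC"))
print(issuer_key("Apple Inc."))
print(issuer_key("Coca-Cola Co"))
```

Using `issuer_key(h.get("nameOfIssuer", "Unknown"))` as the dictionary key in `find_consensus_picks` would group these variants together. Abbreviated names would still need a manual mapping.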

What You Can Build From Here

This foundation enables several interesting applications:

Quarter-over-quarter tracking: diff holdings between filings to see what funds added, reduced, or exited
Alert system: get notified when a specific fund opens a new position or exits one
Sector analysis: aggregate holdings by sector to see where institutional money is flowing
Correlation with price: check if stock prices move after 13F disclosures (they often do)
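The quarter-over-quarter idea reduces to comparing two snapshots of the same fund. A minimal sketch, assuming each snapshot is a list of holdings with the `nameOfIssuer` and `sharesOrPrincipalAmount` keys used above; how to request an earlier quarter from the API is not covered here, `diff_holdings` is a hypothetical helper, and the sample data is made up:

```python
def diff_holdings(previous, current):
    """Classify share-count changes between two holdings snapshots."""

    def shares_by_issuer(holdings):
        totals = {}
        for h in holdings:
            issuer = h.get("nameOfIssuer", "Unknown")
            totals[issuer] = totals.get(issuer, 0) + h.get("sharesOrPrincipalAmount", 0)
        return totals

    old, new = shares_by_issuer(previous), shares_by_issuer(current)
    changes = {"new": [], "exited": [], "added": [], "reduced": []}
    for issuer in sorted(set(old) | set(new)):
        before, after = old.get(issuer, 0), new.get(issuer, 0)
        if before == 0 and after > 0:
            changes["new"].append(issuer)
        elif before > 0 and after == 0:
            changes["exited"].append(issuer)
        elif after > before:
            changes["added"].append(issuer)
        elif after < before:
            changes["reduced"].append(issuer)
    return changes


# Made-up snapshots for two consecutive quarters
q1 = [
    {"nameOfIssuer": "ALPHA CORP", "sharesOrPrincipalAmount": 100},
    {"nameOfIssuer": "BETA INC", "sharesOrPrincipalAmount": 50},
    {"nameOfIssuer": "GAMMA LTD", "sharesOrPrincipalAmount": 10},
]
q2 = [
    {"nameOfIssuer": "ALPHA CORP", "sharesOrPrincipalAmount": 150},
    {"nameOfIssuer": "BETA INC", "sharesOrPrincipalAmount": 20},
    {"nameOfIssuer": "DELTA CO", "sharesOrPrincipalAmount": 5},
]
changes = diff_holdings(q1, q2)
print(changes)
```

Comparing share counts rather than dollar values keeps price moves from showing up as buying or selling. The same output could feed the alert idea by notifying on any non-empty `new` or `exited` list.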

API Reference

The SEC EDGAR Financial Data API provides these endpoints:

Endpoint	Description
`/companies/search`	Search SEC filers by name
`/companies/{cik}/holdings`	Get 13F institutional holdings
`/companies/{cik}/filings`	Get filing history (10-K, 10-Q, 8-K)

Free tier: 100 requests/month. Pro: $19/month for 5,000 requests.

Full Python wrapper on GitHub: edgar-python

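Not financial advice. 13F data is delayed (filings are due up to 45 days after quarter end) and covers only long positions in SEC-designated securities: mostly US-listed equities, along with certain options and convertible debt. It doesn't show short positions, cash, or most fixed income.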
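To make the delay concrete, the latest filing date is 45 calendar days after the quarter ends. A small sketch of that arithmetic; `filing_deadline` is a hypothetical helper, and it ignores the shift to the next business day when the date lands on a weekend or holiday:

```python
from datetime import date, timedelta


def filing_deadline(quarter_end):
    """Return the date 45 calendar days after the quarter end."""
    return quarter_end + timedelta(days=45)


print(filing_deadline(date(2024, 12, 31)))  # 2025-02-14
```

So holdings as of December 31 may not be visible until mid-February, by which time the fund may already have traded out of them.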

What institutional data do you track? Drop a comment — I'm curious what other data sources people are combining with 13F data.
