Every quarter, institutional investment managers holding at least $100 million in qualifying U.S.-listed securities must file Form 13F with the SEC. The filing lists each reportable long position: issuer, share count, and market value.
Berkshire Hathaway, Citadel, Renaissance Technologies, Bridgewater — they all file. And it's all public data.
The idea behind "cloning the fund" is simple: pick a fund you admire, see what they're buying and selling each quarter, and mirror their moves in your own portfolio. Whether or not you actually trade on it, tracking institutional movements is a useful way to see where large, well-resourced investors are putting their money.
This tutorial shows you how to build a fund cloning tracker in Python.
The Problem with Raw SEC Data
The SEC's EDGAR system makes 13F filings available as XML. But the data is painful to work with:
- Company names aren't normalized ("APPLE INC" vs "Apple Inc." vs "APPLE COMPUTER INC")
- Values were reported in thousands of dollars in older filings and in whole dollars since 2023, so the unit depends on the filing date
- No ticker symbols — just company names and CUSIPs
- Formats vary across 20+ years of filings
Parsing this manually is a weekend project you don't want.
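To make those pain points concrete, here is a minimal sketch of parsing a raw 13F information table with the standard library. The sample XML is made up for illustration (only the element names and namespace follow the EDGAR information-table layout), and `normalize_name`, `parse_info_table`, and the `value_multiplier` parameter are hypothetical helpers, not part of any API. It groups rows by CUSIP so that name variants collapse into one position, and it converts reported values to dollars under the assumption that the filing uses thousands.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by 13F information-table XML documents on EDGAR.
NS = {"t": "http://www.sec.gov/edgar/document/thirteenf/informationtable"}

# Made-up two-row sample: the same issuer under two spellings of its name.
SAMPLE_XML = """<informationTable xmlns="http://www.sec.gov/edgar/document/thirteenf/informationtable">
  <infoTable>
    <nameOfIssuer>APPLE INC</nameOfIssuer>
    <cusip>037833100</cusip>
    <value>1500</value>
    <shrsOrPrnAmt><sshPrnamt>10000</sshPrnamt><sshPrnamtType>SH</sshPrnamtType></shrsOrPrnAmt>
  </infoTable>
  <infoTable>
    <nameOfIssuer>Apple Inc.</nameOfIssuer>
    <cusip>037833100</cusip>
    <value>300</value>
    <shrsOrPrnAmt><sshPrnamt>2000</sshPrnamt><sshPrnamtType>SH</sshPrnamtType></shrsOrPrnAmt>
  </infoTable>
</informationTable>"""


def normalize_name(name):
    """Uppercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", name.upper())
    return " ".join(cleaned.split())


def parse_info_table(xml_text, value_multiplier=1000):
    """Aggregate rows by CUSIP.

    value_multiplier converts the reported value to dollars: 1000 assumes
    a filing reported in thousands; pass 1 if values are already dollars.
    """
    positions = {}
    root = ET.fromstring(xml_text)
    for row in root.findall("t:infoTable", NS):
        cusip = row.findtext("t:cusip", default="", namespaces=NS).strip()
        name = normalize_name(row.findtext("t:nameOfIssuer", default="", namespaces=NS))
        value = int(row.findtext("t:value", default="0", namespaces=NS))
        shares = int(row.findtext("t:shrsOrPrnAmt/t:sshPrnamt", default="0", namespaces=NS))
        # The first spelling seen for a CUSIP becomes the position's name.
        pos = positions.setdefault(cusip, {"name": name, "shares": 0, "value": 0})
        pos["shares"] += shares
        pos["value"] += value * value_multiplier
    return positions


print(parse_info_table(SAMPLE_XML))
```

Even this toy version leaves the hard parts open: it does not map CUSIPs to tickers, and it has to be told which unit the filing uses.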
The Clean Way: SEC EDGAR Financial Data API
The SEC EDGAR Financial Data API handles all the parsing and normalization. You get back clean JSON with resolved tickers, actual dollar values, and quarter-over-quarter diffs.
import requests

API_KEY = "your_rapidapi_key"
API_HOST = "sec-edgar-financial-data-api.p.rapidapi.com"
HEADERS = {"x-rapidapi-key": API_KEY, "x-rapidapi-host": API_HOST}


def get_13f_holdings(cik):
    """Get the latest 13F holdings for a fund."""
    response = requests.get(
        f"https://{API_HOST}/institutional-holdings/13f",
        headers=HEADERS,
        params={"cik": cik},
        timeout=30,  # fail instead of hanging on a stalled connection
    )
    response.raise_for_status()
    return response.json()


# Berkshire Hathaway's CIK
holdings = get_13f_holdings("0001067983")
for h in holdings["data"][:10]:
    print(f"{h['ticker']:6} | {h['shares']:>12,} shares | ${h['value']:>14,.0f}")
Output:
AAPL   |  915,560,382 shares | $348,298,945,000
BAC    |  680,233,587 shares | $30,082,334,000
AXP    |  151,610,700 shares | $44,898,553,000
KO     |  400,000,000 shares | $28,440,000,000
CVX    |  118,610,534 shares | $18,498,067,000
OXY    |  264,274,424 shares | $13,216,577,000
KHC    |  325,634,818 shares | $11,303,533,000
MCO    |   24,669,778 shares | $12,082,179,000
CB     |   27,033,784 shares | $ 8,204,859,000
DVA    |   36,095,570 shares | $ 6,195,197,000
Step 1: Pick Your Funds to Track
Some popular funds to clone:
| Fund | CIK | Known For |
|---|---|---|
| Berkshire Hathaway | 0001067983 | Warren Buffett's value investing |
| Bridgewater Associates | 0001350694 | Ray Dalio's macro strategy |
| Renaissance Technologies | 0001037389 | Jim Simons' quant approach |
| Citadel Advisors | 0001423053 | Ken Griffin's multi-strategy |
| Appaloosa Management | 0001656456 | David Tepper's event-driven |
| Pershing Square | 0001336528 | Bill Ackman's concentrated bets |
FUNDS = {
    "Berkshire Hathaway": "0001067983",
    "Bridgewater Associates": "0001350694",
    "Renaissance Technologies": "0001037389",
    "Citadel Advisors": "0001423053",
    "Pershing Square": "0001336528",
}
Step 2: Build the Holdings Fetcher
import time


def fetch_all_funds(funds):
    """Fetch latest 13F holdings for all tracked funds."""
    all_holdings = {}
    for name, cik in funds.items():
        print(f"Fetching {name}...")
        try:
            data = get_13f_holdings(cik)
            all_holdings[name] = data.get("data", [])
        except Exception as e:
            print(f"  Error: {e}")
            all_holdings[name] = []
        time.sleep(1)  # respect rate limits
    return all_holdings


all_holdings = fetch_all_funds(FUNDS)

# Quick summary
for fund, holdings in all_holdings.items():
    total = sum(h.get("value", 0) for h in holdings)
    print(f"{fund}: {len(holdings)} positions, ${total:,.0f} total value")
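Since 13F data only changes once a quarter and API quotas can be tight, it may be worth caching responses on disk so that re-running the script does not spend another request. The following is a minimal sketch under that assumption. `cached_fetch`, the `13f_cache` directory name, and the 30-day freshness window are all hypothetical choices, not something the API or its client provides.

```python
import json
import time
from pathlib import Path


def cached_fetch(cik, fetch, cache_dir="13f_cache", max_age_days=30):
    """Return cached JSON for a CIK if it is fresh, else call fetch(cik) and store it.

    fetch is any function that takes a CIK and returns JSON-serializable data.
    """
    path = Path(cache_dir) / f"{cik}.json"
    if path.exists():
        age_days = (time.time() - path.stat().st_mtime) / 86400
        if age_days < max_age_days:
            return json.loads(path.read_text())
    data = fetch(cik)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data))
    return data
```

With the fetcher from earlier, the call inside `fetch_all_funds` would become `cached_fetch(cik, get_13f_holdings)`. A failed request raises before anything is written, so errors are never cached.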
Step 3: Find the "Consensus Picks"
The most interesting signal is when multiple smart-money funds hold the same stock. If Berkshire, Bridgewater, AND Renaissance all own it — that's a crowded conviction bet.
from collections import Counter, defaultdict


def find_consensus_picks(all_holdings, min_funds=3):
    """Find stocks held by multiple funds."""
    stock_holders = defaultdict(list)
    for fund_name, holdings in all_holdings.items():
        for h in holdings:
            ticker = h.get("ticker", "UNKNOWN")
            stock_holders[ticker].append({
                "fund": fund_name,
                "shares": h.get("shares", 0),
                "value": h.get("value", 0),
            })
    # Keep tickers with at least min_funds holders, skipping unresolved ones
    consensus = {
        ticker: holders
        for ticker, holders in stock_holders.items()
        if len(holders) >= min_funds and ticker != "UNKNOWN"
    }
    # Most widely held first
    return dict(sorted(
        consensus.items(), key=lambda x: len(x[1]), reverse=True
    ))


consensus = find_consensus_picks(all_holdings, min_funds=2)
print("\nStocks held by 2+ funds:")
print(f"{'Ticker':<8} {'Funds':>5} {'Total Value':>15} Held By")
print("-" * 70)
for ticker, holders in list(consensus.items())[:20]:
    total_val = sum(h["value"] for h in holders)
    fund_names = ", ".join(h["fund"].split()[0] for h in holders)
    print(f"{ticker:<8} {len(holders):>5} ${total_val:>14,.0f} {fund_names}")
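Counting holders treats a 0.1% position the same as a 30% one. One possible refinement, sketched below, is to rank shared holdings by the average share of each holder's reported portfolio. `consensus_by_weight` is a hypothetical helper, the toy data is invented, and it assumes the same `ticker` and `value` fields used above.

```python
from collections import defaultdict


def consensus_by_weight(all_holdings, min_funds=2):
    """Rank shared holdings by average portfolio weight among the funds holding them."""
    weights = defaultdict(dict)  # ticker -> {fund name: portfolio weight}
    for fund, holdings in all_holdings.items():
        total = sum(h.get("value", 0) for h in holdings)
        if total == 0:
            continue  # empty or failed fetch
        for h in holdings:
            ticker = h.get("ticker")
            if ticker:
                # Add up rows in case a fund reports one ticker on several lines.
                weights[ticker][fund] = weights[ticker].get(fund, 0) + h.get("value", 0) / total
    ranked = [
        (ticker, len(by_fund), sum(by_fund.values()) / len(by_fund))
        for ticker, by_fund in weights.items()
        if len(by_fund) >= min_funds
    ]
    return sorted(ranked, key=lambda r: r[2], reverse=True)


# Invented toy data: only AAA is held by both funds.
toy = {
    "Fund A": [{"ticker": "AAA", "value": 600}, {"ticker": "BBB", "value": 400}],
    "Fund B": [{"ticker": "AAA", "value": 100}, {"ticker": "CCC", "value": 900}],
}
for ticker, n_funds, avg_weight in consensus_by_weight(toy):
    print(f"{ticker}: {n_funds} funds, average weight {avg_weight:.1%}")
```

Because weights are keyed by fund, a fund that lists the same ticker on several rows still counts as one holder here.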
Step 4: Track Quarter-Over-Quarter Changes
This is where it gets really interesting — not just what funds hold, but what they're buying and selling.
def get_fund_changes(cik):
    """Get quarter-over-quarter changes for a fund."""
    response = requests.get(
        f"https://{API_HOST}/institutional-holdings/13f",
        headers=HEADERS,
        params={"cik": cik, "include_changes": "true"},
        timeout=30,  # fail instead of hanging on a stalled connection
    )
    response.raise_for_status()
    return response.json()


def analyze_moves(data):
    """Categorize into new buys, increases, decreases, exits."""
    new_buys, increases, decreases, exits = [], [], [], []
    for h in data.get("data", []):
        change = h.get("change_shares", 0)
        prev = h.get("prev_shares", 0)
        if prev == 0 and change > 0:
            new_buys.append(h)
        elif change > 0:
            increases.append(h)
        elif h.get("shares", 0) == 0 and prev > 0:
            exits.append(h)
        elif change < 0:
            decreases.append(h)
    return {
        "new_buys": sorted(new_buys, key=lambda x: x.get("value", 0), reverse=True),
        "increases": sorted(increases, key=lambda x: abs(x.get("change_shares", 0)), reverse=True),
        "decreases": sorted(decreases, key=lambda x: abs(x.get("change_shares", 0)), reverse=True),
        "exits": exits,
    }


# Check what Berkshire is doing
berkshire_data = get_fund_changes("0001067983")
moves = analyze_moves(berkshire_data)

print("NEW POSITIONS:")
for h in moves["new_buys"][:5]:
    print(f"  {h['ticker']}: {h['shares']:,} shares (${h['value']:,.0f})")

print("\nINCREASED:")
for h in moves["increases"][:5]:
    # max(..., 1) guards against division by zero
    pct = (h["change_shares"] / max(h.get("prev_shares", 1), 1)) * 100
    print(f"  {h['ticker']}: +{h['change_shares']:,} shares (+{pct:.1f}%)")

print("\nDECREASED:")
for h in moves["decreases"][:5]:
    pct = (abs(h["change_shares"]) / max(h.get("prev_shares", 1), 1)) * 100
    print(f"  {h['ticker']}: {h['change_shares']:,} shares (-{pct:.1f}%)")

print("\nEXITED:")
for h in moves["exits"][:5]:
    print(f"  {h['ticker']}: sold all {h.get('prev_shares', 0):,} shares")
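If you only have two plain holdings snapshots rather than precomputed change fields, the same categories can be derived by diffing them yourself. This is a rough sketch with invented data. `diff_quarters` is a hypothetical helper that emits rows with the `shares`, `prev_shares`, and `change_shares` keys that `analyze_moves` reads.

```python
def diff_quarters(prev_holdings, curr_holdings):
    """Build per-ticker rows with shares, prev_shares and change_shares."""
    prev = {h["ticker"]: h.get("shares", 0) for h in prev_holdings}
    curr = {h["ticker"]: h.get("shares", 0) for h in curr_holdings}
    rows = []
    # Union of tickers, so new positions and full exits both show up.
    for ticker in sorted(prev.keys() | curr.keys()):
        before, after = prev.get(ticker, 0), curr.get(ticker, 0)
        rows.append({
            "ticker": ticker,
            "shares": after,
            "prev_shares": before,
            "change_shares": after - before,
        })
    return rows


# Invented snapshots: AAA increased, BBB exited, CCC is new.
last_quarter = [{"ticker": "AAA", "shares": 100}, {"ticker": "BBB", "shares": 50}]
this_quarter = [{"ticker": "AAA", "shares": 150}, {"ticker": "CCC", "shares": 20}]

for row in diff_quarters(last_quarter, this_quarter):
    print(row)
```

Wrapping the result as `{"data": rows}` gives it the shape `analyze_moves` expects. The rows carry no `value` field, so new buys would all sort with a value of 0 unless you add it.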
Step 5: Build Your Clone Portfolio
Now combine everything into a simple portfolio allocation based on a fund's top holdings:
def build_clone_portfolio(holdings, top_n=10, budget=10000):
    """Create a simplified clone portfolio from top holdings."""
    top = sorted(
        holdings, key=lambda x: x.get("value", 0), reverse=True
    )[:top_n]
    total_value = sum(h.get("value", 0) for h in top)
    if total_value == 0:
        return []
    return [
        {
            "ticker": h.get("ticker", "?"),
            "weight": h["value"] / total_value,
            "allocation": budget * (h["value"] / total_value),
        }
        for h in top
    ]


# Clone Berkshire's top 10
clone = build_clone_portfolio(
    all_holdings.get("Berkshire Hathaway", []),
    top_n=10, budget=10000
)

print(f"\n{'Ticker':<8} {'Weight':>8} {'Allocation':>12}")
print("-" * 30)
for p in clone:
    print(f"{p['ticker']:<8} {p['weight']:>7.1%} ${p['allocation']:>10,.2f}")
print("-" * 30)
print(f"{'TOTAL':<8} {'100.0%':>8} ${sum(p['allocation'] for p in clone):>10,.2f}")
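Dollar allocations still have to become share counts. If your broker does not support fractional shares, a small budget can drift noticeably from the target weights. Here is a minimal sketch of that rounding step. `to_whole_shares` is a hypothetical helper and the prices are invented; real prices would have to come from a market data source, which this tutorial does not cover.

```python
import math


def to_whole_shares(portfolio, prices):
    """Round each dollar allocation down to whole shares.

    Returns (orders, cash_left). Tickers with no known price are skipped
    and their allocation stays in cash.
    """
    orders, cash_left = [], 0.0
    for p in portfolio:
        price = prices.get(p["ticker"])
        if not price:
            cash_left += p["allocation"]
            continue
        qty = math.floor(p["allocation"] / price)
        cost = qty * price
        orders.append({"ticker": p["ticker"], "shares": qty, "cost": cost})
        cash_left += p["allocation"] - cost
    return orders, cash_left


# Invented allocations and prices for illustration only.
toy_portfolio = [
    {"ticker": "AAA", "allocation": 6000.0},
    {"ticker": "BBB", "allocation": 4000.0},
]
toy_prices = {"AAA": 250.0, "BBB": 1500.0}

orders, cash_left = to_whole_shares(toy_portfolio, toy_prices)
for o in orders:
    print(f"{o['ticker']}: buy {o['shares']} shares for ${o['cost']:,.2f}")
print(f"Unallocated cash: ${cash_left:,.2f}")
```

In this toy case the high-priced stock absorbs only $3,000 of its $4,000 target, so a quarter of that position is left in cash.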
Bonus: Crowded Trade Detector
When too many funds pile into the same stock, it can signal either strong conviction OR a crowded trade vulnerable to a selloff.
def crowded_trade_alert(all_holdings, threshold=4):
    """Flag stocks held by too many institutional portfolios."""
    stock_count = Counter()
    stock_value = defaultdict(float)
    for fund, holdings in all_holdings.items():
        for h in holdings:
            ticker = h.get("ticker", "UNKNOWN")
            if ticker != "UNKNOWN":
                stock_count[ticker] += 1
                stock_value[ticker] += h.get("value", 0)
    crowded = [
        (ticker, count, stock_value[ticker])
        for ticker, count in stock_count.items()
        if count >= threshold
    ]
    return sorted(crowded, key=lambda x: x[1], reverse=True)


alerts = crowded_trade_alert(all_holdings, threshold=3)
if alerts:
    print("CROWDED TRADE ALERTS:")
    for ticker, count, value in alerts[:10]:
        print(f"  {ticker}: held by {count} funds (${value:,.0f} combined)")
Important Caveats
Before you go clone Warren Buffett's portfolio:
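- 13F filings are delayed. They're due within 45 days of quarter end, so the snapshot is already weeks old when it becomes public. By the time you see it, the fund may have already changed positions.
- 13F covers only long positions in Section 13(f) securities: mainly U.S.-listed stocks and ETFs, plus certain listed options and convertible bonds. Short positions, most fixed income, and many derivatives never appear. You're seeing one slice of a fund's total strategy.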
- Context matters. A position might be a hedge, not a conviction bet. Bridgewater's positions often serve macro hedging purposes.
- Size differences. Berkshire can hold a $50B Apple position because they manage $350B+. Scaling their allocation to a $10k portfolio changes the risk profile.
That said, tracking institutional moves is genuinely useful for idea generation, understanding market sentiment, and spotting emerging trends before they hit the headlines.
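To put a number on the delay caveat, the sketch below computes the nominal filing deadline for a quarter and how old a quarter-end snapshot is on a given day. `filing_deadline` and `snapshot_age_days` are hypothetical helpers. The calculation is simply quarter end plus 45 calendar days and ignores any adjustment for weekends or holidays.

```python
from datetime import date, timedelta

# Calendar quarter ends as (month, day).
QUARTER_ENDS = {1: (3, 31), 2: (6, 30), 3: (9, 30), 4: (12, 31)}


def quarter_end(year, quarter):
    """Last calendar day of the given quarter."""
    month, day = QUARTER_ENDS[quarter]
    return date(year, month, day)


def filing_deadline(year, quarter):
    """Nominal 13F deadline: 45 calendar days after quarter end."""
    return quarter_end(year, quarter) + timedelta(days=45)


def snapshot_age_days(year, quarter, today):
    """How many days old the quarter-end snapshot is on the given date."""
    return (today - quarter_end(year, quarter)).days


print(filing_deadline(2024, 4))
print(snapshot_age_days(2024, 4, date(2025, 3, 1)))
```

A position you read about two weeks after the deadline reflects the fund's book as it stood roughly two months earlier.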
Getting Started
pip install edgar-client
The Python client simplifies everything:
from edgar_client import EdgarClient
client = EdgarClient(api_key="your_key")
holdings = client.get_13f_holdings(cik="0001067983")
Resources:
- API on RapidAPI (free tier: 50 requests/month)
- Python client on GitHub
- Full SEC 13F tutorial
If you build something interesting with the 13F data — portfolio trackers, Telegram bots, dashboards — share it in the comments. I'd love to see what people create with institutional holdings data.