How I built a self-hosted AI financial analyst with Plaid, Claude, and Next.js

I'm Mayank — a cloud and data engineer at Snowflake with a background in network security. I built this as a personal project, and everything here is from scratch, self-hosted, and open source.


I got tired of checking five different brokerage apps with no unified view and no real analysis. So I built a self-hosted system that connects to all my accounts, enriches every position with market data, and runs it through Claude for buy/sell/hold recommendations — automatically, three times a week.

Here's how it works and what I learned building it.


Architecture

Plaid API
    ↓
Python pipeline (agent/)
    ├── fetch holdings across all brokerages
    ├── enrich positions — RSI, MACD, Bollinger Bands, fundamentals, news
    ├── Claude analysis — structured buy/sell/hold per position
    └── sync to Supabase
              ↓
         Supabase (Postgres)
              ↓
      Next.js Dashboard

The system is split into two components in one monorepo: a Python backend that runs on a schedule, and a Next.js dashboard that reads from Supabase.


Part 1: The Python Pipeline

Connecting accounts with Plaid

Plaid's investment API gives you unified holdings across brokerages — one access token per institution, one schema for everything. The Development tier is free for up to 100 connections, which is more than enough for personal use.

# plaid_config.py
import plaid
from plaid.api import plaid_api

def get_plaid_client() -> plaid_api.PlaidApi:
    configuration = plaid.Configuration(
        host=plaid.Environment.Development,
        api_key={
            "clientId": PLAID_CLIENT_ID,
            "secret": PLAID_SECRET,
        },
    )
    return plaid_api.PlaidApi(plaid.ApiClient(configuration))

Connecting a new brokerage is a browser-based flow via Plaid Link. I built a lightweight Flask server that serves the Link UI locally, and a connect_real_account.py script that launches it:

# connect_real_account.py (simplified)
@app.route("/create_link_token", methods=["POST"])
def create_link_token():
    response = client.link_token_create({
        "user": {"client_user_id": "local-user"},
        "client_name": "Financial Analyst",
        "products": ["investments"],
        "country_codes": ["US"],
        "language": "en",
    })
    return jsonify({"link_token": response["link_token"]})

You open http://localhost:5555, click Connect, Plaid handles the brokerage login, and the access token is saved locally. Tokens are encrypted at rest using PBKDF2+Fernet — the salt is stored separately from the ciphertext so neither file alone is useful.

# token_store.py (simplified)
import base64, json, os
from pathlib import Path
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def save_tokens(tokens: dict) -> None:
    salt = os.urandom(16)
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=480000)
    key = base64.urlsafe_b64encode(kdf.derive(SECRET.encode()))
    encrypted = Fernet(key).encrypt(json.dumps(tokens).encode())
    Path(SALT_FILE).write_bytes(salt)
    Path(TOKEN_FILE).write_bytes(encrypted)

Fetching holdings

Once accounts are connected, fetching holdings is straightforward. The pipeline normalizes across brokerages into a consistent schema:

holdings_response = client.investments_holdings_get({
    "access_token": token,
    "options": {"account_ids": account_ids}
})

for holding in holdings_response["holdings"]:
    security = securities_by_id[holding["security_id"]]
    normalized.append({
        "brokerage": nickname,
        "ticker": security.get("ticker_symbol"),
        "name": security.get("name"),
        "quantity": holding["quantity"],
        "cost_basis": holding.get("cost_basis"),
        "value": holding["institution_value"],
        "account_type": account_type,
    })

One thing worth knowing: Plaid's investment data is previous close, not real-time. That's a Plaid limitation at the Development tier — fine for a 3x/week analysis pipeline, but worth being aware of.
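Since the data lags by at least one close, it can be useful to flag stale snapshots in the dashboard — say, after a few failed scheduled runs. This isn't in the repo; it's a hypothetical sketch:

```python
from datetime import date, timedelta

def is_stale(as_of: date, max_age_days: int = 3) -> bool:
    # Previous-close data is expected; anything older than a few days
    # (e.g. the pipeline silently failing over a long weekend) is flagged
    return (date.today() - as_of) > timedelta(days=max_age_days)
```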

Enriching positions

For every unique ticker, I pull technicals and fundamentals via yfinance and scrape recent news headlines. The enrichment runs in parallel across all tickers:

import yfinance as yf
from pandas_ta import rsi, macd, bbands

def enrich_ticker(ticker: str) -> dict:
    stock = yf.Ticker(ticker)
    hist = stock.history(period="6mo")
    info = stock.info

    close = hist["Close"]
    rsi_val = rsi(close, length=14).iloc[-1]
    macd_df = macd(close)
    bb = bbands(close)

    return {
        "technicals": {
            "price": close.iloc[-1],
            "sma_50": close.rolling(50).mean().iloc[-1],
            "rsi_14": rsi_val,
            "macd": macd_df["MACD_12_26_9"].iloc[-1],
            "bb_upper": bb["BBU_5_2.0"].iloc[-1],
            "bb_lower": bb["BBL_5_2.0"].iloc[-1],
        },
        "fundamentals": {
            "sector": info.get("sector"),
            "pe_trailing": info.get("trailingPE"),
            "analyst_target": info.get("targetMeanPrice"),
        },
        "performance": {
            "return_1m": ((close.iloc[-1] / close.iloc[-21]) - 1) * 100,
            "pct_from_52w_high": ((close.iloc[-1] / close.max()) - 1) * 100,
        },
        "news": [
            {"title": n["title"], "publisher": n["publisher"]}
            for n in stock.news[:5]
        ],
    }
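The post doesn't show the fan-out itself, so here's a minimal sketch of how the parallel enrichment might be wired up with `concurrent.futures` (the function names are illustrative, not repo code). Since yfinance calls are I/O-bound, threads are enough, and one bad ticker shouldn't sink the whole run:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def enrich_all(tickers: list[str], enrich_fn, max_workers: int = 8) -> dict:
    # Submit every ticker, collect results as they finish, and
    # record per-ticker failures instead of raising
    results: dict = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(enrich_fn, t): t for t in tickers}
        for fut in as_completed(futures):
            ticker = futures[fut]
            try:
                results[ticker] = fut.result()
            except Exception as exc:
                results[ticker] = {"error": str(exc)}
    return results
```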

Getting Claude to return structured output reliably

This took the most iteration. The key was being extremely explicit in the system prompt — not just asking for JSON, but defining the exact schema including field names, types, and valid enum values.

The system prompt opens with:

You are a senior portfolio analyst. You will receive a full enriched portfolio
and return a JSON object that EXACTLY matches this schema. Do not include
markdown fences, explanation, or any text outside the JSON object.

Schema:
{
  "overall_health": "strong" | "moderate" | "weak",
  "summary": string,
  "action_items": string[],
  "recommendations": [
    {
      "ticker": string,
      "action": "BUY" | "SELL" | "HOLD",
      "confidence": "high" | "medium" | "low",
      "urgency": "immediate" | "soon" | "no_rush",
      "thesis": string,
      "bull_case": string,
      "bear_case": string,
      "key_signals": string[],
      "risk_factors": string[]
    }
  ]
}
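Enum drift (a lowercase "buy", an invented urgency value) is the most common failure mode with schemas like this, so a cheap post-parse check catches it before anything hits the database. This is my own addition, not code from the repo:

```python
ALLOWED = {
    "action": {"BUY", "SELL", "HOLD"},
    "confidence": {"high", "medium", "low"},
    "urgency": {"immediate", "soon", "no_rush"},
}

def validate_recommendations(recs: list[dict]) -> list[str]:
    # Returns human-readable problems; an empty list means valid
    problems = []
    for rec in recs:
        for field, allowed in ALLOWED.items():
            if rec.get(field) not in allowed:
                problems.append(
                    f"{rec.get('ticker', '?')}: bad {field}={rec.get(field)!r}"
                )
    return problems
```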

Even with that, Claude occasionally wraps the response in markdown fences. I added a cleaning step:

cleaned = raw_response.strip()
if cleaned.startswith("```"):
    cleaned = cleaned.split("\n", 1)[1]
if cleaned.endswith("```"):
    cleaned = cleaned[:-3].strip()
analysis = json.loads(cleaned)

Token numbers from 15 real runs: ~47K input tokens + ~15K output tokens per run. At claude-sonnet-4-6 pricing ($3/M input, $15/M output) that's ~$0.36/run, or ~$5/month at 3x/week.
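The per-run math is simple enough to fold into the pipeline's logging. A sketch, with the rates hard-coded from the pricing quoted above:

```python
INPUT_RATE = 3.00 / 1_000_000    # $ per input token
OUTPUT_RATE = 15.00 / 1_000_000  # $ per output token

def run_cost_usd(input_tokens: int, output_tokens: int) -> float:
    # Linear cost model: tokens in each direction times its rate
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

cost = run_cost_usd(47_000, 15_000)  # 0.141 + 0.225 = 0.366, the ~$0.36/run above
```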

Syncing to Supabase

After each run, the pipeline pushes everything to Supabase: holdings, enrichment data, analysis report, and recommendations. The service role key bypasses RLS, which makes bulk writes clean:

# sync_to_supabase.py (simplified)
def sync_snapshot(snapshot: dict, analysis: dict) -> None:
    run = supabase.table("pipeline_runs").insert({
        "trigger": "scheduled",
        "status": "success",
        "model": analysis["model"],
        "input_tokens": analysis["usage"]["input_tokens"],
        "output_tokens": analysis["usage"]["output_tokens"],
    }).execute()

    run_id = run.data[0]["id"]

    supabase.table("portfolio_snapshots").insert({
        "run_id": run_id,
        "snapshot_date": today,
        "total_value": snapshot["summary"]["total_value"],
        ...
    }).execute()

Scheduling

On macOS, I use launchd rather than cron — it survives reboots and wakes the machine if it's asleep. A template plist is included in the repo:

<!-- scheduler/macos/com.finanalyst.pipeline.plist.template -->
<key>StartCalendarInterval</key>
<array>
  <dict>
    <key>Weekday</key><integer>1</integer>
    <key>Hour</key><integer>7</integer>
    <key>Minute</key><integer>0</integer>
  </dict>
  <!-- Monday; plus matching dicts with Weekday 3 and 5 for Wednesday and Friday -->
</array>

The installer prints the exact sed and launchctl commands to activate it.


Part 2: The Next.js Dashboard

Database schema

Supabase sits between the pipeline and the dashboard. The schema has six tables: pipeline_runs, portfolio_snapshots, holdings, enrichment, analysis_reports, and recommendations. RLS is enabled on all of them — authenticated users get read access, the service role key (pipeline-only) gets write access.

-- Row Level Security
alter table holdings enable row level security;

create policy "Auth users can read holdings"
  on holdings for select to authenticated using (true);

-- service_role bypasses RLS by default — no explicit policy needed

Dashboard features

The dashboard is built with Next.js 14 App Router, Tailwind, Radix UI, and Recharts. Key views:

  • Net worth card — pulls latest snapshot total, broken down by account category (taxable, retirement, cash)
  • Allocation chart — Recharts pie chart from asset_types_json in the snapshot
  • Holdings table — per-position gain/loss, colored by performance
  • AI recommendations — latest analysis report with per-ticker cards showing action, confidence, bull/bear case
  • RSU widget — grant tracking with live price refresh

RSU price refresh

RSU grants are stored in Supabase. When you hit the Recalc button, an API route fetches the current price from Yahoo Finance — no API key needed — and updates the grant value in real time:

// app/api/rsu-grants/recalc/route.ts
async function fetchPrice(ticker: string): Promise<number | null> {
  const url = `https://query1.finance.yahoo.com/v8/finance/chart/${ticker}?interval=1d&range=1d`;
  const res = await fetch(url, {
    headers: { "User-Agent": "Mozilla/5.0" },
    next: { revalidate: 0 },
  });
  const json = await res.json();
  return json?.chart?.result?.[0]?.meta?.regularMarketPrice ?? null;
}

Vested units are computed server-side from the vesting schedule stored as JSONB:

function computeVestedUnits(schedule: RsuVestingEvent[]): number {
  const today = new Date();
  today.setHours(23, 59, 59, 999);
  return schedule
    .filter(e => new Date(e.date) <= today)
    .reduce((sum, e) => sum + e.units, 0);
}

The Installer

Getting the install experience right took more work than I expected. The goal: someone clones the repo and runs one command that handles everything — Python deps, Node deps, and all API key setup — with zero silent failures.

The wizard validates every key live before writing anything to disk, using Python stdlib only (urllib) so it works before any deps are installed:

import json, sys
from urllib.request import Request, urlopen

# Anthropic — hits /v1/models, zero tokens consumed
req = Request(
    "https://api.anthropic.com/v1/models",
    headers={"x-api-key": key, "anthropic-version": "2023-06-01"},
)
with urlopen(req, timeout=8) as r:
    sys.exit(0 if r.status == 200 else 1)

# Plaid — hits /institutions/get with the actual credentials
payload = json.dumps({
    "client_id": client_id, "secret": secret,
    "count": 1, "offset": 0, "country_codes": ["US"],
}).encode()
req = Request(f"{base_url}/institutions/get", data=payload,
              headers={"Content-Type": "application/json"})

A few things I hit during testing:

  • macOS ships bash 3.2, which doesn't support associative arrays (declare -A). Had to replace the status tracking with plain variables and helper functions.
  • Python 3.12 installed via python.org doesn't land in PATH automatically. The installer now searches a list of known locations (/usr/local/bin/python3.12, /Library/Frameworks/...) before giving up.
  • hatchling requires a README.md in the package directory. Moving to a monorepo broke this — fixed by removing the readme field from pyproject.toml.

What I'd Do Differently

  • Add a token estimate before each Claude call. Token count grows with portfolio size. A pre-call estimate would let users catch runaway costs early.
  • Cache yfinance calls within a run. If the same ticker appears across multiple accounts, it gets enriched multiple times. Deduplication before enrichment would save time and API calls.
  • Plaid Development throttles refresh. Holdings data is previous close, not real-time. Worth documenting more prominently for anyone expecting live prices.
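The deduplication mentioned in the second bullet would be a small change — group holdings by ticker before enrichment, enrich each symbol once, then fan the result back out to every account holding it. A sketch, not repo code:

```python
from collections import defaultdict

def group_by_ticker(holdings: list[dict]) -> dict[str, list[dict]]:
    # Positions in different accounts share one enrichment per ticker;
    # entries without a ticker (e.g. cash) are skipped
    grouped: dict[str, list[dict]] = defaultdict(list)
    for h in holdings:
        if h.get("ticker"):
            grouped[h["ticker"]].append(h)
    return dict(grouped)
```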

Not financial advice — treat Claude's output as a starting point for your own research, not a signal to act on.
