How I built a self-hosted AI financial analyst with Plaid, Claude, and Next.js

I'm Mayank — a cloud and data engineer at Snowflake with a background in network security. I built this as a personal project, and everything here is from scratch, self-hosted, and open source.


I got tired of checking five different brokerage apps with no unified view and no real analysis. So I built a self-hosted system that connects to all my accounts, enriches every position with market data, and runs it through Claude for buy/sell/hold recommendations — automatically, three times a week.

Here's how it works and what I learned building it.


Architecture

Plaid API
    ↓
Python pipeline (agent/)
    ├── fetch holdings across all brokerages
    ├── enrich positions — RSI, MACD, Bollinger Bands, fundamentals, news
    ├── Claude analysis — structured buy/sell/hold per position
    └── sync to Supabase
              ↓
         Supabase (Postgres)
              ↓
      Next.js Dashboard

The system is split into two components in one monorepo: a Python backend that runs on a schedule, and a Next.js dashboard that reads from Supabase.


Part 1: The Python Pipeline

Connecting accounts with Plaid

Plaid's investment API gives you unified holdings across brokerages — one access token per institution, one schema for everything. The Development tier is free for up to 100 connections, which is more than enough for personal use.

# plaid_config.py
import plaid
from plaid.api import plaid_api

def get_plaid_client() -> plaid_api.PlaidApi:
    configuration = plaid.Configuration(
        host=plaid.Environment.Development,
        api_key={
            "clientId": PLAID_CLIENT_ID,
            "secret": PLAID_SECRET,
        },
    )
    return plaid_api.PlaidApi(plaid.ApiClient(configuration))

Connecting a new brokerage is a browser-based flow via Plaid Link. I built a lightweight Flask server that serves the Link UI locally, and a connect_real_account.py script that launches it:

# connect_real_account.py (simplified)
@app.route("/create_link_token", methods=["POST"])
def create_link_token():
    response = client.link_token_create({
        "user": {"client_user_id": "local-user"},
        "client_name": "Financial Analyst",
        "products": ["investments"],
        "country_codes": ["US"],
        "language": "en",
    })
    return jsonify({"link_token": response["link_token"]})

You open http://localhost:5555, click Connect, Plaid handles the brokerage login, and the access token is saved locally. Tokens are encrypted at rest using PBKDF2+Fernet — the salt is stored separately from the ciphertext so neither file alone is useful.

# token_store.py (simplified)
import base64, json, os
from pathlib import Path
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def save_tokens(tokens: dict) -> None:
    salt = os.urandom(16)
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=480000)
    key = base64.urlsafe_b64encode(kdf.derive(SECRET.encode()))
    encrypted = Fernet(key).encrypt(json.dumps(tokens).encode())
    Path(SALT_FILE).write_bytes(salt)
    Path(TOKEN_FILE).write_bytes(encrypted)

Fetching holdings

Once accounts are connected, fetching holdings is straightforward. The pipeline normalizes across brokerages into a consistent schema:

holdings_response = client.investments_holdings_get({
    "access_token": token,
    "options": {"account_ids": account_ids}
})

for holding in holdings_response["holdings"]:
    security = securities_by_id[holding["security_id"]]
    normalized.append({
        "brokerage": nickname,
        "ticker": security.get("ticker_symbol"),
        "name": security.get("name"),
        "quantity": holding["quantity"],
        "cost_basis": holding.get("cost_basis"),
        "value": holding["institution_value"],
        "account_type": account_type,
    })

One thing worth knowing: Plaid's investment data is previous close, not real-time. That's a Plaid limitation at the Development tier — fine for a 3x/week analysis pipeline, but worth being aware of.
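Since the data lags by at least one close, it can be useful to flag stale snapshots in the dashboard — say, after a few failed scheduled runs. This isn't in the repo; it's a hypothetical sketch:

```python
from datetime import date, timedelta

def is_stale(as_of: date, max_age_days: int = 3) -> bool:
    # Previous-close data is expected; anything older than a few days
    # (e.g. the pipeline silently failing over a long weekend) is flagged
    return (date.today() - as_of) > timedelta(days=max_age_days)
```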

Enriching positions

For every unique ticker, I pull technicals and fundamentals via yfinance and scrape recent news headlines. The enrichment runs in parallel across all tickers:

import yfinance as yf
from pandas_ta import rsi, macd, bbands

def enrich_ticker(ticker: str) -> dict:
    stock = yf.Ticker(ticker)
    hist = stock.history(period="6mo")
    info = stock.info

    close = hist["Close"]
    rsi_val = rsi(close, length=14).iloc[-1]
    macd_df = macd(close)
    bb = bbands(close)

    return {
        "technicals": {
            "price": close.iloc[-1],
            "sma_50": close.rolling(50).mean().iloc[-1],
            "rsi_14": rsi_val,
            "macd": macd_df["MACD_12_26_9"].iloc[-1],
            "bb_upper": bb["BBU_5_2.0"].iloc[-1],
            "bb_lower": bb["BBL_5_2.0"].iloc[-1],
        },
        "fundamentals": {
            "sector": info.get("sector"),
            "pe_trailing": info.get("trailingPE"),
            "analyst_target": info.get("targetMeanPrice"),
        },
        "performance": {
            "return_1m": ((close.iloc[-1] / close.iloc[-21]) - 1) * 100,
            "pct_from_52w_high": ((close.iloc[-1] / close.max()) - 1) * 100,
        },
        "news": [
            {"title": n["title"], "publisher": n["publisher"]}
            for n in stock.news[:5]
        ],
    }
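The post doesn't show the fan-out itself, so here's a minimal sketch of how the parallel enrichment might be wired up with `concurrent.futures` (the function names are illustrative, not repo code). Since yfinance calls are I/O-bound, threads are enough, and one bad ticker shouldn't sink the whole run:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def enrich_all(tickers: list[str], enrich_fn, max_workers: int = 8) -> dict:
    # Submit every ticker, collect results as they finish, and
    # record per-ticker failures instead of raising
    results: dict = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(enrich_fn, t): t for t in tickers}
        for fut in as_completed(futures):
            ticker = futures[fut]
            try:
                results[ticker] = fut.result()
            except Exception as exc:
                results[ticker] = {"error": str(exc)}
    return results
```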

Getting Claude to return structured output reliably

This took the most iteration. The key was being extremely explicit in the system prompt — not just asking for JSON, but defining the exact schema including field names, types, and valid enum values.

The system prompt opens with:

You are a senior portfolio analyst. You will receive a full enriched portfolio
and return a JSON object that EXACTLY matches this schema. Do not include
markdown fences, explanation, or any text outside the JSON object.

Schema:
{
  "overall_health": "strong" | "moderate" | "weak",
  "summary": string,
  "action_items": string[],
  "recommendations": [
    {
      "ticker": string,
      "action": "BUY" | "SELL" | "HOLD",
      "confidence": "high" | "medium" | "low",
      "urgency": "immediate" | "soon" | "no_rush",
      "thesis": string,
      "bull_case": string,
      "bear_case": string,
      "key_signals": string[],
      "risk_factors": string[]
    }
  ]
}
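Enum drift (a lowercase "buy", an invented urgency value) is the most common failure mode with schemas like this, so a cheap post-parse check catches it before anything hits the database. This is my own addition, not code from the repo:

```python
ALLOWED = {
    "action": {"BUY", "SELL", "HOLD"},
    "confidence": {"high", "medium", "low"},
    "urgency": {"immediate", "soon", "no_rush"},
}

def validate_recommendations(recs: list[dict]) -> list[str]:
    # Returns human-readable problems; an empty list means valid
    problems = []
    for rec in recs:
        for field, allowed in ALLOWED.items():
            if rec.get(field) not in allowed:
                problems.append(
                    f"{rec.get('ticker', '?')}: bad {field}={rec.get(field)!r}"
                )
    return problems
```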

Even with that, Claude occasionally wraps the response in markdown fences. I added a cleaning step:

cleaned = raw_response.strip()
if cleaned.startswith("```"):
    cleaned = cleaned.split("\n", 1)[1]
if cleaned.endswith("```"):
    cleaned = cleaned[:-3].strip()
analysis = json.loads(cleaned)

Token numbers from 15 real runs: ~47K input tokens + ~15K output tokens per run. At claude-sonnet-4-6 pricing ($3/M input, $15/M output) that's ~$0.36/run, or ~$5/month at 3x/week.
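The per-run math is simple enough to fold into the pipeline's logging. A sketch, with the rates hard-coded from the pricing quoted above:

```python
INPUT_RATE = 3.00 / 1_000_000    # $ per input token
OUTPUT_RATE = 15.00 / 1_000_000  # $ per output token

def run_cost_usd(input_tokens: int, output_tokens: int) -> float:
    # Linear cost model: tokens in each direction times its rate
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

cost = run_cost_usd(47_000, 15_000)  # 0.141 + 0.225 = 0.366, the ~$0.36/run above
```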

Syncing to Supabase

After each run, the pipeline pushes everything to Supabase: holdings, enrichment data, analysis report, and recommendations. The service role key bypasses RLS, which makes bulk writes clean:

# sync_to_supabase.py (simplified)
def sync_snapshot(snapshot: dict, analysis: dict) -> None:
    run = supabase.table("pipeline_runs").insert({
        "trigger": "scheduled",
        "status": "success",
        "model": analysis["model"],
        "input_tokens": analysis["usage"]["input_tokens"],
        "output_tokens": analysis["usage"]["output_tokens"],
    }).execute()

    run_id = run.data[0]["id"]

    supabase.table("portfolio_snapshots").insert({
        "run_id": run_id,
        "snapshot_date": today,
        "total_value": snapshot["summary"]["total_value"],
        ...
    }).execute()

Scheduling

On macOS, I use launchd rather than cron — it survives reboots and wakes the machine if it's asleep. A template plist is included in the repo:

<!-- scheduler/macos/com.finanalyst.pipeline.plist.template -->
<key>StartCalendarInterval</key>
<array>
  <dict>
    <key>Weekday</key><integer>1</integer>
    <key>Hour</key><integer>7</integer>
    <key>Minute</key><integer>0</integer>
  </dict>
  <!-- Monday; plus matching dicts with Weekday 3 and 5 for Wednesday and Friday -->
</array>

The installer prints the exact sed and launchctl commands to activate it.


Part 2: The Next.js Dashboard

Database schema

Supabase sits between the pipeline and the dashboard. The schema has six tables: pipeline_runs, portfolio_snapshots, holdings, enrichment, analysis_reports, and recommendations. RLS is enabled on all of them — authenticated users get read access, the service role key (pipeline-only) gets write access.

-- Row Level Security
alter table holdings enable row level security;

create policy "Auth users can read holdings"
  on holdings for select to authenticated using (true);

-- service_role bypasses RLS by default — no explicit policy needed

Dashboard features

The dashboard is built with Next.js 14 App Router, Tailwind, Radix UI, and Recharts. Key views:

  • Net worth card — pulls latest snapshot total, broken down by account category (taxable, retirement, cash)
  • Allocation chart — Recharts pie chart from asset_types_json in the snapshot
  • Holdings table — per-position gain/loss, colored by performance
  • AI recommendations — latest analysis report with per-ticker cards showing action, confidence, bull/bear case
  • RSU widget — grant tracking with live price refresh

RSU price refresh

RSU grants are stored in Supabase. When you hit the Recalc button, an API route fetches the current price from Yahoo Finance — no API key needed — and updates the grant value in real time:

// app/api/rsu-grants/recalc/route.ts
async function fetchPrice(ticker: string): Promise<number | null> {
  const url = `https://query1.finance.yahoo.com/v8/finance/chart/${ticker}?interval=1d&range=1d`;
  const res = await fetch(url, {
    headers: { "User-Agent": "Mozilla/5.0" },
    next: { revalidate: 0 },
  });
  const json = await res.json();
  return json?.chart?.result?.[0]?.meta?.regularMarketPrice ?? null;
}

Vested units are computed server-side from the vesting schedule stored as JSONB:

function computeVestedUnits(schedule: RsuVestingEvent[]): number {
  const today = new Date();
  today.setHours(23, 59, 59, 999);
  return schedule
    .filter(e => new Date(e.date) <= today)
    .reduce((sum, e) => sum + e.units, 0);
}

The Installer

Getting the install experience right took more work than I expected. The goal: someone clones the repo and runs one command that handles everything — Python deps, Node deps, and all API key setup — with zero silent failures.

The wizard validates every key live before writing anything to disk, using Python stdlib only (urllib) so it works before any deps are installed:

import json, sys
from urllib.request import Request, urlopen

# Anthropic — hits /v1/models, zero tokens consumed
req = Request(
    "https://api.anthropic.com/v1/models",
    headers={"x-api-key": key, "anthropic-version": "2023-06-01"},
)
with urlopen(req, timeout=8) as r:
    sys.exit(0 if r.status == 200 else 1)

# Plaid — hits /institutions/get with the actual credentials
payload = json.dumps({
    "client_id": client_id, "secret": secret,
    "count": 1, "offset": 0, "country_codes": ["US"],
}).encode()
req = Request(f"{base_url}/institutions/get", data=payload,
              headers={"Content-Type": "application/json"})

A few things I hit during testing:

  • macOS ships bash 3.2, which doesn't support associative arrays (declare -A). Had to replace the status tracking with plain variables and helper functions.
  • Python 3.12 installed via python.org doesn't land in PATH automatically. The installer now searches a list of known locations (/usr/local/bin/python3.12, /Library/Frameworks/...) before giving up.
  • hatchling requires a README.md in the package directory. Moving to a monorepo broke this — fixed by removing the readme field from pyproject.toml.

What I'd Do Differently

  • Add a token estimate before each Claude call. Token count grows with portfolio size. A pre-call estimate would let users catch runaway costs early.
  • Cache yfinance calls within a run. If the same ticker appears across multiple accounts, it gets enriched multiple times. Deduplication before enrichment would save time and API calls.
  • Plaid Development throttles refresh. Holdings data is previous close, not real-time. Worth documenting more prominently for anyone expecting live prices.
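The deduplication mentioned in the second bullet would be a small change — group holdings by ticker before enrichment, enrich each symbol once, then fan the result back out to every account holding it. A sketch, not repo code:

```python
from collections import defaultdict

def group_by_ticker(holdings: list[dict]) -> dict[str, list[dict]]:
    # Positions in different accounts share one enrichment per ticker;
    # entries without a ticker (e.g. cash) are skipped
    grouped: dict[str, list[dict]] = defaultdict(list)
    for h in holdings:
        if h.get("ticker"):
            grouped[h["ticker"]].append(h)
    return dict(grouped)
```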

Not financial advice — treat Claude's output as a starting point for your own research, not a signal to act on.
