Run AI Stock Analysis Locally — FinSignal with Ollama, LM Studio, and Claude
FinSignal is a Chrome extension (and standalone web app) that runs 8 specialist financial agents — technical, fundamental, sentiment, risk, earnings, and more — through a single LLM call and returns a BUY / SELL / HOLD signal with cited sources and a confidence score.
It supports three LLM backends out of the box:
- Claude API — cloud, web-search grounded, highest quality
- Ollama — fully local, runs on your machine, no data leaves your network
- LM Studio — fully local, great GUI for model management
This post walks through setting up each one.
Install the Extension
- Grab it from the Chrome Web Store
- Pin it via the puzzle-piece menu so the ⬡ icon stays in your toolbar
The extension is free to install — the source repo is private, but everything you need is bundled in the published extension.
Option A — Claude API (Cloud, Recommended for Best Results)
Claude has live web_search access, so every analysis point gets grounded in real headlines and filings from the last few hours. This is the highest-fidelity path.
Setup:
- Get an API key at console.anthropic.com/settings/api-keys. Keys start with sk-ant-api03-
- Click the ⬡ icon → paste your key → click Connect
That's it. Your key is stored in chrome.storage.session and cleared automatically when Chrome closes — it never leaves your browser except in direct calls to api.anthropic.com.
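As a rough illustration of the key handling described above, the sketch below checks the sk-ant-api03- prefix before writing to a session-scoped store. The helper names and the anthropicApiKey field are assumptions made for illustration, not FinSignal's actual code. The storage argument is expected to behave like chrome.storage.session, whose set() returns a promise in Manifest V3.

```javascript
// Hypothetical helpers: validate an Anthropic key, then keep it for the session only.
const KEY_PREFIX = "sk-ant-api03-";

// True when the value is a string that starts with the prefix and has more after it.
function looksLikeAnthropicKey(key) {
  return (
    typeof key === "string" &&
    key.startsWith(KEY_PREFIX) &&
    key.length > KEY_PREFIX.length
  );
}

// `storage` should look like chrome.storage.session (promise-returning set()).
async function saveApiKey(storage, key) {
  const trimmed = String(key).trim();
  if (!looksLikeAnthropicKey(trimmed)) {
    throw new Error(`API key should start with ${KEY_PREFIX}`);
  }
  // The field name below is an assumption, not the extension's real schema.
  await storage.set({ anthropicApiKey: trimmed });
  return true;
}
```

Inside an extension this would be called as saveApiKey(chrome.storage.session, pastedKey). Passing the store in as an argument keeps the helper easy to exercise outside the browser.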
Settings → Provider should show Claude (Sonnet) selected. Add a ticker like NVDA, hit Run all, and you'll get a full multi-agent report in ~10 seconds.
Note: Claude is the only provider with live web search. For Ollama and LM Studio, the extension swaps in a different prompt that drops web-search references — more on what local models actually bring to the table below.
Option B — Ollama (Local, Privacy-First)
Ollama lets you run open-weight models entirely on your machine. No API key, no usage costs, no data leaving your network.
1. Install Ollama
# macOS
brew install ollama
# Or download from https://ollama.com
Start the server:
ollama serve
# Runs at http://localhost:11434
2. Pull a model
Gemma 3 worked really well for this use case — it follows the JSON schema reliably and produces coherent multi-section financial analysis:
ollama pull gemma3:4b # ~3GB, runs on most laptops
ollama pull gemma3:12b # better quality, needs ~8GB VRAM
Other good options:
ollama pull llama3.2:3b # fast, lighter
ollama pull mistral:7b # solid instruction following
ollama pull qwen2.5:7b # strong at structured output
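To confirm the server is up and the pulled models are visible, you can query Ollama's /api/tags endpoint, which lists locally available models. The snippet below is a minimal sketch. The function names are made up for illustration, and fetchImpl is injectable so the parsing can be checked without a running server.

```javascript
// Pull the model names out of the JSON body returned by GET /api/tags.
function modelNamesFromTags(body) {
  const models = Array.isArray(body?.models) ? body.models : [];
  return models.map((m) => m.name);
}

// Ask a local Ollama server which models it has.
// Uses the global fetch (Node 18+ or any browser) unless another one is passed in.
async function listOllamaModels(baseUrl = "http://localhost:11434", fetchImpl = fetch) {
  const res = await fetchImpl(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  return modelNamesFromTags(await res.json());
}
```

If gemma3:4b is missing from the returned list, the pull has not finished or the name is misspelled.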
3. Configure in FinSignal
- Open the extension → Settings tab
- Provider → select Ollama
- Ollama URL → http://localhost:11434 (default, leave as-is)
- Model → type the model name exactly as pulled, e.g. gemma3:4b
- Click Save
Now run analysis — it'll hit your local Ollama server instead of any cloud API.
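For a sense of what a request to the local server might look like, here is a sketch built on Ollama's /api/chat endpoint with streaming disabled and JSON output requested. The function names and the way the prompts are passed in are assumptions. The extension's real callOllama() is not public and may differ.

```javascript
// Build the URL and fetch options for a single non-streaming chat request.
function buildOllamaRequest({ baseUrl = "http://localhost:11434", model, systemPrompt, userMessage }) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/chat`, // tolerate a trailing slash in the setting
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model, // must match the pulled name, e.g. "gemma3:4b"
        stream: false, // one complete response instead of chunks
        format: "json", // ask Ollama to constrain output to valid JSON
        messages: [
          { role: "system", content: systemPrompt },
          { role: "user", content: userMessage },
        ],
      }),
    },
  };
}

// A non-streaming /api/chat reply carries the text in message.content.
function extractOllamaText(responseBody) {
  return responseBody?.message?.content ?? "";
}
```

Sending it is then a matter of calling fetch(url, options) with the two values returned by buildOllamaRequest and passing the parsed response to extractOllamaText.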
Troubleshooting Ollama
CORS error in the extension popup?
The extension popup is on a chrome-extension:// origin. You need to tell Ollama to allow it:
OLLAMA_ORIGINS="chrome-extension://*" ollama serve
Or, if you run the Ollama macOS app, set it through launchd and then restart Ollama:
# macOS launchd
launchctl setenv OLLAMA_ORIGINS "chrome-extension://*"
Model returns garbled or non-JSON output?
Smaller models sometimes fail to adhere to a strict JSON schema on the first try. Hit Retry — the orchestrator strips markdown fences and re-parses. If it fails repeatedly, try a larger variant (gemma3:12b over gemma3:4b).
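The fence-stripping step is simple to picture. The sketch below is one way it could work, not the orchestrator's actual code: it removes a leading and trailing markdown fence, tries JSON.parse, and falls back to the outermost {...} span if the model wrapped the JSON in extra prose. Both function names are hypothetical.

```javascript
// Remove a leading fence (with optional language tag) and a trailing fence.
function stripMarkdownFences(text) {
  return text
    .trim()
    .replace(/^`{3}[a-zA-Z]*\s*/, "")
    .replace(/\s*`{3}$/, "")
    .trim();
}

// Parse model output as JSON, tolerating fences and surrounding chatter.
function parseModelJson(raw) {
  const cleaned = stripMarkdownFences(raw);
  try {
    return JSON.parse(cleaned);
  } catch {
    // Fallback: take everything between the first "{" and the last "}".
    const start = cleaned.indexOf("{");
    const end = cleaned.lastIndexOf("}");
    if (start === -1 || end <= start) {
      throw new Error("No JSON object found in model output");
    }
    return JSON.parse(cleaned.slice(start, end + 1));
  }
}
```

If parseModelJson still throws, the output is not recoverable and a retry or a larger model is the practical next step.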
Option C — LM Studio (Local, Great for Model Discovery)
LM Studio gives you a GUI for browsing, downloading, and running GGUF models. If you prefer not to use the CLI, this is the smoothest local experience.
1. Install LM Studio
Download from lmstudio.ai — available for macOS, Windows, and Linux.
2. Load a model
In LM Studio:
- Go to the Discover tab → search gemma-3 or mistral
- Download a Q4 or Q5 quantization (good balance of size vs quality)
- Go to Local Server tab → select your model → click Start Server
LM Studio runs an OpenAI-compatible server at http://localhost:1234 by default.
3. Configure in FinSignal
- Open the extension → Settings tab
- Provider → select LM Studio
- LM Studio URL → http://localhost:1234
- Model → paste the model identifier shown in LM Studio's server tab (e.g. lmstudio-community/gemma-3-4b-it-GGUF)
- Click Save
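Because LM Studio speaks the OpenAI chat-completions format, the request path and response shape differ from Ollama's. The sketch below assumes the standard /v1/chat/completions route. The function names are illustrative and do not come from the extension's source.

```javascript
// Build the URL and fetch options for an OpenAI-compatible chat completion.
function buildLmStudioRequest({ baseUrl = "http://localhost:1234", model, systemPrompt, userMessage }) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model, // identifier as shown in LM Studio's server tab
        stream: false,
        messages: [
          { role: "system", content: systemPrompt },
          { role: "user", content: userMessage },
        ],
      }),
    },
  };
}

// OpenAI-style replies put the text under choices[0].message.content.
function extractLmStudioText(responseBody) {
  return responseBody?.choices?.[0]?.message?.content ?? "";
}
```

As with any OpenAI-compatible server, no API key header is included here because the server is local. Add one only if your setup requires it.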
Provider Comparison
| | Claude API | Ollama | LM Studio |
|---|---|---|---|
| Web search grounding | ✅ Live headlines & filings | ❌ Training data only | ❌ Training data only |
| Fundamental depth | Strong | Strong | Strong |
| Recency (last earnings, news) | ✅ Current | ⚠️ Cutoff-limited | ⚠️ Cutoff-limited |
| Privacy | Data sent to Anthropic | 100% local | 100% local |
| Cost | Pay per token | Free | Free |
| Setup | Paste API key | CLI + model pull | GUI download |
| Best model for this | claude-sonnet-4 | gemma3:4b / 12b | Gemma 3 Q4/Q5 |
| JSON schema adherence | Excellent | Excellent (gemma3) | Excellent (gemma3) |
How the Analysis Works
All 8 agents run in a single LLM call — not 8 separate requests. The orchestrator builds a combined system prompt assigning each agent role, sends one message, and parses the structured JSON response.
User → orchestrator.js
↓
buildSystemPrompt() ← 8 agent roles combined
buildUserMessage() ← ticker + JSON schema
↓
callClaude() | callOllama() | callLMStudio()
↓
Parse JSON → normalize signal
↓
Zustand store → React UI
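The flow in the diagram can be sketched roughly as follows. The function names follow the diagram, but the bodies, prompt wording, and provider ids are assumptions. The list of roles is passed in rather than hard-coded, since the full set of eight is not spelled out here.

```javascript
// Combine every agent role into one system prompt (wording is illustrative).
function buildSystemPrompt(roles) {
  const list = roles.map((role, i) => `${i + 1}. ${role} agent`).join("\n");
  return `You act as ${roles.length} specialist agents:\n${list}\nReturn one JSON object covering every agent.`;
}

// The user message carries the ticker; a real one would also embed the JSON schema.
function buildUserMessage(ticker) {
  return `Analyze ${ticker.toUpperCase()} and reply with JSON only.`;
}

// Map the model's answer onto BUY / SELL / HOLD.
// Falling back to HOLD for anything unrecognized is an assumption.
function normalizeSignal(value) {
  const signal = String(value ?? "").trim().toUpperCase();
  return ["BUY", "SELL", "HOLD"].includes(signal) ? signal : "HOLD";
}

// `callers` maps a provider id to an async (systemPrompt, userMessage) => string.
async function runAnalysis({ ticker, provider, roles, callers }) {
  const call = callers[provider];
  if (!call) throw new Error(`Unknown provider: ${provider}`);
  // One request for all agents, not one per agent.
  const raw = await call(buildSystemPrompt(roles), buildUserMessage(ticker));
  const report = JSON.parse(raw);
  return { ...report, signal: normalizeSignal(report.signal) };
}
```

Keeping the provider call behind a lookup table means the prompt-building and parsing code stays the same whichever backend is selected.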
Every analysis point in the response must include a source field. The UI silently drops any point without one — a basic anti-hallucination guardrail. Confidence is capped at 99 and calibrated to drop when agents produce conflicting signals.
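A minimal sketch of those two guardrails, assuming each point is an object with a source string and confidence is a number. The function names and the rounding are illustrative choices, and the conflict-based calibration is left out because its formula is not described.

```javascript
// Keep only points that carry a non-empty source string.
function keepSourcedPoints(points) {
  return (points ?? []).filter(
    (p) => typeof p?.source === "string" && p.source.trim() !== ""
  );
}

// Clamp confidence into 0..max (99 by default); non-numeric input becomes 0.
function capConfidence(value, max = 99) {
  const n = Number(value);
  if (!Number.isFinite(n)) return 0;
  return Math.min(max, Math.max(0, Math.round(n)));
}
```

Filtering in the UI layer rather than in the prompt means a model that ignores the instruction still cannot surface an unsourced claim.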
When running locally (Ollama / LM Studio), the prompt drops web-search instructions and adds:
"You have NO live web access. Base analysis on your training knowledge. Prefix uncertain values with 'approximately' or 'estimated'."
This is an honesty instruction, not a capability ceiling. Models like Gemma 3 are trained on large amounts of public text, which likely includes SEC filings such as 10-Ks, earnings transcripts, analyst commentary, and financial news. For well-documented tickers, that amounts to years of synthesized coverage baked into the weights.
What the 8-agent framework does with a local model is structured knowledge extraction — forcing the model to surface what it already knows across technical, fundamental, sentiment, risk, and compliance lenses simultaneously. The result can be genuinely high-quality analysis, especially for fundamentals, sector context, business moat, and historical risk patterns.
The gap vs. Claude is specifically recency: last quarter's earnings beat, an analyst downgrade from last week, yesterday's macro event. For longer-horizon views where the fundamental picture matters more than last week's news, local models hold up well.
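The prompt swap for local providers could be as small as the sketch below. The local instruction is the text quoted earlier. The web-search wording and the provider ids ("claude", "ollama", "lmstudio") are placeholders assumed for illustration.

```javascript
// Instruction used when the model has no web access (quoted from the post).
const LOCAL_NOTE =
  "You have NO live web access. Base analysis on your training knowledge. " +
  "Prefix uncertain values with 'approximately' or 'estimated'.";

// Placeholder wording for the web-grounded path; the real prompt is not public.
const WEB_NOTE = "Use web search to ground each point in recent headlines and filings.";

// Pick the grounding instruction for a provider id (ids are hypothetical).
function groundingInstruction(provider) {
  return provider === "claude" ? WEB_NOTE : LOCAL_NOTE;
}
```

Treating every non-Claude provider as local keeps the default on the cautious side: a newly added backend gets the no-web-access instruction unless it is explicitly opted in.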
Not financial advice. FinSignal is for informational and educational purposes only.