Divya Bairavarasu

Ditch the Token Bill: Run AI Stock Analysis Free with Ollama + FinSignal

Run AI Stock Analysis Locally — FinSignal with Ollama, LM Studio, and Claude

FinSignal is a Chrome extension (and standalone web app) that runs 8 specialist financial agents — technical, fundamental, sentiment, risk, earnings, and more — through a single LLM call and returns a BUY / SELL / HOLD signal with cited sources and a confidence score.

It supports three LLM backends out of the box:

  • Claude API — cloud, web-search grounded, highest quality
  • Ollama — fully local, runs on your machine, no data leaves your network
  • LM Studio — fully local, great GUI for model management

This post walks through setting up each one.


Install the Extension

  1. Grab it from the Chrome Web Store
  2. Pin it via the puzzle-piece menu so the ⬡ icon stays in your toolbar

The extension is free to install — the source repo is private, but everything you need is bundled in the published extension.


Option A — Claude API (Cloud, Recommended for Best Results)

Claude has access to the live web_search tool, so analysis points can be grounded in current headlines and filings rather than in training data alone. This is the highest-fidelity path.

Setup:

  1. Get an API key at console.anthropic.com/settings/api-keys. Keys start with sk-ant-api03-
  2. Click the ⬡ icon → paste your key → click Connect

That's it. Your key is stored in chrome.storage.session and cleared automatically when Chrome closes — it never leaves your browser except in direct calls to api.anthropic.com.

Settings → Provider should show Claude (Sonnet) selected. Add a ticker like NVDA, hit Run all, and you'll get a full multi-agent report in ~10 seconds.

Note: Claude is the only provider with live web search. For Ollama and LM Studio, the extension swaps in a different prompt that drops web-search references — more on what local models actually bring to the table below.
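To make the "direct calls to api.anthropic.com" part concrete, here is a minimal sketch of what a web-search-enabled request to Anthropic's Messages API can look like. It only builds the request and does not send it. The function name, parameters, max_tokens, and max_uses values are assumptions for illustration, not FinSignal's actual code, and whether the extension needs the browser-access header is also an assumption.

```javascript
// Sketch only: buildClaudeRequest is a hypothetical helper, not FinSignal's code.
// It builds (but does not send) a request for Anthropic's Messages API.
function buildClaudeRequest(apiKey, model, systemPrompt, userMessage) {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        // Anthropic's opt-in header for calls made directly from a browser context.
        // Assumption: the extension may or may not need it.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({
        model, // pass whichever Sonnet model ID your account uses
        max_tokens: 4096, // assumed value
        system: systemPrompt,
        messages: [{ role: "user", content: userMessage }],
        // Server-side web search tool; max_uses is an assumed limit.
        tools: [{ type: "web_search_20250305", name: "web_search", max_uses: 5 }],
      }),
    },
  };
}

// Usage (inside an async function):
//   const { url, init } = buildClaudeRequest(key, modelId, systemPrompt, "Analyze NVDA");
//   const response = await fetch(url, init);
```

The local providers take the same system and user messages but omit the tools entry, which is why their prompt has to change as well.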


Option B — Ollama (Local, Privacy-First)

Ollama lets you run open-weight models entirely on your machine. No API key, no usage costs, no data leaving your network.

1. Install Ollama

# macOS
brew install ollama

# Or download from https://ollama.com

Start the server:

ollama serve
# Runs at http://localhost:11434

2. Pull a model

Gemma 3 worked really well for this use case — it follows the JSON schema reliably and produces coherent multi-section financial analysis:

ollama pull gemma3:4b       # ~3GB, runs on most laptops
ollama pull gemma3:12b      # better quality, needs ~8GB VRAM

Other good options:

ollama pull llama3.2:3b     # fast, lighter
ollama pull mistral:7b      # solid instruction following
ollama pull qwen2.5:7b      # strong at structured output

3. Configure in FinSignal

  1. Open the extension → Settings tab
  2. Provider → select Ollama
  3. Ollama URL → http://localhost:11434 (default, leave as-is)
  4. Model → type the model name exactly as pulled, e.g. gemma3:4b
  5. Click Save

Now run analysis — it'll hit your local Ollama server instead of any cloud API.
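For readers curious what "hitting your local Ollama server" involves, here is a minimal sketch of a request to Ollama's /api/chat endpoint. The helper name and its parameters are hypothetical; the extension's real callOllama() is not shown in this post. Setting format to "json" asks Ollama to constrain the reply to valid JSON, and stream: false returns one complete response instead of chunks.

```javascript
// Sketch only: buildOllamaRequest is a hypothetical helper, not FinSignal's code.
function buildOllamaRequest(baseUrl, model, systemPrompt, userMessage) {
  return {
    // Strip trailing slashes so "http://localhost:11434/" also works.
    url: `${baseUrl.replace(/\/+$/, "")}/api/chat`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        model, // must match the pulled name exactly, e.g. "gemma3:4b"
        stream: false, // one complete reply rather than streamed chunks
        format: "json", // ask Ollama for JSON-only output
        messages: [
          { role: "system", content: systemPrompt },
          { role: "user", content: userMessage },
        ],
      }),
    },
  };
}

// Usage (inside an async function):
//   const { url, init } = buildOllamaRequest("http://localhost:11434", "gemma3:4b", sys, msg);
//   const data = await (await fetch(url, init)).json();
//   const text = data.message.content; // the model's reply as a string
```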

Troubleshooting Ollama

CORS error in the extension popup?

The extension popup is on a chrome-extension:// origin. You need to tell Ollama to allow it:

OLLAMA_ORIGINS="chrome-extension://*" ollama serve

Or set it permanently:

# macOS launchd (restart the Ollama app afterwards so it picks up the variable)
launchctl setenv OLLAMA_ORIGINS "chrome-extension://*"

Model returns garbled or non-JSON output?

Smaller models sometimes fail to adhere to a strict JSON schema on the first try. Hit Retry — the orchestrator strips markdown fences and re-parses. If it fails repeatedly, try a larger variant (gemma3:12b over gemma3:4b).
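The fence-stripping step can be sketched roughly as follows. This is a guess at the general technique, not the orchestrator's actual code: parseModelJson is a hypothetical name, and the fallback of slicing from the first "{" to the last "}" is an added assumption for models that wrap JSON in chatty text.

```javascript
// Sketch only: parseModelJson is a hypothetical helper, not FinSignal's code.
// Built with repeat() so this example does not contain a literal fence itself.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

function parseModelJson(raw) {
  let text = String(raw).trim();

  // 1. If the model wrapped its answer in a markdown fence, keep only the inside.
  const fenced = text.match(FENCED_BLOCK);
  if (fenced) text = fenced[1].trim();

  // 2. Assumed fallback: ignore any prose before the first "{" or after the last "}".
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }

  // 3. JSON.parse throws on malformed JSON, which the caller can treat as "retry".
  return JSON.parse(text.slice(start, end + 1));
}
```

A caller would wrap this in try/catch and surface the Retry option when it throws.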


Option C — LM Studio (Local, Great for Model Discovery)

LM Studio gives you a GUI for browsing, downloading, and running GGUF models. If you prefer not to use the CLI, this is the smoothest local experience.

1. Install LM Studio

Download from lmstudio.ai — available for macOS, Windows, and Linux.

2. Load a model

In LM Studio:

  1. Go to the Discover tab → search gemma-3 or mistral
  2. Download a Q4 or Q5 quantization (good balance of size vs quality)
  3. Go to Local Server tab → select your model → click Start Server

LM Studio runs an OpenAI-compatible server at http://localhost:1234 by default.

3. Configure in FinSignal

  1. Open the extension → Settings tab
  2. Provider → select LM Studio
  3. LM Studio URL → http://localhost:1234
  4. Model → paste the model identifier shown in LM Studio's server tab (e.g. lmstudio-community/gemma-3-4b-it-GGUF)
  5. Click Save
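Because LM Studio exposes an OpenAI-compatible server, a request to it follows the familiar chat-completions shape. The sketch below is illustrative only: both helper names and the temperature value are assumptions, and the extension's real callLMStudio() is not shown in this post.

```javascript
// Sketch only: hypothetical helpers, not FinSignal's code.
function buildLMStudioRequest(baseUrl, model, systemPrompt, userMessage) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({
        model, // identifier shown in LM Studio's server tab
        temperature: 0.2, // assumed value; lower tends to help structured output
        stream: false,
        messages: [
          { role: "system", content: systemPrompt },
          { role: "user", content: userMessage },
        ],
      }),
    },
  };
}

// Pull the reply text out of an OpenAI-style chat completion response.
function extractChatText(responseJson) {
  const choice = responseJson.choices && responseJson.choices[0];
  if (!choice || !choice.message || typeof choice.message.content !== "string") {
    throw new Error("Unexpected response shape");
  }
  return choice.message.content;
}
```

The main difference from Ollama's native endpoint is where the reply lives: choices[0].message.content here, message.content there.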

Provider Comparison

| | Claude API | Ollama | LM Studio |
|---|---|---|---|
| Web search grounding | ✅ Live headlines & filings | ❌ Training data only | ❌ Training data only |
| Fundamental depth | Strong | Strong | Strong |
| Recency (last earnings, news) | ✅ Current | ⚠️ Cutoff-limited | ⚠️ Cutoff-limited |
| Privacy | Data sent to Anthropic | 100% local | 100% local |
| Cost | Pay per token | Free | Free |
| Setup | Paste API key | CLI + model pull | GUI download |
| Best model for this | claude-sonnet-4 | gemma3:4b / 12b | Gemma 3 Q4/Q5 |
| JSON schema adherence | Excellent | Excellent (gemma3) | Excellent (gemma3) |

How the Analysis Works

All 8 agents run in a single LLM call — not 8 separate requests. The orchestrator builds a combined system prompt assigning each agent role, sends one message, and parses the structured JSON response.

User → orchestrator.js
         ↓
   buildSystemPrompt()  ← 8 agent roles combined
   buildUserMessage()   ← ticker + JSON schema
         ↓
   callClaude() | callOllama() | callLMStudio()
         ↓
   Parse JSON → normalize signal
         ↓
   Zustand store → React UI
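As a rough illustration of the "one call, many roles" idea, the two prompt builders in the diagram could look something like this. The function names come from the diagram, but the bodies, the prompt wording, and the response schema are all assumptions. Only six of the eight agent roles are named in this post, so the list below is deliberately incomplete.

```javascript
// Sketch only: the bodies and wording are assumptions, not FinSignal's prompts.
// The post names six of the eight agents; the remaining two are not listed here.
const NAMED_AGENTS = ["technical", "fundamental", "sentiment", "risk", "earnings", "compliance"];

function buildSystemPrompt(agents) {
  // One numbered role per agent, all delivered in a single system prompt.
  const roles = agents.map((name, i) => `${i + 1}. ${name} analyst`).join("\n");
  return [
    `You are a panel of ${agents.length} financial analysts. Answer as all of them in one response:`,
    roles,
    "Every analysis point must cite a source.",
  ].join("\n");
}

function buildUserMessage(ticker) {
  // Assumed response shape; the real schema is not shown in the post.
  const schema = {
    signal: "BUY | SELL | HOLD",
    confidence: "0-99",
    points: [{ agent: "string", text: "string", source: "string" }],
  };
  const symbol = ticker.trim().toUpperCase();
  return `Analyze ${symbol}. Respond with JSON only, matching:\n${JSON.stringify(schema, null, 2)}`;
}
```

Sending one message instead of eight keeps latency and token overhead down, at the cost of asking the model to juggle every role in a single response.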

Every analysis point in the response must include a source field. The UI silently drops any point without one — a basic anti-hallucination guardrail. Confidence is capped at 99 and calibrated to drop when agents produce conflicting signals.
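A minimal sketch of that guardrail, assuming a parsed report with signal, confidence, and points fields (the real schema is not shown in this post). sanitizeReport is a hypothetical name, and falling back to HOLD for an unrecognized signal is an assumption. The conflict-based confidence calibration is not reproduced here because the post does not describe how it works.

```javascript
// Sketch only: sanitizeReport and the report shape are assumptions.
const VALID_SIGNALS = ["BUY", "SELL", "HOLD"];

function sanitizeReport(report) {
  // Normalize the signal, e.g. " buy " becomes "BUY".
  const signal = String(report.signal || "").trim().toUpperCase();

  // Drop any analysis point that lacks a non-empty source string.
  const points = (report.points || []).filter(
    (p) => typeof p.source === "string" && p.source.trim() !== ""
  );

  // Clamp confidence into the 0..99 range; non-numeric values become 0.
  const confidence = Math.max(0, Math.min(99, Number(report.confidence) || 0));

  return {
    signal: VALID_SIGNALS.includes(signal) ? signal : "HOLD", // assumed fallback
    confidence,
    points,
  };
}
```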

When running locally (Ollama / LM Studio), the prompt drops web-search instructions and adds:

"You have NO live web access. Base analysis on your training knowledge. Prefix uncertain values with 'approximately' or 'estimated'."

This is an honesty instruction, not a capability ceiling. Open-weight models like Gemma 3 are trained on very large text corpora, which in all likelihood include a great deal of public financial material: SEC filings such as 10-Ks, earnings transcripts, analyst commentary, and financial news. For well-documented tickers, that can amount to years of synthesized coverage baked into the weights.
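The prompt swap for local providers can be pictured as a small branch on the provider. In this sketch the local notice is the text quoted above, while the provider identifiers, the function name, and the web-search wording are hypothetical.

```javascript
// Sketch only: provider ids, function name, and WEB_NOTICE wording are assumptions.
const LOCAL_NOTICE =
  "You have NO live web access. Base analysis on your training knowledge. " +
  "Prefix uncertain values with 'approximately' or 'estimated'.";
const WEB_NOTICE = "Use web search to ground each point in recent headlines and filings.";

function promptForProvider(basePrompt, provider) {
  // Local backends cannot search the web, so they get the honesty instruction instead.
  const isLocal = provider === "ollama" || provider === "lmstudio";
  return `${basePrompt}\n\n${isLocal ? LOCAL_NOTICE : WEB_NOTICE}`;
}
```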

What the 8-agent framework does with a local model is structured knowledge extraction — forcing the model to surface what it already knows across technical, fundamental, sentiment, risk, and compliance lenses simultaneously. The result can be genuinely high-quality analysis, especially for fundamentals, sector context, business moat, and historical risk patterns.

The gap vs. Claude is specifically recency: last quarter's earnings beat, an analyst downgrade from last week, yesterday's macro event. For longer-horizon views where the fundamental picture matters more than last week's news, local models hold up well.


Not financial advice. FinSignal is for informational and educational purposes only.