DEV Community

CryptoOne
The Trinity Moment: When a Local AI Model, altFINS CLI, and an M1 Max Started Working Like One Tool

There is a specific emotional jolt when a local model stops sounding clever and starts acting competent.

That was the moment here. Not because the model delivered one dramatic answer, but because it could actually work: inspect a CLI, choose commands, run filters, survive a failed path, and keep the chain together.

The trigger line in my terminal was simple:

local model online → tool access confirmed → "I can use altFINS CLI."

Three things converged on one laptop:

  • Codex as the agent layer
  • A local qwen3.6:35b-a3b-coding-nvfp4 model served by Ollama
  • The af CLI from altFINS

The whole pipeline ran on a single M1 Max, returned a structured shortlist of 13 crypto assets, and cost less than a cent in electricity.

The Numbers Up Front

| Metric | Value |
| --- | --- |
| Memory bandwidth | 400 GB/s |
| Observed inference | ~60 tokens/sec |
| Qualified assets | 13 |
| Active execution | 5–7 minutes |
| Full session | ~14 minutes |
| Electricity cost | < €0.01 |
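The sub-cent electricity claim checks out with back-of-envelope arithmetic. The wattage and tariff below are assumptions for illustration, not measurements from the session:

```python
# Sanity check on the sub-cent electricity claim.
# Assumptions (mine, not the article's logs): ~60 W average package
# power under sustained inference, and a tariff of €0.30 per kWh.
watts = 60
minutes = 14           # full session, including approvals and retries
price_per_kwh = 0.30   # EUR, assumed tariff

kwh = watts * (minutes / 60) / 1000
cost = kwh * price_per_kwh
print(f"{kwh:.4f} kWh -> about €{cost:.4f}")
```

Even with generous assumptions, the whole session lands well under one cent.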

The Experiment

Simple to describe, surprisingly non-trivial to execute:

  1. Take the top 100 crypto assets by market cap.
  2. Keep only those still below overbought territory (RSI14 < 70).
  3. Of those, keep only ones trading above their 50-day moving average.
  4. Confirm with recent bullish signals since May 3, 2026.
  5. Pull fresh news for a sentiment layer.

A shortlist with structure, not just "the market is bullish."
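Steps 2 and 3 of that funnel are simple predicates over per-asset data. A minimal sketch, assuming each asset has been assembled into a dict from the CLI's JSON output (the field names `rsi14`, `price`, and `sma50` are illustrative, not the CLI's exact schema):

```python
# Illustrative records; field names are my own, not the af schema.
assets = [
    {"symbol": "BTC", "rsi14": 62.0, "price": 79963, "sma50": 73247},
    {"symbol": "HOT", "rsi14": 74.5, "price": 0.002, "sma50": 0.0018},
    {"symbol": "ETH", "rsi14": 49.8, "price": 2283,  "sma50": 2222},
]

# Step 2: keep only assets below overbought territory.
not_overbought = [a for a in assets if a["rsi14"] < 70]

# Step 3: of those, keep only assets trading above their 50-day MA.
above_sma50 = [a for a in not_overbought if a["price"] > a["sma50"]]

print([a["symbol"] for a in above_sma50])  # HOT drops out at step 2
```

The order matters for cost, not correctness: filtering on RSI first shrinks the symbol list before the per-symbol SMA50 lookups.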

What It Actually Found

The funnel collapsed cleanly:

  • 86 assets had RSI14 < 70
  • 13 of those 86 were trading above SMA50
  • All 13 had recent bullish signals since May 3, 2026

Top 5 by market cap:

| # | Symbol | RSI14 | Price | SMA50 | Above 50MA |
| --- | --- | --- | --- | --- | --- |
| 1 | BTC | 62.0 | $79,963 | $73,247 | +9.2% |
| 2 | ETH | 49.8 | $2,283 | $2,222 | +2.7% |
| 3 | BNB | 55.9 | $641 | $621 | +3.2% |
| 4 | SOL | 58.7 | $89 | $85 | +4.7% |
| 5 | XRP | 47.7 | $1.39 | $1.38 | +0.4% |

The remaining qualifying names: DOGE, ETC, XMR, PEPE, LINK, ADA, AVAX, UNI, LTC, DOT.
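The "Above 50MA" column is just the premium of price over the 50-day moving average, which you can recompute from the other two columns:

```python
# Premium of current price over the 50-day moving average, in percent.
def premium_pct(price: float, sma50: float) -> float:
    return (price - sma50) / sma50 * 100

# BTC's row from the table above:
print(f"BTC: +{premium_pct(79963, 73247):.1f}%")  # +9.2%
```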

Why qwen3.6 Beat Gemma 4 for This

Not a sweeping benchmark claim — a workflow claim.

On short, self-contained prompts, lots of models feel fast. On a multi-step analytical chain — intermediate symbol lists, command syntax, fallback logic, several rounds of tool output — context discipline matters more than raw speed.

qwen3.6:35b-a3b-coding-nvfp4 held together because the session was not one question. It was a sequence of sub-results that kept rewriting the next step.

One captured terminal moment makes it concrete: the agent first tried to fetch SMA50 across all 86 RSI-qualified assets, hit a messy output path, then switched routes — narrowed to top 30, then symbol-by-symbol. That kind of controlled adjustment is exactly what you want in a coding workflow.

Why the M1 Max Made It Feel Legit

The M1 Max is not magic. It is bandwidth.

Apple Silicon's unified memory design gives this class of local model room to breathe, and the headline number is up to 400 GB/s.

Observed speed: roughly 60 tokens per second. Fast enough that the loop feels interactive. You request a filter, inspect the result, push into the next step — without the whole exercise collapsing into latency theater.
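A rough roofline makes the 60 tokens/sec figure plausible. This is a sketch built on two interpretations of the model name, not specs: that "a3b" means roughly 3B active parameters per token, and that nvfp4 stores weights at roughly half a byte per parameter:

```python
# Bandwidth-bound decode estimate for a sparse/MoE-style model.
# Assumptions (interpretations of the model tag, not documented specs):
#   - "a3b"  -> ~3B parameters active per generated token
#   - "nvfp4" -> ~0.5 bytes per weight (4-bit quantization)
bandwidth_bytes_per_s = 400e9        # M1 Max headline: 400 GB/s
active_params = 3e9
bytes_per_param = 0.5

bytes_per_token = active_params * bytes_per_param   # ~1.5 GB per token
roofline_tps = bandwidth_bytes_per_s / bytes_per_token
print(f"theoretical ceiling ≈ {roofline_tps:.0f} tok/s; observed ≈ 60")
```

Real decode sits well below the ceiling because of attention, KV-cache traffic, and runtime overheads, so ~60 tok/s against a ~270 tok/s roofline is a believable ratio.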

Heat, Time, and the Price of Intelligence

The laptop ran hot. That is the honest version of the story.

Estimated usage from the log:

  • 2,000–3,000 input tokens
  • 4,000–6,000 output tokens
  • 5–7 minutes active execution
  • ~14 minutes for the full session including approvals and retries

Electricity: still under a cent. That is the part that makes cloud comparisons feel almost silly.
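For the shape of that comparison, here is the same token volume priced at hypothetical cloud API rates. The per-million-token prices below are placeholders I picked for illustration, not any provider's actual pricing:

```python
# Token volumes from the session log; prices are hypothetical
# placeholders, not a real provider's rate card.
input_tokens, output_tokens = 3000, 6000   # upper end of the log
in_price, out_price = 3.0, 15.0            # assumed $ per 1M tokens

cloud_cost = (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price
print(f"hypothetical cloud cost ≈ ${cloud_cost:.3f}")
```

Roughly ten cents versus under one cent of electricity: not a fortune either way, but an order of magnitude, and it compounds across sessions.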

How Ollama and the CLI Connect

Clean mental model: the model does not browse the market on its own. The model reasons locally. The agent layer exposes tools. The af CLI is what actually reaches live altFINS endpoints.

To make the local model reachable to the agent environment:

export OLLAMA_HOST=0.0.0.0
export OLLAMA_ORIGINS="*"
ollama serve
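Once the server is up, you can talk to it with nothing but the standard library. A minimal sketch against Ollama's `/api/generate` endpoint (the host and port are Ollama's defaults; adjust if you bound it elsewhere):

```python
# Minimal standard-library client for the Ollama server started above.
import json
import urllib.request

OLLAMA = "http://localhost:11434"

def build_request(prompt: str,
                  model: str = "qwen3.6:35b-a3b-coding-nvfp4"):
    """Build a non-streaming completion request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA}/api/generate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

# To actually run a prompt (requires the server to be up):
# with urllib.request.urlopen(build_request("Say hi"), timeout=120) as r:
#     print(json.load(r)["response"])
```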

That's how "I can use altFINS CLI" turns from a slogan into an operational fact.

The Command Chain

For anyone who wants to reproduce the workflow, here is the full sequence the agent ran end to end:

# 1. Get top 100 markets
af markets search --size 100 -o json

# 2. Check RSI14 per symbol
af analytics history --symbol <SYM> --type RSI14 --size 1 -o json

# 3. Check SMA50 per symbol
af analytics history --symbol <SYM> --type SMA50 --size 1 -o json

# 4. Confirm recent bullish signals
af signals list --symbols <SYM> --direction BULLISH \
   --from 2026-05-03 --size 5 -o json

# 5. Pull news for sentiment layer
af news list --from 2026-05-03 --size 500 -o json

Five primitives. One coherent workflow.
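If you would rather stitch those five commands together yourself instead of letting an agent drive, a thin wrapper around the CLI is enough. A sketch, assuming `af` is on your PATH and emits parseable JSON with `-o json` (verify the actual output schema against your own runs before relying on field names):

```python
# Thin orchestration layer over the af CLI. Sketch only: assumes
# `af` is on PATH; check its real JSON schema before trusting fields.
import json
import subprocess

def af_json(*args: str):
    """Run an af subcommand with -o json and parse the output."""
    result = subprocess.run(
        ["af", *args, "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def qualifies(rsi14: float, price: float, sma50: float) -> bool:
    """Steps 2-3 of the funnel: not overbought, and above the 50MA."""
    return rsi14 < 70 and price > sma50

# Example wiring (network calls, so not executed here):
# markets = af_json("markets", "search", "--size", "100")
# then fetch RSI14/SMA50 per symbol and keep those where qualifies(...)
```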

More Than a Finance Trick

I also tested the same setup on simple HTML, JavaScript, and CSS work — lightweight games and simulations. Useful as a confidence check.

But the real payoff was this financial workflow, because it mixed:

  • Scripting
  • Ranking
  • External data access
  • Retries and fallbacks
  • Plain-language explanation

…all in one session. That is the point where local AI stops feeling like a novelty and starts feeling like infrastructure for individual analysts and traders.

Session Snapshot

| Item | Value |
| --- | --- |
| Model | qwen3.6:35b-a3b-coding-nvfp4 |
| Runtime | Ollama on Apple Silicon M1 Max |
| Style | Tool-driven, long-context coding workflow |
| Data source | altFINS CLI (af) |
| Shortlist | BTC, ETH, BNB, SOL, XRP, DOGE, ETC, XMR, PEPE, LINK, ADA, AVAX, UNI, LTC, DOT |

Final Thought

The real story is not that AI "picked coins."

The real story is that a local model on a prosumer laptop successfully ran an analyst workflow: inspect, filter, retry, verify, and explain. Once that stack can really use af, the feeling changes immediately.

It is not abstract anymore. It is a warm M1 Max on a desk, a real CLI, a real shortlist, and a bill that barely exists.


Curious about the data layer? You can run the same screeners and signals manually on altFINS.com.
