How I Built a Free AI Investment Signal Platform at 17 (Python + LightGBM)

#showdev #python #ai #machinelearning

Introduction
A few weeks ago I asked myself a simple question: can AI help regular people make better investment decisions?
Not hedge funds. Not Wall Street. Regular people in Latin America who want to invest in Ecopetrol, Bitcoin or Apple but don't have $24,000/year for Bloomberg Terminal.
So I built LatixIA — a free AI-powered investment signal platform. Here's exactly how I did it.

What it does
LatixIA analyzes historical market data and real-time news to generate investment signals, each with a probability, a confidence level, and a risk rating:

68% bullish — Confidence: Medium — Risk: Low
Covers 50+ assets across 9 markets (USA, China, Colombia, Mexico, Brazil, Argentina, Chile, Crypto, Commodities)
Completely free, no registration required

🔗 Live app: https://signal-engine-cgwfivvpvqotmbrqbvchhz.streamlit.app
💻 GitHub: https://github.com/JFCB24/signal-engine

The Tech Stack
Data: yfinance — 3 years of OHLCV historical data
Features: 34 technical indicators (RSI, MACD, Williams %R, OBV, Stochastic RSI...)
Model: LightGBM with walk-forward validation
NLP: RSS feeds from Yahoo Finance + Google News (real-time)
Frontend: Streamlit
Deploy: Streamlit Cloud (free tier)
Code: Python 3.14, open source, MIT license

The Data Pipeline
Everything starts with downloading 3 years of historical price data:
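    import yfinance as yf

    # ticker is a symbol string such as "AAPL"
    datos = yf.download(
        ticker,
        period="3y",       # 3 years of history
        interval="1d",     # daily bars
        auto_adjust=True   # prices adjusted for splits and dividends
    )
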
Then I calculate 34 features across 5 categories:

Basic indicators — RSI, MACD, Bollinger Bands, ATR, volume ratio
Trend — SMA50 vs SMA200, EMA crossover, trend strength
Advanced volume — OBV, VWAP, volume spikes
Temporality — day of week, month, end of month effects
Momentum & direction — Williams %R, Stochastic RSI, ROC, distance to support/resistance
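
To make the categories above concrete, here is a minimal sketch of a few of these features computed with pandas. It is not the code from the repository: the function name `add_features`, the output column names, and the window lengths (14, 12/26/9, 20, 50/200) are assumptions chosen because they are the conventional defaults. It also assumes a frame with flat `Close` and `Volume` columns and a `DatetimeIndex`; depending on the yfinance version, the columns may need flattening first.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a handful of illustrative indicators to an OHLCV frame.

    Assumes flat columns named "Close" and "Volume" and a DatetimeIndex.
    """
    out = df.copy()
    close = out["Close"]

    # RSI(14), simple-moving-average variant (Wilder's smoothing is also common).
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(14).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(14).mean()
    out["rsi_14"] = 100 - 100 / (1 + avg_gain / avg_loss)

    # MACD line: 12-period EMA minus 26-period EMA, plus a 9-period signal line.
    ema_fast = close.ewm(span=12, adjust=False).mean()
    ema_slow = close.ewm(span=26, adjust=False).mean()
    out["macd"] = ema_fast - ema_slow
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()

    # Volume ratio: today's volume relative to its 20-day average.
    out["volume_ratio"] = out["Volume"] / out["Volume"].rolling(20).mean()

    # Trend: 1 when the 50-day SMA is above the 200-day SMA, else 0.
    out["sma50_above_sma200"] = (
        close.rolling(50).mean() > close.rolling(200).mean()
    ).astype(int)

    # Calendar features (is_month_end is the calendar month end,
    # which is not always the last trading day).
    out["day_of_week"] = out.index.dayofweek
    out["month"] = out.index.month
    out["is_month_end"] = out.index.is_month_end.astype(int)
    return out


# Synthetic prices so the sketch runs without a network call.
rng = np.random.default_rng(0)
idx = pd.bdate_range("2022-01-03", periods=300)
prices = pd.DataFrame(
    {
        "Close": 100 + rng.normal(0, 1, len(idx)).cumsum(),
        "Volume": rng.integers(1_000, 5_000, len(idx)),
    },
    index=idx,
)
features = add_features(prices)
```

Every rolling window looks only backward, so a row's features never depend on later prices. The first rows are NaN until each window fills up and would need to be dropped before training.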

The Model
I use LightGBM with walk-forward validation — never mixing future data with past data:
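    import lightgbm as lgb

    modelo = lgb.LGBMClassifier(
        n_estimators=300,
        learning_rate=0.02,      # small steps, many trees
        max_depth=3,             # shallow trees to limit overfitting
        min_child_samples=40,    # each leaf needs at least 40 rows
        num_leaves=15,           # max_depth=3 already caps a tree at 8 leaves
        subsample=0.8,           # only applied when subsample_freq > 0
        colsample_bytree=0.7,    # each tree sees 70% of the features
        reg_alpha=0.1,           # L1 regularization
        reg_lambda=0.1,          # L2 regularization
    )
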
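
For readers new to walk-forward validation, the idea can be sketched as an index generator with an expanding training window. This is an illustration, not the exact scheme LatixIA uses: the function name `walk_forward_splits`, the fold count, and the `min_train` size are all assumptions.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, n_folds: int = 5, min_train: int = 250
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Rows are assumed to be sorted by date, oldest first.
    """
    if n_samples <= min_train:
        raise ValueError("not enough rows for the requested min_train")
    test_size = (n_samples - min_train) // n_folds
    if test_size == 0:
        raise ValueError("too many folds for the available rows")
    for fold in range(n_folds):
        train_end = min_train + fold * test_size
        test_end = train_end + test_size
        # Train on everything before train_end, test on the block right after it.
        yield list(range(train_end)), list(range(train_end, test_end))


# Roughly 3 years of daily bars (hypothetical row count).
splits = list(walk_forward_splits(n_samples=750, n_folds=5, min_train=250))
for train_idx, test_idx in splits:
    # Every training row is strictly older than every test row.
    assert max(train_idx) < min(test_idx)
```

In each fold the model is fit on `train_idx` and scored on `test_idx`, and the fold scores are averaged. scikit-learn's `TimeSeriesSplit` offers a similar expanding-window split if you would rather not write your own.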
Results:

Baseline (always predict majority class): 45.0%
Model accuracy on unseen data: 54.4%
Beats baseline by: about 9.4 percentage points
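
A majority-class baseline like the one above can be computed in a few lines. This is a sketch under one assumption: the majority class is taken from the training window and then scored on the test window. The helper name `majority_baseline` and the toy labels are hypothetical.

```python
from collections import Counter


def majority_baseline(train_labels, test_labels):
    """Accuracy from always predicting the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    hits = sum(1 for y in test_labels if y == majority)
    return hits / len(test_labels)


# Toy labels: 1 = price went up, 0 = price went down.
train = [1, 1, 1, 0]
test = [0, 0, 1, 0, 1]
baseline = majority_baseline(train, test)
print(baseline)  # 0.4
```

When the majority class comes from the training window, the baseline can fall below 50% on a test window where the other class is more common. That is one way a figure under 50% can arise.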

Not amazing. But honest. And it improves every day.

The Biggest Mistake I Made
My first model had 96% training accuracy. I thought I was a genius.
Then I tested it on real data. It only got 47% — worse than a coin flip.
The problem was overfitting. The model memorized the past instead of learning real patterns.
The fix: stronger regularization, lower depth, more out-of-sample testing.
Result: 80% training, 54% test. Much more honest. Much more useful.

Real-time News Analysis
Instead of heavy NLP models like FinBERT (too much memory for free tier), I use RSS feeds:
    import requests

    url = f"https://feeds.finance.yahoo.com/rss/2.0/headline?s={ticker}"
    response = requests.get(url, timeout=5)
    response.raise_for_status()  # stop early on HTTP errors

    # Parse XML and extract headlines

    # Analyze sentiment with financial word dictionary


This gets real news updated daily — zero cost, zero heavy dependencies.
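
The two commented steps (parse the XML, then score headlines against a word dictionary) might look roughly like this. It is a minimal sketch, not the repository's code: the word lists, the function names, and the sample feed are made up for illustration, and it parses a hard-coded RSS string where the real pipeline would use `response.text`.

```python
import xml.etree.ElementTree as ET

# Tiny illustrative dictionaries; a real financial lexicon would be much larger.
POSITIVE = {"surge", "beats", "record", "growth", "upgrade", "rally"}
NEGATIVE = {"plunge", "misses", "lawsuit", "loss", "downgrade", "selloff"}


def extract_headlines(rss_xml: str) -> list[str]:
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [item.findtext("title", default="") for item in root.iter("item")]


def headline_score(headline: str) -> int:
    """Count of distinct positive words minus distinct negative words."""
    words = {w.strip(".,:;!?()'\"").lower() for w in headline.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)


# Hypothetical feed content standing in for response.text.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <item><title>Example Corp beats estimates, shares rally</title></item>
  <item><title>Example Corp faces lawsuit after quarterly loss</title></item>
  <item><title>Example Corp names new chief executive</title></item>
</channel></rss>"""

headlines = extract_headlines(SAMPLE_RSS)
scores = [headline_score(h) for h in headlines]
print(scores)  # [2, -2, 0]
```

Word counting like this misses negation ("no loss") and context, which is the trade-off for staying within free-tier memory.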

What I Learned

Data quality > model complexity — garbage in, garbage out
Walk-forward validation is non-negotiable for time series
Honesty builds trust — showing 54% instead of faking 90% matters
Lightweight beats accurate on free infrastructure
Ship fast — a working product beats a perfect plan

What's Next

Retrain the model with sentiment scores as features
Add portfolio comparison (analyze 3 assets at once)
WhatsApp/email alerts when a strong signal is detected
Premium plan for advanced users

Try It
🔗 Live: https://signal-engine-cgwfivvpvqotmbrqbvchhz.streamlit.app
💻 GitHub: https://github.com/JFCB24/signal-engine
Feedback welcome — especially from experienced ML engineers who can point out what I'm doing wrong.
I'm 17, from Colombia, studying Data Engineering & AI. This is my first real product.
