Introduction
A few weeks ago I asked myself a simple question: can AI help regular people make better investment decisions?
Not hedge funds. Not Wall Street. Regular people in Latin America who want to invest in Ecopetrol, Bitcoin, or Apple but don't have $24,000/year for a Bloomberg Terminal.
So I built LatixIA, a free AI-powered investment signal platform. Here's exactly how I did it.
What it does
LatixIA analyzes historical market data and real-time news to generate investment signals, each with a probability attached. For example:
68% bullish | Confidence: Medium | Risk: Low
Covers 50+ assets across 9 markets (USA, China, Colombia, Mexico, Brazil, Argentina, Chile, Crypto, Commodities)
Completely free, no registration required
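To make the signal line above concrete, here is a minimal sketch of how a model probability could be turned into that kind of label. The function name `describe_signal` and the confidence cut-offs (0.60 and 0.75) are assumptions made up for illustration, not the values LatixIA uses, and the risk rating is left out because the post does not say how it is computed.

```python
def describe_signal(p_up: float) -> str:
    """Turn a probability of an up-move into a readable signal line."""
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be between 0 and 1")
    direction = "bullish" if p_up >= 0.5 else "bearish"
    # Probability assigned to the predicted direction.
    strength = max(p_up, 1.0 - p_up)
    # Hypothetical cut-offs, chosen only for this example.
    if strength >= 0.75:
        confidence = "High"
    elif strength >= 0.60:
        confidence = "Medium"
    else:
        confidence = "Low"
    return f"{strength:.0%} {direction} | Confidence: {confidence}"


print(describe_signal(0.68))
```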
Live app: https://signal-engine-cgwfivvpvqotmbrqbvchhz.streamlit.app
GitHub: https://github.com/JFCB24/signal-engine
The Tech Stack
Data: yfinance, 3 years of OHLCV historical data
Features: 34 technical indicators (RSI, MACD, Williams %R, OBV, Stochastic RSI...)
Model: LightGBM with walk-forward validation
NLP: RSS feeds from Yahoo Finance + Google News (real-time)
Frontend: Streamlit
Deploy: Streamlit Cloud (free tier)
Code: Python 3.14, open source, MIT license
The Data Pipeline
Everything starts with downloading 3 years of historical price data:
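```python
import yfinance as yf

ticker = "AAPL"  # example symbol; swap in any Yahoo Finance ticker

# Download 3 years of daily OHLCV bars, adjusted for splits and dividends
datos = yf.download(
    ticker,
    period="3y",
    interval="1d",
    auto_adjust=True,
)
```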
Then I calculate 34 features across 5 categories:
- Basic indicators: RSI, MACD, Bollinger Bands, ATR, volume ratio
- Trend: SMA50 vs SMA200, EMA crossover, trend strength
- Advanced volume: OBV, VWAP, volume spikes
- Temporality: day of week, month, end-of-month effects
- Momentum & direction: Williams %R, Stochastic RSI, ROC, distance to support/resistance
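As a rough illustration of what a few of these features look like in code, here is a dependency-free sketch of RSI, Williams %R, and the SMA50 vs SMA200 trend flag. It is not the project's implementation: the function names are hypothetical, the RSI uses simple averages (Wilder's original uses a smoothed average), and in practice these would probably be computed column-wise with pandas.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough values for this window")
    return sum(values[-window:]) / window


def rsi(closes, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no down moves in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def williams_r(highs, lows, closes, period=14):
    """Williams %R: 0 at the top of the recent range, -100 at the bottom."""
    highest = max(highs[-period:])
    lowest = min(lows[-period:])
    if highest == lowest:
        return 0.0  # flat window; returning 0.0 is a convention chosen here
    return -100.0 * (highest - closes[-1]) / (highest - lowest)


def uptrend(closes, fast=50, slow=200):
    """True when the fast SMA is above the slow SMA."""
    return sma(closes, fast) > sma(closes, slow)


rising = [float(x) for x in range(1, 251)]
print(rsi(rising), uptrend(rising))
```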
The Model
I use LightGBM with walk-forward validation, so the model is always trained on past data and evaluated on later data it has never seen:
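```python
import lightgbm as lgb

# Shallow trees plus regularization to limit overfitting
modelo = lgb.LGBMClassifier(
    n_estimators=300,
    learning_rate=0.02,
    max_depth=3,           # caps tree size, so num_leaves=15 is unlikely to be reached
    min_child_samples=40,
    num_leaves=15,
    subsample=0.8,         # note: row subsampling only applies when subsample_freq > 0
    colsample_bytree=0.7,
    reg_alpha=0.1,
    reg_lambda=0.1,
)
```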
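The walk-forward idea can be sketched as an expanding-window split: each fold trains on everything up to a cut-off and tests on the block right after it. This is a minimal illustration in plain Python, not the project's code; `walk_forward_splits` and its parameters are hypothetical names. scikit-learn's `TimeSeriesSplit` offers a similar ready-made splitter.

```python
def walk_forward_splits(n_samples, n_folds=5, min_train=None):
    """Yield (train_indices, test_indices) where every test row comes
    strictly after every training row."""
    if min_train is None:
        min_train = n_samples // (n_folds + 1)
    test_size = (n_samples - min_train) // n_folds
    if min_train < 1 or test_size < 1:
        raise ValueError("not enough samples for this many folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        test_end = train_end + test_size
        # Expanding training window, followed by the next unseen block.
        yield range(0, train_end), range(train_end, test_end)


for train_idx, test_idx in walk_forward_splits(100, n_folds=4):
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

In each fold, a model would be fit on the `train_idx` rows and scored on the `test_idx` rows; averaging the fold scores gives the out-of-sample estimate.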
Results:
Baseline (always predict majority class): 45.0%
Model accuracy on unseen data: 54.4%
Beats baseline by: 9.4 percentage points
Not amazing. But honest. And it improves every day.
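For readers who want to reproduce this kind of comparison, here is a small sketch of a majority-class baseline and the lift over it, measured in percentage points. It assumes the majority class is picked from the training window and then scored on the test window, which may differ from how the numbers above were produced. The labels are toy data invented for the example.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return accuracy(test_labels, [majority] * len(test_labels))


# Toy labels (1 = price went up, 0 = price went down), invented for illustration.
train = [1, 1, 1, 0, 1, 0]
test = [0, 1, 0, 0, 1]
preds = [0, 1, 1, 0, 1]

baseline = majority_baseline(train, test)
model_acc = accuracy(test, preds)
lift_pp = (model_acc - baseline) * 100  # difference in percentage points
print(f"baseline {baseline:.1%}, model {model_acc:.1%}, lift {lift_pp:.1f} pp")
```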
The Biggest Mistake I Made
My first model had 96% training accuracy. I thought I was a genius.
Then I tested it on real data. It only got 47%, worse than a coin flip.
The problem was overfitting. The model memorized the past instead of learning real patterns.
The fix: stronger regularization, lower depth, more out-of-sample testing.
Result: 80% training, 54% test. Much more honest. Much more useful.
Real-time News Analysis
Instead of heavy NLP models like FinBERT (too much memory for the free tier), I use RSS feeds:
```python
import requests

ticker = "AAPL"  # example symbol
url = f"https://feeds.finance.yahoo.com/rss/2.0/headline?s={ticker}"
response = requests.get(url, timeout=5)
# Parse XML and extract headlines
# Analyze sentiment with financial word dictionary
```
This pulls real headlines, updated daily, at zero cost and with no heavy dependencies.
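The parse-and-score steps could look roughly like the sketch below, which uses only the standard library. The word lists, function names, and sample feed are all made up for illustration; a real financial dictionary would be much larger, and the exact matching here ignores word forms such as "surges".

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mini dictionaries, for illustration only.
POSITIVE = {"surge", "beat", "gain", "record", "upgrade", "rally"}
NEGATIVE = {"fall", "miss", "loss", "lawsuit", "downgrade", "plunge"}

# Invented sample feed in the standard RSS 2.0 layout.
SAMPLE_RSS = """<rss version="2.0"><channel><title>Feed</title>
<item><title>Shares rally after record quarter</title></item>
<item><title>Regulators open lawsuit against company</title></item>
<item><title>Company names new finance chief</title></item>
</channel></rss>"""


def extract_headlines(rss_xml: str) -> list[str]:
    """Return the title of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [t.text.strip() for t in root.iterfind("./channel/item/title") if t.text]


def headline_score(headline: str) -> int:
    """Count of positive words minus count of negative words."""
    words = re.findall(r"[a-z]+", headline.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def average_sentiment(headlines: list[str]) -> float:
    """Mean headline score, or 0.0 when there are no headlines."""
    if not headlines:
        return 0.0
    return sum(headline_score(h) for h in headlines) / len(headlines)


headlines = extract_headlines(SAMPLE_RSS)
print(headlines, average_sentiment(headlines))
```

With a live feed, `response.text` from the request would take the place of `SAMPLE_RSS`.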
What I Learned
Data quality > model complexity: garbage in, garbage out
Walk-forward validation is non-negotiable for time series
Honesty builds trust: showing 54% instead of faking 90% matters
Lightweight beats accurate on free infrastructure
Ship fast: a working product beats a perfect plan
What's Next
Retrain model with sentiment scores as features
Add portfolio comparison (analyze 3 assets at once)
WhatsApp/email alerts when a strong signal is detected
Premium plan for advanced users
Try It
Live: https://signal-engine-cgwfivvpvqotmbrqbvchhz.streamlit.app
GitHub: https://github.com/JFCB24/signal-engine
Feedback welcome, especially from experienced ML engineers who can point out what I'm doing wrong.
I'm 17, from Colombia, studying Data Engineering & AI. This is my first real product.
