Week 3 done.
This week I learned shallow algorithms: Linear Regression, Logistic Regression, DBSCAN, and PCA.
Not LLMs. Not ChatGPT integrations. Not the AI applications everyone's building.
Basic machine learning algorithms from decades ago.
And I kept asking myself: why am I doing this?
The Question I Keep Getting
"You're learning AI, right? When are you building something with GPT or Claude?"
Fair question. I could skip straight to LLM applications. Plenty of people do.
But here's what I realized this week: I want to understand what's actually happening, not just call APIs.
Why Shallow Algorithms First
1. They're what's actually running in production
Most companies aren't running massive neural networks. They're running:
- Logistic regression for fraud detection
- Linear regression for demand forecasting
- Clustering for customer segmentation
- PCA for feature reduction
The "boring" algorithms power real systems.
2. They teach you how ML actually works
When I call an LLM API:
import openai

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
I'm using ML. I'm not understanding ML.
When I implement Linear Regression:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
I see: training data → learning patterns → making predictions.
It's the same process neural networks use, just simpler.
3. I can actually debug them
If my Linear Regression model performs poorly, I can:
- Check the features
- Look at the coefficients
- Understand what's being weighted
If my LLM call gives weird results? I have no idea what's happening inside.
Starting simple means I can build intuition before jumping to black boxes.
What I Actually Learned
Linear Regression - Predicting house prices from features (square feet, bedrooms, age)
The model learns: price = w1×sqft + w2×bedrooms + w3×age + bias
It finds the best weights (w1, w2, w3) by minimizing prediction error.
This clicked because I've spent years optimizing systems. Same concept - iteratively adjust parameters to minimize error.
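To make that concrete, here's a rough sketch with synthetic house data. The weights and prices are made up so we can check whether the model recovers them:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic houses: [square_feet, bedrooms, age] -> price,
# generated from known weights so we can see what the model learns
rng = np.random.default_rng(0)
X = rng.uniform([500, 1, 0], [4000, 6, 50], size=(200, 3))
true_w = np.array([150.0, 10000.0, -500.0])  # made-up "true" w1, w2, w3
y = X @ true_w + 20000 + rng.normal(0, 1000, size=200)  # bias 20000 + noise

model = LinearRegression()
model.fit(X, y)
print(model.coef_)       # learned w1, w2, w3 - close to [150, 10000, -500]
print(model.intercept_)  # learned bias - close to 20000
```

The learned coefficients land near the values the data was generated from, which is exactly the "find the best weights by minimizing error" idea in code.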
Logistic Regression - Classifying patients as healthy vs. diseased
Despite the name, it's for classification, not regression. This confused me for days.
It outputs probabilities (0 to 1). Above 0.5 → disease, below → healthy.
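A tiny sketch of that threshold behavior, using toy numbers I made up (one lab measurement per patient):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy 1-D data: low measurements are healthy (0), high ones are diseased (1)
X = np.array([[1.0], [1.5], [2.0], [2.5], [6.0], [6.5], [7.0], [7.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns [P(healthy), P(diseased)] for each patient
probs = clf.predict_proba([[2.0], [7.0]])
print(probs)

# predict applies the 0.5 threshold for you
print(clf.predict([[2.0], [7.0]]))  # -> [0 1]
```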
DBSCAN - Grouping similar pixels in images
Clusters dense regions automatically. No need to specify number of clusters upfront.
Reminded me of finding hot spots in distributed systems - same density-based grouping concept.
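Here's roughly what that looks like on synthetic 2-D points instead of my actual image data, assuming two dense blobs and a couple of stray outliers:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Two dense blobs plus a few scattered outliers
X, _ = make_blobs(n_samples=100, centers=[[0, 0], [10, 10]],
                  cluster_std=0.5, random_state=0)
X = np.vstack([X, [[5, 5], [-8, 9]]])  # outliers far from either blob

# eps: how close counts as "similar"; min_samples: points needed for a dense region
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)
print(set(labels))  # two clusters (0, 1) plus -1 for the noise points
```

Note that I never told it "find 2 clusters" - the density did that, and the outliers came back labeled -1.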
PCA - Reducing 100 features down to 10
Keeps the most important information, throws away the noise.
Like compressing data in a pipeline - lose some detail but keep what matters.
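A quick sketch of the 100-features-down-to-10 idea, on fabricated data that secretly depends on only a few underlying factors:

```python
import numpy as np
from sklearn.decomposition import PCA

# 100 noisy features that really depend on only 5 underlying factors
rng = np.random.default_rng(42)
latent = rng.normal(size=(500, 5))                            # 5 true factors
mixing = rng.normal(size=(5, 100))
X = latent @ mixing + rng.normal(scale=0.1, size=(500, 100))  # 100 observed features

pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                       # (500, 10)
print(pca.explained_variance_ratio_.sum())   # ~0.99: 10 components keep almost everything
```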
The Part That Frustrated Me
Hyperparameter tuning.
Every algorithm has knobs to turn:
- DBSCAN: How close is "similar"? How many points make a cluster?
- PCA: How many components to keep?
The examples work fine. My own experiments? Trial and error.
I tried clustering an image and got either:
- Everything in one giant cluster (threshold too loose)
- Everything labeled as noise (threshold too strict)
Still figuring out the intuition here.
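Those two failure modes are easy to reproduce on toy blob data (not my actual image), just by sweeping eps:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=3, cluster_std=0.5, random_state=1)

results = {}
for eps in (0.05, 0.7, 50.0):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    results[eps] = labels
    n_clusters = len(set(labels) - {-1})
    n_noise = int((labels == -1).sum())
    print(f"eps={eps}: {n_clusters} clusters, {n_noise} noise points")
    # eps=0.05 -> everything is noise; eps=50 -> one giant cluster
```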
Mistakes I Made
1. Tested on training data
model.fit(X, y) # Train on all data
score = model.score(X, y) # Test on same data
# Score: 98%! Amazing!
Except the model had already seen the answers. Not a real test.
Should have split train/test:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
# Score: 73%. More honest.
Coming from software engineering where we have staging environments, I should have known better.
2. Mixed up regression and classification
I kept using Linear Regression when I should've used Logistic Regression.
Finally internalized:
- Predicting a number (price, temperature, age) → Regression
- Predicting a category (yes/no, cat/dog, disease) → Classification
Took more failed experiments than I'd like to admit.
3. Forgot to scale features
# Features with wildly different scales
X = [[2000, 3], ...] # square_feet=2000, bedrooms=3
Square footage dominates because the numbers are bigger. Had to normalize everything to the same scale first.
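The fix is a few lines with scikit-learn's StandardScaler (toy numbers again):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# square_feet dwarfs bedrooms before scaling
X = np.array([[2000.0, 3], [1500.0, 2], [3200.0, 4], [900.0, 1]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
print(X_scaled.mean(axis=0))  # ~[0, 0]
print(X_scaled.std(axis=0))   # ~[1, 1]: both features now on the same scale
```

One thing I had to learn the hard way: fit the scaler on training data only, then reuse it to transform the test data, or you leak information again.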
The Pattern That Helped
Every scikit-learn algorithm follows the same structure:
model = SomeAlgorithm()
model.fit(X_train, y_train) # Learn from data
predictions = model.predict(X_test) # Make predictions
score = model.score(X_test, y_test) # Evaluate
Once I saw this pattern, experimenting with new algorithms got easier.
Want to try a different classifier? Swap the algorithm. Same interface.
Reminded me of how Kafka, Flink, and other stream tools have different internals but similar APIs.
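Here's a sketch of that swap on the classic iris dataset: three different classifiers, one loop, identical interface.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.2, random_state=0
)

# Different algorithms, same fit/predict/score interface
scores = {}
for model in (LogisticRegression(max_iter=1000),
              KNeighborsClassifier(),
              DecisionTreeClassifier()):
    scores[type(model).__name__] = model.fit(X_train, y_train).score(X_test, y_test)
print(scores)
```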
Connection to Distributed Systems
Gradient descent (how these models learn) works like load balancer tuning:
Load balancing:
- Try a configuration
- Measure performance
- Adjust based on results
- Repeat
Machine learning:
- Make predictions
- Measure errors
- Adjust weights based on errors
- Repeat
Same iterative optimization. Different domain.
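That loop is short enough to write by hand. Here's a minimal sketch fitting y = w·x + b with gradient descent on made-up data (numbers chosen just so the result is checkable):

```python
import numpy as np

# Data generated from w=3, b=5 plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 5.0 + rng.normal(0, 0.1, size=100)

w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    pred = w * x + b                  # make predictions
    err = pred - y                    # measure errors
    w -= lr * (2 * err * x).mean()    # adjust weight based on errors
    b -= lr * (2 * err).mean()        # adjust bias based on errors
print(w, b)  # converges close to the true 3.0 and 5.0
```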
This mental model helped when ML concepts felt foreign.
Time Spent This Week
About 8-10 hours.
What I'm Taking Forward
Shallow algorithms aren't a stepping stone to "real" ML. They ARE real ML.
Most production systems use these techniques. Neural networks get the hype. Logistic regression gets deployed.
Understanding fundamentals before jumping to LLMs makes sense.
I could be building GPT wrappers right now. But I wouldn't understand:
- How training works
- Why models fail
- When to use what approach
- How to debug problems
Starting simple means I can build intuition.
You can be productive without understanding everything.
I can use these algorithms effectively even if I don't fully grasp every mathematical detail.
Understanding deepens with practice.
What's Still Unclear
- Picking the right algorithm for a new problem (I Google this constantly)
- Tuning hyperparameters systematically (still trial and error)
- Knowing when a model is "good enough"
I'm three weeks in, not three years. Still learning.
Why This Matters
In a few weeks, I'll start building LLM applications. RAG systems, agents, whatever.
But I'll understand:
- What "training" means
- How models learn patterns
- Why evaluation matters
- When simpler approaches work better
I won't just be calling APIs. I'll understand what those APIs are doing under the hood.
That's worth spending time on "boring" algorithms.
Week 3 down. Built and broke ML models this week.
Learning ML fundamentals before diving into LLMs? Or went straight to GPT APIs? Curious what path others are taking.