Week 3 done.
This week I learned shallow algorithms: Linear Regression, Logistic Regression, DBSCAN, and PCA.
Not LLMs. Not ChatGPT integrations. Not the AI applications everyone's building.
Basic machine learning algorithms from decades ago.
And I kept asking myself: why am I doing this?
The Question I Keep Getting
"You're learning AI, right? When are you building something with GPT or Claude?"
Fair question. I could skip straight to LLM applications. Plenty of people do.
But here's what I realized this week: I want to understand what's actually happening, not just call APIs.
Why Shallow Algorithms First
1. They're what's actually running in production
Most companies aren't running massive neural networks. They're running:
- Logistic regression for fraud detection
- Linear regression for demand forecasting
- Clustering for customer segmentation
- PCA for feature reduction
The "boring" algorithms power real systems.
2. They teach you how ML actually works
When I call an LLM API:
import openai

response = openai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
I'm using ML. I'm not understanding ML.
When I implement Linear Regression:
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
I see: training data → learning patterns → making predictions.
It's the same process neural networks use, just simpler.
3. I can actually debug them
If my Linear Regression model performs poorly, I can:
- Check the features
- Look at the coefficients
- Understand what's being weighted
If my LLM call gives weird results? I have no idea what's happening inside.
Starting simple means I can build intuition before jumping to black boxes.
What I Actually Learned
Linear Regression - Predicting house prices from features (square feet, bedrooms, age)
The model learns: price = w1×sqft + w2×bedrooms + w3×age + bias
It finds the best weights (w1, w2, w3) by minimizing prediction error.
This clicked because I've spent years optimizing systems. Same concept - iteratively adjust parameters to minimize error.
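To make that concrete, here's a rough sketch with synthetic house data. The weights and prices are made up so we can check whether the model recovers them:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic houses: [square_feet, bedrooms, age] -> price,
# generated from known weights so we can see what the model learns
rng = np.random.default_rng(0)
X = rng.uniform([500, 1, 0], [4000, 6, 50], size=(200, 3))
true_w = np.array([150.0, 10000.0, -500.0])  # made-up "true" w1, w2, w3
y = X @ true_w + 20000 + rng.normal(0, 1000, size=200)  # bias 20000 + noise

model = LinearRegression()
model.fit(X, y)
print(model.coef_)       # learned w1, w2, w3 - close to [150, 10000, -500]
print(model.intercept_)  # learned bias - close to 20000
```

The learned coefficients land near the values the data was generated from, which is exactly the "find the best weights by minimizing error" idea in code.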
Logistic Regression - Classifying patients as healthy vs. diseased
Despite the name, it's for classification, not regression. This confused me for days.
It outputs probabilities (0 to 1). Above 0.5 → disease, below → healthy.
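A tiny sketch of that threshold behavior, using toy numbers I made up (one lab measurement per patient):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy 1-D data: low measurements are healthy (0), high ones are diseased (1)
X = np.array([[1.0], [1.5], [2.0], [2.5], [6.0], [6.5], [7.0], [7.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# predict_proba returns [P(healthy), P(diseased)] for each patient
probs = clf.predict_proba([[2.0], [7.0]])
print(probs)

# predict applies the 0.5 threshold for you
print(clf.predict([[2.0], [7.0]]))  # -> [0 1]
```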
DBSCAN - Grouping similar pixels in images
Clusters dense regions automatically. No need to specify number of clusters upfront.
Reminded me of finding hot spots in distributed systems - same density-based grouping concept.
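Here's roughly what that looks like on synthetic 2-D points instead of my actual image data, assuming two dense blobs and a couple of stray outliers:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Two dense blobs plus a few scattered outliers
X, _ = make_blobs(n_samples=100, centers=[[0, 0], [10, 10]],
                  cluster_std=0.5, random_state=0)
X = np.vstack([X, [[5, 5], [-8, 9]]])  # outliers far from either blob

# eps: how close counts as "similar"; min_samples: points needed for a dense region
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)
print(set(labels))  # two clusters (0, 1) plus -1 for the noise points
```

Note that I never told it "find 2 clusters" - the density did that, and the outliers came back labeled -1.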
PCA - Reducing 100 features down to 10
Keeps the most important information, throws away the noise.
Like compressing data in a pipeline - lose some detail but keep what matters.
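A quick sketch of the 100-features-down-to-10 idea, on fabricated data that secretly depends on only a few underlying factors:

```python
import numpy as np
from sklearn.decomposition import PCA

# 100 noisy features that really depend on only 5 underlying factors
rng = np.random.default_rng(42)
latent = rng.normal(size=(500, 5))                            # 5 true factors
mixing = rng.normal(size=(5, 100))
X = latent @ mixing + rng.normal(scale=0.1, size=(500, 100))  # 100 observed features

pca = PCA(n_components=10)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                       # (500, 10)
print(pca.explained_variance_ratio_.sum())   # ~0.99: 10 components keep almost everything
```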
The Part That Frustrated Me
Hyperparameter tuning.
Every algorithm has knobs to turn:
- DBSCAN: How close is "similar"? How many points make a cluster?
- PCA: How many components to keep?
The examples work fine. My own experiments? Trial and error.
I tried clustering an image and got either:
- Everything in one giant cluster (threshold too loose)
- Everything labeled as noise (threshold too strict)
Still figuring out the intuition here.
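Those two failure modes are easy to reproduce on toy blob data (not my actual image), just by sweeping eps:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=3, cluster_std=0.5, random_state=1)

results = {}
for eps in (0.05, 0.7, 50.0):
    labels = DBSCAN(eps=eps, min_samples=5).fit_predict(X)
    results[eps] = labels
    n_clusters = len(set(labels) - {-1})
    n_noise = int((labels == -1).sum())
    print(f"eps={eps}: {n_clusters} clusters, {n_noise} noise points")
    # eps=0.05 -> everything is noise; eps=50 -> one giant cluster
```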
Mistakes I Made
1. Tested on training data
model.fit(X, y) # Train on all data
score = model.score(X, y) # Test on same data
# Score: 98%! Amazing!
Except the model had already seen the answers. Not a real test.
Should have split train/test:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
# Score: 73%. More honest.
Coming from software engineering where we have staging environments, I should have known better.
2. Mixed up regression and classification
I kept using Linear Regression when I should've used Logistic Regression.
Finally internalized:
- Predicting a number (price, temperature, age) → Regression
- Predicting a category (yes/no, cat/dog, disease) → Classification
Took more failed experiments than I'd like to admit.
3. Forgot to scale features
# Features with wildly different scales
X = [[2000, 3], ...] # square_feet=2000, bedrooms=3
Square footage dominates because the numbers are bigger. Had to normalize everything to the same scale first.
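The fix is a few lines with scikit-learn's StandardScaler (toy numbers again):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# square_feet dwarfs bedrooms before scaling
X = np.array([[2000.0, 3], [1500.0, 2], [3200.0, 4], [900.0, 1]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
print(X_scaled.mean(axis=0))  # ~[0, 0]
print(X_scaled.std(axis=0))   # ~[1, 1]: both features now on the same scale
```

One thing I had to learn the hard way: fit the scaler on training data only, then reuse it to transform the test data, or you leak information again.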
The Pattern That Helped
Every scikit-learn algorithm follows the same structure:
model = SomeAlgorithm()
model.fit(X_train, y_train) # Learn from data
predictions = model.predict(X_test) # Make predictions
score = model.score(X_test, y_test) # Evaluate
Once I saw this pattern, experimenting with new algorithms got easier.
Want to try a different classifier? Swap the algorithm. Same interface.
Reminded me of how Kafka, Flink, and other stream tools have different internals but similar APIs.
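Here's a sketch of that swap on the classic iris dataset: three different classifiers, one loop, identical interface.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.2, random_state=0
)

# Different algorithms, same fit/predict/score interface
scores = {}
for model in (LogisticRegression(max_iter=1000),
              KNeighborsClassifier(),
              DecisionTreeClassifier()):
    scores[type(model).__name__] = model.fit(X_train, y_train).score(X_test, y_test)
print(scores)
```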
Connection to Distributed Systems
Gradient descent (how these models learn) works like load balancer tuning:
Load balancing:
- Try a configuration
- Measure performance
- Adjust based on results
- Repeat
Machine learning:
- Make predictions
- Measure errors
- Adjust weights based on errors
- Repeat
Same iterative optimization. Different domain.
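That loop is short enough to write by hand. Here's a minimal sketch fitting y = w·x + b with gradient descent on made-up data (numbers chosen just so the result is checkable):

```python
import numpy as np

# Data generated from w=3, b=5 plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 5.0 + rng.normal(0, 0.1, size=100)

w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    pred = w * x + b                  # make predictions
    err = pred - y                    # measure errors
    w -= lr * (2 * err * x).mean()    # adjust weight based on errors
    b -= lr * (2 * err).mean()        # adjust bias based on errors
print(w, b)  # converges close to the true 3.0 and 5.0
```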
This mental model helped when ML concepts felt foreign.
Time Spent This Week
About 8-10 hours.
What I'm Taking Forward
Shallow algorithms aren't a stepping stone to "real" ML. They ARE real ML.
Most production systems use these techniques. Neural networks get the hype. Logistic regression gets deployed.
Understanding fundamentals before jumping to LLMs makes sense.
I could be building GPT wrappers right now. But I wouldn't understand:
- How training works
- Why models fail
- When to use what approach
- How to debug problems
Starting simple means I can build intuition.
You can be productive without understanding everything.
I can use these algorithms effectively even if I don't fully grasp every mathematical detail.
Understanding deepens with practice.
What's Still Unclear
- Picking the right algorithm for a new problem (I Google this constantly)
- Tuning hyperparameters systematically (still trial and error)
- Knowing when a model is "good enough"
I'm three weeks in, not three years. Still learning.
Why This Matters
In a few weeks, I'll start building LLM applications. RAG systems, agents, whatever.
But I'll understand:
- What "training" means
- How models learn patterns
- Why evaluation matters
- When simpler approaches work better
I won't just be calling APIs. I'll understand what those APIs are doing under the hood.
That's worth spending time on "boring" algorithms.
Week 3 down. Built and broke ML models this week.
Learning ML fundamentals before diving into LLMs? Or went straight to GPT APIs? Curious what path others are taking.