kaustubh yerkade

The "Aha!" Moment: Why Understanding the JVM Changed How I Write Java

I Used Java for Years, Then I Finally Understood the JVM

I’ve scaled Java services, increased heap sizes, and blamed the GC more times than I can count.

And for a long time, I still didn’t understand what the JVM was really doing.

If you’ve ever:

  • Given a service more RAM and watched latency get worse
  • Seen CPU spike while traffic was low
  • Hit OutOfMemoryError on a machine with free memory

This post is for you.

This isn’t a JVM reference guide. This is something I wish I had before debugging JVM issues in production.


The JVM Is Not “Java Runtime”

It’s a whole Runtime Operating System

Most people think of the JVM as:

“The thing that runs Java code.”

And I think that’s dangerously incomplete.

The JVM is:

  • A memory manager
  • A thread scheduler
  • A Just-In-Time (JIT) compiler
  • A runtime optimizer
  • A portable execution platform

In practice, the JVM behaves more like a mini operating system inside your OS. (That’s why it’s called the Java Virtual Machine.)

Once you see it that way, a lot of “mysterious” behavior suddenly makes sense.


From Code to CPU: What Actually Happens

Here’s the real journey of your Java code:

.java → bytecode (.class) → interpreted → JIT compiled → machine code

Key insight:

Your Java code does not run the same way for its entire lifetime.

At startup:

  • Code is interpreted
  • It’s slow but flexible

Over time:

  • Hot code paths are detected
  • JIT compiles them into optimized native code
  • Optimizations are speculative and reversible

That’s why:

  • First requests are slow
  • Benchmarks lie unless they account for JIT warm-up (see the sketch below)
  • Restarting a JVM “fixes” performance (temporarily)
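
You can watch this happen. Here is a minimal warm-up sketch (the class name and loop counts are mine, purely for illustration): run it with -XX:+PrintCompilation and you’ll see compilation messages stream past as the loop gets hot, and the second half of the run typically finishes faster than the first.

// Compile and run: javac WarmUp.java && java -XX:+PrintCompilation WarmUp
public class WarmUp {
    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) total += i;
        return total;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        for (int i = 0; i < 200_000; i++) sum(10_000);   // early calls run interpreted, then get compiled
        long warm = System.nanoTime();
        for (int i = 0; i < 200_000; i++) sum(10_000);   // these calls run as optimized native code
        long done = System.nanoTime();
        System.out.printf("first half: %d ms, second half: %d ms%n",
                (warm - start) / 1_000_000, (done - warm) / 1_000_000);
    }
}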

ClassLoaders: The Hidden Source of Insanity

ClassLoaders aren’t just about loading classes.
They define identity.

Two classes with the same name:

  • Loaded by different ClassLoaders
  • Are not the same class

This explains:

  • ClassCastException that makes no sense
  • Plugin architecture bugs
  • Dependency conflicts in fat JARs
  • Spring Boot classpath nightmares

So, basically:

ClassLoaders are namespaces, not folders.

Once a class is loaded, unloading it is complicated.
And sometimes impossible.
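
Here is a small, self-contained sketch of that identity rule (IsolatingLoader and Greeter are names I made up for illustration). Each loader instance defines its own copy of the class, so the two Class objects share a name but are not equal:

// Compile first (javac ClassIdentityDemo.java), then run: java ClassIdentityDemo
import java.io.InputStream;

// A loader that defines "Greeter" itself instead of delegating to its parent,
// so every IsolatingLoader instance produces a distinct Class object for it.
class IsolatingLoader extends ClassLoader {
    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals("Greeter")) {
            return super.loadClass(name, resolve); // delegate everything else
        }
        try (InputStream in = getResourceAsStream("Greeter.class")) {
            byte[] bytes = in.readAllBytes();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (Exception e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

class Greeter { }

public class ClassIdentityDemo {
    public static void main(String[] args) throws Exception {
        Class<?> a = new IsolatingLoader().loadClass("Greeter");
        Class<?> b = new IsolatingLoader().loadClass("Greeter");
        System.out.println(a.getName().equals(b.getName())); // true  - same name
        System.out.println(a == b);                          // false - different classes
        // Casting an instance of 'a' to the compile-time Greeter would throw
        // ClassCastException, even though the names match.
    }
}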


JVM Memory: Why “Just Increase Heap” Is Bad Advice

This is where most JVM myths live.

JVM Memory ≠ Heap

The JVM uses more memory than you think:

  • Heap (Young + Old)
  • Metaspace (class metadata)
  • Thread stacks
  • Native memory
  • Direct buffers
  • Code cache

Important truth:

-Xmx limits the heap, not the JVM’s total memory.

This is why:

  • Containers OOM-kill Java apps
  • Kubernetes limits feel “ignored”
  • Metaspace OOMs happen unexpectedly
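
If you want to see where the non-heap memory actually goes, Native Memory Tracking is the simplest place to start, and in containers you can size the heap as a fraction of the container limit instead of hard-coding -Xmx. A sketch (app.jar and <pid> are placeholders; the report layout varies by JDK version):

java -XX:NativeMemoryTracking=summary -jar app.jar
jcmd <pid> VM.native_memory summary
java -XX:MaxRAMPercentage=75.0 -jar app.jar

The first flag turns on tracking (with a small overhead), the jcmd call breaks the footprint down into heap, metaspace, thread stacks, code cache and friends, and MaxRAMPercentage (JDK 10+) ties the heap to whatever memory the container actually has.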

Why Bigger Heap Can Make Things Worse

A larger heap:

  • Means longer (if less frequent) GC cycles
  • Increases pause time risk
  • Can hide memory leaks until it’s too late

In production:

A stable, smaller heap often performs better than a massive one.


Garbage Collection: Not About Speed but About Predictability

Garbage Collection exists because:

Manual memory management does not scale for humans.

But GC is not free.

Every GC strategy is a tradeoff between:

  • Throughput
  • Latency
  • Memory footprint
  • Predictability

Modern collectors (G1, ZGC, Shenandoah) optimize for:

  • Shorter pauses
  • More consistent latency

Key realization:

GC tuning is about controlling pauses, not eliminating them.

And yes, stop-the-world (STW) pauses still exist.
They’re just shorter and smarter now.
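
If you want to experiment, the collector and its pause goal are just startup flags. A sketch (ZGC and Shenandoah availability depends on your JDK version and build; MaxGCPauseMillis is a goal, not a guarantee):

java -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -jar app.jar
java -XX:+UseZGC -jar app.jar
java -XX:+UseShenandoahGC -jar app.jar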


JIT Compilation: Your Code Changes While Running

This part blows minds.

The JVM:

  • Profiles your code
  • Assumes patterns
  • Optimizes aggressively
  • Deoptimizes when assumptions break

This explains:

  • Why traffic shape matters
  • Why long-running services behave differently
  • Why redeploying can reset performance

The JVM is constantly asking:

“Is this still the fastest way to run this code?”
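
Here is a hedged sketch of that conversation (the Shape/Circle/Square names are mine). Run it with -XX:+PrintCompilation and look for methods being "made not entrant" once the second implementation shows up: that’s the JIT discarding code whose speculative assumptions just broke.

// Compile and run: javac Deopt.java && java -XX:+PrintCompilation Deopt
interface Shape { double area(); }

final class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

final class Square implements Shape {
    private final double s;
    Square(double s) { this.s = s; }
    public double area() { return s * s; }
}

public class Deopt {
    // While only Circle is ever observed here, the JIT can speculate,
    // inline Circle.area(), and skip the virtual dispatch entirely.
    static double total(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area();
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = new Shape[1_000];
        for (int i = 0; i < shapes.length; i++) shapes[i] = new Circle(i);

        for (int i = 0; i < 20_000; i++) total(shapes); // warm up with a single receiver type

        shapes[0] = new Square(3); // break the assumption
        System.out.println(total(shapes)); // may trigger deoptimization and recompilation
    }
}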


Threads, CPU, and the Illusion of Parallelism

Java threads map to OS threads.

Which means:

  • Context switching is expensive
  • More threads ≠ more throughput
  • Blocking IO kills scalability

This is why:

  • CPU spikes on “idle” systems
  • Thread dumps reveal chaos
  • Virtual threads exist (finally)

So, behind the scenes:

Threads compete. They don’t cooperate.
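
And here is why virtual threads matter, as a sketch assuming JDK 21+ (names are illustrative): ten thousand blocking tasks without ten thousand OS threads, because a blocked virtual thread parks instead of holding an OS thread hostage.

import java.time.Duration;
import java.util.concurrent.Executors;

public class VirtualThreadsDemo {
    public static void main(String[] args) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    Thread.sleep(Duration.ofSeconds(1)); // blocking parks the virtual thread, not an OS thread
                    return null;
                });
            }
        } // close() waits for the submitted tasks to finish
        System.out.println("10,000 blocking tasks, only a handful of OS carrier threads");
    }
}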


Why JVM Apps Sometimes Fail in Production

Most JVM outages are not caused by:

  • Java being slow
  • The GC being broken
  • The JVM being bad

They’re caused by:

  • Wrong mental models
  • Blind tuning
  • Ignoring runtime behavior
  • Treating JVM like a black box

Common mistakes:

  • Increasing heap without GC analysis
  • Ignoring native memory
  • No GC logs
  • No understanding of startup vs steady state

The JVM Debugging Toolkit (Minimal, Powerful)

You don’t need fancy tools to understand the JVM.
Learning just these is usually enough:

  • jcmd
  • jstack
  • jmap
  • GC logs

Most production mysteries can be explained with these tools alone.
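
For reference, the invocations I reach for first look like this (a sketch; <pid> is the Java process ID, and jcmd <pid> help lists what your JDK actually supports):

jcmd <pid> VM.flags            (the flags the JVM is actually running with)
jcmd <pid> GC.heap_info        (current heap usage)
jcmd <pid> Thread.print        (thread dump, same idea as jstack)
jstack <pid>                   (classic thread dump)
jmap -histo:live <pid>         (object histogram; the :live option forces a GC first)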


The JVM (If You Want to Remember Only One Thing)

Remember this:

The JVM is a self-optimizing runtime that trades predictability for performance.

Once you accept that:

  • Java feels less random
  • Performance issues feel explainable
  • Production debugging gets calmer

Conclusion

Java doesn’t run on your machine.

It negotiates with it.
Continuously.
Aggressively.

And once you understand the JVM, Java stops feeling slow
and starts feeling honest.


A 5 Minute GC Log Walkthrough (That Actually Helps)

GC logs look scary until you know what to look for.
Let’s decode the only things that matter.

1. Enable GC Logging (Modern JVM)

-Xlog:gc*:stdout:time,level,tags

This gives you:

  • When GC happened
  • Which collector ran
  • How long the pause was
  • How much memory was reclaimed
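
In production you will usually want this in a rotated file rather than stdout. A sketch of the same unified-logging flag with file output (the sizes are just examples):

-Xlog:gc*:file=gc.log:time,uptime,level,tags:filecount=5,filesize=20M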

2. A Real GC Log Line (Simplified)

[3.421s][info][gc] GC(12) Pause Young (Normal) 512M->128M(2048M) 45.6ms

3. How to Read It (Left → Right)

3.421s
→ Time since JVM start
If this is early, you’re still warming up.

GC(12)
→ 12th GC cycle
High frequency = allocation pressure.

Pause Young
→ Minor GC
Usually healthy. Frequent minor GCs are fine until the pause time starts to matter for latency.

512M->128M(2048M)
→ Heap used before the pause → heap used after (total heap size)
If “after” keeps growing, you’re promoting objects too fast.

45.6ms
→ Stop-the-world pause
This is the number users feel.

4. What Healthy GC Looks Like

✔ Young GC
✔ Short pauses (<50ms)
✔ Old Gen stays mostly flat
✔ No Full GC during normal load

5. Red Flags to Watch For

Pause > 200ms → Latency spikes
Old Gen always growing → Memory leak or bad object lifetime
Frequent Full GC → Heap too small or wrong GC
GC during low traffic → Memory fragmentation

6. One Golden Rule

GC logs tell a story. Don’t read them line by line; look for patterns.

Most JVM issues are visible 10 lines into the log.


7. When GC Logs Save Production

If you ever think:

  • “CPU is high but traffic is low”
  • “Latency spikes every few minutes”
  • “Heap increase didn’t help”

GC logs will usually explain why.


If you liked this post, just let me know in the comments.
