<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pavan Pothuganti</title>
    <description>The latest articles on DEV Community by Pavan Pothuganti (@pavan_pothuganti).</description>
    <link>https://dev.to/pavan_pothuganti</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4010443%2F4d8c60d1-a3c7-4f84-8536-6e7c39bd187d.jpg</url>
      <title>DEV Community: Pavan Pothuganti</title>
      <link>https://dev.to/pavan_pothuganti</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pavan_pothuganti"/>
    <language>en</language>
    <item>
      <title>How Does Boosting Actually Learn from Mistakes?</title>
      <dc:creator>Pavan Pothuganti</dc:creator>
      <pubDate>Sat, 04 Jul 2026 14:27:32 +0000</pubDate>
      <link>https://dev.to/pavan_pothuganti/how-does-boosting-actually-learn-from-mistakes-1jl9</link>
      <guid>https://dev.to/pavan_pothuganti/how-does-boosting-actually-learn-from-mistakes-1jl9</guid>
      <description>&lt;p&gt;In my previous article, I realized something important.&lt;/p&gt;

&lt;p&gt;Random Forest reduces variance.&lt;/p&gt;

&lt;p&gt;But reducing variance doesn't eliminate every mistake.&lt;/p&gt;

&lt;p&gt;That naturally led to another question.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;If Boosting learns from mistakes, how does it actually do that?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Does it remember the wrong predictions?&lt;/p&gt;

&lt;p&gt;Does it retrain the entire model?&lt;/p&gt;

&lt;p&gt;Does it delete bad trees?&lt;/p&gt;

&lt;p&gt;I had all of these questions.&lt;/p&gt;

&lt;p&gt;The answer turned out to be much simpler than I expected.&lt;/p&gt;

&lt;h2&gt;
  
  
  Imagine You're Learning for an Exam
&lt;/h2&gt;

&lt;p&gt;Suppose you write a mock test.&lt;/p&gt;

&lt;p&gt;You answer 100 questions.&lt;/p&gt;

&lt;p&gt;After checking the results, your teacher circles the questions you got wrong.&lt;/p&gt;

&lt;p&gt;Now imagine your teacher says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Don't study everything again."&lt;/p&gt;

&lt;p&gt;"Spend more time on the questions you missed."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's exactly what Boosting tries to do.&lt;/p&gt;

&lt;p&gt;It doesn't restart from scratch.&lt;/p&gt;

&lt;p&gt;It focuses its attention on the difficult examples.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Build the First Model
&lt;/h2&gt;

&lt;p&gt;Boosting starts with a simple model.&lt;/p&gt;

&lt;p&gt;That model makes predictions.&lt;/p&gt;

&lt;p&gt;Some predictions are correct.&lt;/p&gt;

&lt;p&gt;Some are wrong.&lt;/p&gt;

&lt;p&gt;Nothing unusual so far.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Identify the Difficult Examples
&lt;/h2&gt;

&lt;p&gt;Instead of celebrating the correct predictions, Boosting asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Where did I fail?"&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those wrongly predicted records become much more important.&lt;/p&gt;

&lt;p&gt;They're no longer treated like ordinary training examples.&lt;/p&gt;

&lt;p&gt;They receive extra attention.&lt;/p&gt;

&lt;p&gt;You can think of them as being highlighted with a marker.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Build Another Model
&lt;/h2&gt;

&lt;p&gt;Now comes the interesting part.&lt;/p&gt;

&lt;p&gt;The next model isn't trained to repeat the same work.&lt;/p&gt;

&lt;p&gt;It's trained with greater emphasis on the examples the previous model struggled with.&lt;/p&gt;

&lt;p&gt;Its goal is simple.&lt;/p&gt;

&lt;p&gt;Not to replace the first model.&lt;/p&gt;

&lt;p&gt;To improve it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Repeat Again
&lt;/h2&gt;

&lt;p&gt;The second model still makes some mistakes.&lt;/p&gt;

&lt;p&gt;Now a third model focuses on those remaining errors.&lt;/p&gt;

&lt;p&gt;Then a fourth.&lt;/p&gt;

&lt;p&gt;Then a fifth.&lt;/p&gt;

&lt;p&gt;Every new model tries to improve what came before it.&lt;/p&gt;

&lt;p&gt;Instead of creating independent experts, Boosting creates a team where every member learns from the previous member's experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Is Different from Random Forest
&lt;/h2&gt;

&lt;p&gt;Random Forest trains many trees independently.&lt;/p&gt;

&lt;p&gt;None of them knows what the others predicted.&lt;/p&gt;

&lt;p&gt;It's like asking 100 students to solve an exam without allowing them to discuss the answers.&lt;/p&gt;

&lt;p&gt;Boosting is different.&lt;/p&gt;

&lt;p&gt;Every new model studies the mistakes made by the previous one before it starts learning.&lt;/p&gt;

&lt;p&gt;It's more like a teacher reviewing each student's paper before giving the next assignment.&lt;/p&gt;

&lt;p&gt;The learning process becomes sequential rather than independent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Does Boosting Memorize Mistakes?
&lt;/h2&gt;

&lt;p&gt;This was another question I had.&lt;/p&gt;

&lt;p&gt;Not exactly.&lt;/p&gt;

&lt;p&gt;Boosting doesn't simply remember wrong predictions.&lt;/p&gt;

&lt;p&gt;Instead, it changes the learning process so that difficult examples influence future models more strongly.&lt;/p&gt;

&lt;p&gt;The objective isn't to memorize.&lt;/p&gt;

&lt;p&gt;The objective is to gradually improve.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Isn't One Powerful Model Enough?
&lt;/h2&gt;

&lt;p&gt;Because one model rarely captures every pattern perfectly.&lt;/p&gt;

&lt;p&gt;Each model discovers part of the solution.&lt;/p&gt;

&lt;p&gt;The next model fills some of the remaining gaps.&lt;/p&gt;

&lt;p&gt;Over multiple iterations, the combined model becomes much stronger than any individual learner.&lt;/p&gt;

&lt;p&gt;That's why Boosting is often described as turning many weak learners into one strong learner.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happens Next?
&lt;/h2&gt;

&lt;p&gt;At this point, we understand the idea.&lt;/p&gt;

&lt;p&gt;But another question naturally appears.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;How does the algorithm decide which mistakes deserve more attention?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's where AdaBoost enters the picture.&lt;/p&gt;

&lt;p&gt;AdaBoost introduces a clever mechanism called &lt;strong&gt;sample weights&lt;/strong&gt;, allowing difficult training examples to receive progressively more importance after every iteration.&lt;/p&gt;

&lt;p&gt;We'll explore that in the next article.&lt;/p&gt;




&lt;h3&gt;
  
  
  Key Takeaway
&lt;/h3&gt;

&lt;p&gt;Boosting doesn't build many models and hope that voting fixes everything.&lt;/p&gt;

&lt;p&gt;It builds models one after another.&lt;/p&gt;

&lt;p&gt;Each new model is influenced by the mistakes made by the previous models.&lt;/p&gt;

&lt;p&gt;Instead of asking many independent experts for their opinions, Boosting creates a learning process where every new expert studies the errors of the last one before offering a better solution.&lt;/p&gt;

</description>
      <category>algorithms</category>
      <category>beginners</category>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Why Do Decision Trees Have High Variance?</title>
      <dc:creator>Pavan Pothuganti</dc:creator>
      <pubDate>Sat, 04 Jul 2026 14:24:44 +0000</pubDate>
      <link>https://dev.to/pavan_pothuganti/why-do-decision-trees-have-high-variance-lj8</link>
      <guid>https://dev.to/pavan_pothuganti/why-do-decision-trees-have-high-variance-lj8</guid>
      <description>&lt;p&gt;Every Machine Learning course eventually says this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Decision Trees have high variance."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When I first heard that, I accepted it and moved on.&lt;/p&gt;

&lt;p&gt;But later, I stopped and asked myself a simple question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What does that actually mean?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not the textbook definition.&lt;/p&gt;

&lt;p&gt;What is the model really doing that makes everyone call it a "high variance" algorithm?&lt;/p&gt;

&lt;p&gt;That question completely changed how I understood Decision Trees.&lt;/p&gt;

&lt;h2&gt;
  
  
  Imagine Building Two Decision Trees
&lt;/h2&gt;

&lt;p&gt;Suppose you have a dataset with 10,000 customer records.&lt;/p&gt;

&lt;p&gt;You train a Decision Tree.&lt;/p&gt;

&lt;p&gt;Now imagine removing just a few hundred records and training the model again.&lt;/p&gt;

&lt;p&gt;You might expect the new tree to look almost identical.&lt;/p&gt;

&lt;p&gt;After all:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The algorithm is the same.&lt;/li&gt;
&lt;li&gt;Most of the data is the same.&lt;/li&gt;
&lt;li&gt;The problem hasn't changed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Surprisingly, that's often &lt;strong&gt;not&lt;/strong&gt; what happens.&lt;/p&gt;

&lt;p&gt;The new tree may choose a different root feature.&lt;/p&gt;

&lt;p&gt;Different splits.&lt;/p&gt;

&lt;p&gt;Different branches.&lt;/p&gt;

&lt;p&gt;Different predictions.&lt;/p&gt;

&lt;p&gt;A tiny change in the training data can completely reshape the tree.&lt;/p&gt;

&lt;p&gt;That isn't a bug.&lt;/p&gt;

&lt;p&gt;It's the nature of Decision Trees.&lt;/p&gt;

&lt;h2&gt;
  
  
  But Why Does This Happen?
&lt;/h2&gt;

&lt;p&gt;A Decision Tree builds itself one split at a time.&lt;/p&gt;

&lt;p&gt;At every step, it asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Which feature gives me the best split right now?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Sometimes two features are almost equally good.&lt;/p&gt;

&lt;p&gt;A small change in the training data can make Feature A slightly better than Feature B.&lt;/p&gt;

&lt;p&gt;Once the root node changes, everything below it changes as well.&lt;/p&gt;

&lt;p&gt;It's like taking a different road at the first intersection.&lt;/p&gt;

&lt;p&gt;Even though the destination is the same, the entire journey becomes different.&lt;/p&gt;

&lt;p&gt;One small decision near the top creates a completely different tree.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Domino Effect
&lt;/h2&gt;

&lt;p&gt;Think about a family tree.&lt;/p&gt;

&lt;p&gt;If the first branch changes, every branch below it changes too.&lt;/p&gt;

&lt;p&gt;Decision Trees behave in a similar way.&lt;/p&gt;

&lt;p&gt;A different root node leads to different child nodes.&lt;/p&gt;

&lt;p&gt;Different child nodes lead to different grandchildren.&lt;/p&gt;

&lt;p&gt;One early decision affects the entire structure.&lt;/p&gt;

&lt;p&gt;That's why even a small change in the data can produce a very different model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Is That a Problem?
&lt;/h2&gt;

&lt;p&gt;Imagine predicting whether a customer will buy a product.&lt;/p&gt;

&lt;p&gt;You train one Decision Tree today.&lt;/p&gt;

&lt;p&gt;Tomorrow, you collect a little more data and train it again.&lt;/p&gt;

&lt;p&gt;Now the predictions change noticeably.&lt;/p&gt;

&lt;p&gt;The model isn't stable.&lt;/p&gt;

&lt;p&gt;It reacts strongly to changes in the training data.&lt;/p&gt;

&lt;p&gt;That instability is exactly what machine learning calls &lt;strong&gt;high variance&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The issue isn't that Decision Trees are inaccurate.&lt;/p&gt;

&lt;p&gt;The issue is that they're sensitive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Does High Variance Mean Decision Trees Are Bad?
&lt;/h2&gt;

&lt;p&gt;Not at all.&lt;/p&gt;

&lt;p&gt;Decision Trees are powerful because they can learn complex patterns without requiring feature scaling or linear relationships.&lt;/p&gt;

&lt;p&gt;The trade-off is that this flexibility makes them more likely to overfit the training data.&lt;/p&gt;

&lt;p&gt;They're excellent learners.&lt;/p&gt;

&lt;p&gt;Sometimes they're just a little too eager to memorize.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Question That Naturally Follows
&lt;/h2&gt;

&lt;p&gt;Once I understood why Decision Trees have high variance, another question came to mind.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;If the problem is instability, why not train many Decision Trees instead of trusting just one?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That simple question led me to Bagging and, eventually, Random Forest.&lt;/p&gt;

&lt;p&gt;And that's exactly where the next article begins.&lt;/p&gt;




&lt;h3&gt;
  
  
  Key Takeaway
&lt;/h3&gt;

&lt;p&gt;A Decision Tree has high variance not because it is a poor algorithm, but because it is highly sensitive to the data it learns from.&lt;/p&gt;

&lt;p&gt;Even a small change in the training data can produce a completely different tree.&lt;/p&gt;

&lt;p&gt;Understanding that single idea makes it much easier to understand why Bagging and Random Forest were created.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>decisiontree</category>
      <category>highvariance</category>
    </item>
    <item>
      <title>If Random Forest Already Reduces Variance, Why Do We Still Need Boosting?</title>
      <dc:creator>Pavan Pothuganti</dc:creator>
      <pubDate>Sat, 04 Jul 2026 14:22:26 +0000</pubDate>
      <link>https://dev.to/pavan_pothuganti/if-random-forest-already-reduces-variance-why-do-we-still-need-boosting-1fdp</link>
      <guid>https://dev.to/pavan_pothuganti/if-random-forest-already-reduces-variance-why-do-we-still-need-boosting-1fdp</guid>
      <description>&lt;p&gt;After learning Decision Trees, I understood why they overfit.&lt;/p&gt;

&lt;p&gt;After learning Bagging, I understood how training multiple trees makes predictions more stable.&lt;/p&gt;

&lt;p&gt;After learning Random Forest, I thought I had reached the final destination.&lt;/p&gt;

&lt;p&gt;Then I discovered another family of algorithms:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boosting.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My immediate question was simple.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;If Random Forest already solved the problem, why did researchers invent Boosting?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The answer completely changed how I think about machine learning models.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mistake I Was Making
&lt;/h2&gt;

&lt;p&gt;I assumed reducing variance meant reducing errors.&lt;/p&gt;

&lt;p&gt;Those sound similar.&lt;/p&gt;

&lt;p&gt;They're not.&lt;/p&gt;

&lt;p&gt;Reducing variance simply means making the model &lt;strong&gt;more stable&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It does &lt;strong&gt;not&lt;/strong&gt; mean the model suddenly becomes perfect.&lt;/p&gt;

&lt;p&gt;That distinction is easy to miss.&lt;/p&gt;

&lt;h2&gt;
  
  
  Imagine a Classroom
&lt;/h2&gt;

&lt;p&gt;Suppose 100 students solve the same exam paper.&lt;/p&gt;

&lt;p&gt;Instead of trusting one student, you decide to trust the majority.&lt;/p&gt;

&lt;p&gt;If one student makes a silly mistake, the others correct it.&lt;/p&gt;

&lt;p&gt;That's exactly what Random Forest does.&lt;/p&gt;

&lt;p&gt;It replaces the opinion of one Decision Tree with the collective opinion of many trees.&lt;/p&gt;

&lt;p&gt;Random mistakes become much less important.&lt;/p&gt;

&lt;p&gt;But here's the interesting part.&lt;/p&gt;

&lt;h2&gt;
  
  
  What If Every Student Doesn't Know One Chapter?
&lt;/h2&gt;

&lt;p&gt;Imagine every student skipped the same chapter before the exam.&lt;/p&gt;

&lt;p&gt;Now everyone answers one question incorrectly.&lt;/p&gt;

&lt;p&gt;Does asking 100 students help?&lt;/p&gt;

&lt;p&gt;No.&lt;/p&gt;

&lt;p&gt;The majority is still wrong.&lt;/p&gt;

&lt;p&gt;This is exactly what can happen in Random Forest.&lt;/p&gt;

&lt;p&gt;If every tree struggles with a particular pattern, majority voting cannot invent the correct answer.&lt;/p&gt;

&lt;p&gt;The model has become more stable.&lt;/p&gt;

&lt;p&gt;It hasn't become all-knowing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stability Isn't the Same as Learning
&lt;/h2&gt;

&lt;p&gt;This was the biggest realization for me.&lt;/p&gt;

&lt;p&gt;Random Forest mainly answers this question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How can we make predictions more consistent?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Boosting answers a completely different question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How can we improve the mistakes that still remain?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those are not the same objective.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Different Philosophy
&lt;/h2&gt;

&lt;p&gt;Random Forest builds many trees independently.&lt;/p&gt;

&lt;p&gt;Each tree finishes its work without knowing what the others predicted.&lt;/p&gt;

&lt;p&gt;Boosting works differently.&lt;/p&gt;

&lt;p&gt;It builds one model.&lt;/p&gt;

&lt;p&gt;Then it studies where that model failed.&lt;/p&gt;

&lt;p&gt;The next model is trained to pay more attention to those difficult cases.&lt;/p&gt;

&lt;p&gt;When that model finishes, another model focuses on the remaining errors.&lt;/p&gt;

&lt;p&gt;Instead of asking many models for independent opinions, Boosting creates a sequence of models where each one learns from the previous one.&lt;/p&gt;

&lt;p&gt;It's more like coaching than voting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Both Algorithms Exist
&lt;/h2&gt;

&lt;p&gt;Random Forest is excellent when the main issue is instability.&lt;/p&gt;

&lt;p&gt;Boosting is powerful when you want to squeeze out the remaining errors by continuously improving the model.&lt;/p&gt;

&lt;p&gt;Neither algorithm replaces the other.&lt;/p&gt;

&lt;p&gt;They solve different problems.&lt;/p&gt;

&lt;p&gt;One focuses on stability.&lt;/p&gt;

&lt;p&gt;The other focuses on improvement.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Question That Changed My Understanding
&lt;/h2&gt;

&lt;p&gt;I stopped asking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Which algorithm is better?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Instead, I started asking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"What problem is this algorithm trying to solve?"&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That single question made ensemble learning much easier to understand.&lt;/p&gt;

&lt;p&gt;Instead of memorizing algorithms, I began understanding the reason they exist.&lt;/p&gt;

&lt;p&gt;And once I understood the reason, remembering the algorithms became effortless.&lt;/p&gt;




&lt;h3&gt;
  
  
  Key Takeaway
&lt;/h3&gt;

&lt;p&gt;Random Forest reduces the randomness of Decision Trees.&lt;/p&gt;

&lt;p&gt;Boosting reduces the mistakes that still remain after that randomness has been controlled.&lt;/p&gt;

&lt;p&gt;One algorithm stabilizes learning.&lt;/p&gt;

&lt;p&gt;The other continuously improves learning.&lt;/p&gt;

&lt;p&gt;That difference is why both continue to be among the most important ensemble techniques in machine learning.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>randomforest</category>
      <category>boosting</category>
    </item>
    <item>
      <title>If Bagging Already Uses 100 Trees, Why Was Random Forest Invented?</title>
      <dc:creator>Pavan Pothuganti</dc:creator>
      <pubDate>Sat, 04 Jul 2026 14:12:44 +0000</pubDate>
      <link>https://dev.to/pavan_pothuganti/if-bagging-already-uses-100-trees-why-was-random-forest-invented-4kab</link>
      <guid>https://dev.to/pavan_pothuganti/if-bagging-already-uses-100-trees-why-was-random-forest-invented-4kab</guid>
      <description>&lt;p&gt;After finally understanding Bagging, I thought I was done with ensemble learning.&lt;/p&gt;

&lt;p&gt;The idea made sense.&lt;/p&gt;

&lt;p&gt;Take multiple bootstrap samples.&lt;/p&gt;

&lt;p&gt;Train multiple Decision Trees.&lt;/p&gt;

&lt;p&gt;Combine their predictions using majority voting.&lt;/p&gt;

&lt;p&gt;Variance decreases.&lt;/p&gt;

&lt;p&gt;Simple.&lt;/p&gt;

&lt;p&gt;Then I came across another algorithm:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Random Forest.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My first reaction was honest:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Wait... isn't this just Bagging with a fancy name?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It turns out the answer is &lt;strong&gt;no&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And the reason is surprisingly interesting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bagging Solves One Problem
&lt;/h2&gt;

&lt;p&gt;Bagging makes Decision Trees more stable.&lt;/p&gt;

&lt;p&gt;Each tree is trained on a different bootstrap sample, so they don't all learn exactly the same data.&lt;/p&gt;

&lt;p&gt;That reduces variance.&lt;/p&gt;

&lt;p&gt;At this point, I assumed every tree would become completely different.&lt;/p&gt;

&lt;p&gt;But that's not always true.&lt;/p&gt;

&lt;h2&gt;
  
  
  Different Data Doesn't Mean Different Thinking
&lt;/h2&gt;

&lt;p&gt;Imagine you're building a model to predict house prices.&lt;/p&gt;

&lt;p&gt;Your features are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Location&lt;/li&gt;
&lt;li&gt;Area&lt;/li&gt;
&lt;li&gt;Number of bedrooms&lt;/li&gt;
&lt;li&gt;Age of the house&lt;/li&gt;
&lt;li&gt;Parking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Suppose &lt;strong&gt;Area&lt;/strong&gt; is by far the strongest predictor.&lt;/p&gt;

&lt;p&gt;Even though every tree receives a different bootstrap dataset, most of them will still discover the same thing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Area is the best feature to split on."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So what happens?&lt;/p&gt;

&lt;p&gt;Tree after tree starts with the same root node.&lt;/p&gt;

&lt;p&gt;Many of them grow in very similar ways.&lt;/p&gt;

&lt;p&gt;They are trained on different data, but they still think alike.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Is That a Problem?
&lt;/h2&gt;

&lt;p&gt;Imagine asking 100 people the same question.&lt;/p&gt;

&lt;p&gt;If every person has exactly the same information and thinks in exactly the same way, you'll probably hear the same answer 100 times.&lt;/p&gt;

&lt;p&gt;Even if that answer is wrong.&lt;/p&gt;

&lt;p&gt;Now compare that with asking 100 experts from different backgrounds.&lt;/p&gt;

&lt;p&gt;One notices something others missed.&lt;/p&gt;

&lt;p&gt;Another approaches the problem differently.&lt;/p&gt;

&lt;p&gt;The diversity of opinions often leads to a better final decision.&lt;/p&gt;

&lt;p&gt;Random Forest tries to create that diversity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Extra Randomness
&lt;/h2&gt;

&lt;p&gt;Bagging changes the &lt;strong&gt;rows&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Random Forest changes both the &lt;strong&gt;rows&lt;/strong&gt; and the &lt;strong&gt;features&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of allowing every tree to examine every feature, Random Forest randomly selects a subset of features whenever a split is made.&lt;/p&gt;

&lt;p&gt;Now imagine the strongest feature isn't available for a particular split.&lt;/p&gt;

&lt;p&gt;The tree is forced to explore another path.&lt;/p&gt;

&lt;p&gt;One tree may begin with &lt;strong&gt;Area&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Another may begin with &lt;strong&gt;Location&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Another may start with &lt;strong&gt;Age&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The trees become less similar.&lt;/p&gt;

&lt;p&gt;And that's exactly what we want.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Diversity Matters
&lt;/h2&gt;

&lt;p&gt;If every tree makes the same mistake, majority voting cannot help.&lt;/p&gt;

&lt;p&gt;If different trees make different mistakes, majority voting becomes much more powerful.&lt;/p&gt;

&lt;p&gt;Random Forest doesn't just build more trees.&lt;/p&gt;

&lt;p&gt;It builds &lt;strong&gt;more independent trees&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That small difference is what makes the algorithm so effective.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lesson That Changed My Perspective
&lt;/h2&gt;

&lt;p&gt;For a long time, I thought Random Forest was simply:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Bagging + a random trick."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now I see it differently.&lt;/p&gt;

&lt;p&gt;Bagging asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How do we make Decision Trees more stable?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Random Forest asks:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"How do we make those trees think differently?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Those are two completely different questions.&lt;/p&gt;

&lt;p&gt;And that's why Random Forest usually performs better than plain Bagging with Decision Trees.&lt;/p&gt;




&lt;h3&gt;
  
  
  Key Takeaway
&lt;/h3&gt;

&lt;p&gt;Bagging creates multiple Decision Trees using different datasets.&lt;/p&gt;

&lt;p&gt;Random Forest goes one step further by ensuring those trees don't all rely on the same features.&lt;/p&gt;

&lt;p&gt;It's not about creating &lt;strong&gt;more trees&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It's about creating &lt;strong&gt;more diverse trees&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Sometimes, diversity is more valuable than quantity.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>bagging</category>
      <category>randonforest</category>
    </item>
    <item>
      <title>If Decision Trees Have High Variance, Why Does Bagging Actually Work?</title>
      <dc:creator>Pavan Pothuganti</dc:creator>
      <pubDate>Sat, 04 Jul 2026 14:07:58 +0000</pubDate>
      <link>https://dev.to/pavan_pothuganti/if-decision-trees-have-high-variance-why-does-bagging-actually-work-2758</link>
      <guid>https://dev.to/pavan_pothuganti/if-decision-trees-have-high-variance-why-does-bagging-actually-work-2758</guid>
      <description>&lt;p&gt;When I first learned about Decision Trees, everyone said the same thing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Decision Trees have high variance."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then they immediately introduced Bagging and Random Forest.&lt;/p&gt;

&lt;p&gt;At first, I accepted it.&lt;/p&gt;

&lt;p&gt;Later, one question kept bothering me:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does training 100 Decision Trees suddenly solve the problem?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After all, if one tree makes mistakes, why would building 99 more trees magically improve anything?&lt;/p&gt;

&lt;p&gt;That question completely changed how I understood ensemble learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem Isn't That Decision Trees Are "Bad"
&lt;/h2&gt;

&lt;p&gt;Imagine training a Decision Tree on a dataset of 10,000 records.&lt;/p&gt;

&lt;p&gt;Now remove just a small percentage of those records and train another tree.&lt;/p&gt;

&lt;p&gt;Surprisingly, the new tree may have a completely different structure.&lt;/p&gt;

&lt;p&gt;Different root node.&lt;/p&gt;

&lt;p&gt;Different branches.&lt;/p&gt;

&lt;p&gt;Different predictions.&lt;/p&gt;

&lt;p&gt;The algorithm didn't change.&lt;/p&gt;

&lt;p&gt;The problem didn't change.&lt;/p&gt;

&lt;p&gt;Only a small part of the training data changed.&lt;/p&gt;

&lt;p&gt;That is what people mean when they say a Decision Tree has &lt;strong&gt;high variance&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It reacts strongly to small changes in the training data.&lt;/p&gt;

&lt;h2&gt;
  
  
  My First Wrong Assumption
&lt;/h2&gt;

&lt;p&gt;Initially I thought:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"If one tree is unstable, then training many unstable trees should make the situation even worse."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That sounds logical.&lt;/p&gt;

&lt;p&gt;But that's not what actually happens.&lt;/p&gt;

&lt;p&gt;The secret lies in &lt;strong&gt;how&lt;/strong&gt; those trees are trained.&lt;/p&gt;

&lt;h2&gt;
  
  
  Every Tree Sees a Different World
&lt;/h2&gt;

&lt;p&gt;Bagging doesn't clone the same Decision Tree 100 times.&lt;/p&gt;

&lt;p&gt;Instead, it creates multiple bootstrap datasets.&lt;/p&gt;

&lt;p&gt;Each dataset contains mostly the same records, but not exactly the same ones.&lt;/p&gt;

&lt;p&gt;Every tree learns from a slightly different version of reality.&lt;/p&gt;

&lt;p&gt;As a result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One tree may overfit one noisy pattern.&lt;/li&gt;
&lt;li&gt;Another tree may never even see that noisy pattern.&lt;/li&gt;
&lt;li&gt;A third tree may split the data in a completely different way.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each tree develops its own strengths and weaknesses.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Power Isn't the Trees
&lt;/h2&gt;

&lt;p&gt;The real power is the &lt;strong&gt;disagreement&lt;/strong&gt; between them.&lt;/p&gt;

&lt;p&gt;Suppose you're trying to classify an image.&lt;/p&gt;

&lt;p&gt;Tree 1 predicts &lt;strong&gt;Cat&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Tree 2 predicts &lt;strong&gt;Cat&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Tree 3 predicts &lt;strong&gt;Dog&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Tree 4 predicts &lt;strong&gt;Cat&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Tree 5 predicts &lt;strong&gt;Cat&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;One tree made a mistake.&lt;/p&gt;

&lt;p&gt;Four didn't.&lt;/p&gt;

&lt;p&gt;Instead of trusting one unstable model, Bagging trusts the collective decision.&lt;/p&gt;

&lt;p&gt;The random mistakes made by individual trees are often cancelled out by the majority.&lt;/p&gt;

&lt;p&gt;That is why variance decreases.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Question That Came Next
&lt;/h2&gt;

&lt;p&gt;After understanding this, another question immediately came to mind.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"What if all 100 trees are wrong?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And the answer surprised me.&lt;/p&gt;

&lt;p&gt;Yes, it can happen.&lt;/p&gt;

&lt;p&gt;Imagine the training data itself is missing an important feature.&lt;/p&gt;

&lt;p&gt;Or the labels contain systematic errors.&lt;/p&gt;

&lt;p&gt;Every bootstrap sample is created from that same dataset.&lt;/p&gt;

&lt;p&gt;Every tree learns the same incorrect pattern.&lt;/p&gt;

&lt;p&gt;Now every tree confidently makes the same wrong prediction.&lt;/p&gt;

&lt;p&gt;Voting cannot fix missing knowledge.&lt;/p&gt;

&lt;p&gt;Bagging only reduces &lt;strong&gt;random instability&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It cannot magically invent information that doesn't exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  That's When Everything Clicked
&lt;/h2&gt;

&lt;p&gt;I realized Bagging doesn't promise perfection.&lt;/p&gt;

&lt;p&gt;It promises &lt;strong&gt;stability&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A single Decision Tree may change dramatically when the training data changes.&lt;/p&gt;

&lt;p&gt;Bagging makes the final prediction much more consistent by combining many different trees.&lt;/p&gt;

&lt;p&gt;Some mistakes disappear because they were caused by randomness.&lt;/p&gt;

&lt;p&gt;Other mistakes remain because they come from genuinely difficult patterns in the data.&lt;/p&gt;

&lt;p&gt;Those remaining mistakes eventually led researchers to develop another family of algorithms called &lt;strong&gt;Boosting&lt;/strong&gt;, which takes a completely different approach.&lt;/p&gt;

&lt;p&gt;But that's a story for the next article.&lt;/p&gt;




&lt;h3&gt;
  
  
  Key Takeaway
&lt;/h3&gt;

&lt;p&gt;I stopped thinking of Bagging as "100 Decision Trees."&lt;/p&gt;

&lt;p&gt;Now I think of it as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A method that replaces the opinion of one unstable model with the collective wisdom of many independently trained models.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That single idea made ensemble learning much easier to understand.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>decisiontree</category>
      <category>bagging</category>
      <category>highvariance</category>
    </item>
  </channel>
</rss>
