<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: the_undefined_architect</title>
    <description>The latest articles on DEV Community by the_undefined_architect (@the_undefined_architect).</description>
    <link>https://dev.to/the_undefined_architect</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3873971%2Fa279615a-2cf6-4223-826d-eea964a78d93.png</url>
      <title>DEV Community: the_undefined_architect</title>
      <link>https://dev.to/the_undefined_architect</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/the_undefined_architect"/>
    <language>en</language>
    <item>
      <title>Linear Regression: Code (a) Line</title>
      <dc:creator>the_undefined_architect</dc:creator>
      <pubDate>Sat, 02 May 2026 18:19:27 +0000</pubDate>
      <link>https://dev.to/the_undefined_architect/linear-regression-code-a-line-1fhh</link>
      <guid>https://dev.to/the_undefined_architect/linear-regression-code-a-line-1fhh</guid>
      <description>&lt;p&gt;It's time to write your first ML model and predict house prices.&lt;br&gt;
To follow along, go ahead and take a look at the complete product:&lt;br&gt;
&lt;a href="https://github.com/yotambelgoroski/ml_unchained-house_pricing" rel="noopener noreferrer"&gt;https://github.com/yotambelgoroski/ml_unchained-house_pricing&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 1: It's all about data
&lt;/h2&gt;

&lt;p&gt;ML is all about data - you can't create a model without training it, and you can't train it without data.&lt;/p&gt;

&lt;p&gt;Our dataset is typically split into two parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Training data&lt;/strong&gt; - Data used to train a model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test data&lt;/strong&gt; - Once a model is trained, we can take input (x) from the test data, predict the output (ŷ), and compare that prediction to the real value (y). This tells us how well our model performs on data it has not seen during training.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;In more advanced setups, you might also see a validation set, which is used to tune the model before testing it.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Where does data come from?
&lt;/h3&gt;

&lt;p&gt;The answer depends on your business and use case. For learning purposes, Kaggle is a great source for datasets and ML resources. To keep things simple, I use a script that generates synthetic data.&lt;/p&gt;
&lt;h3&gt;
  
  
  How much data do I need for training?
&lt;/h3&gt;

&lt;p&gt;There is no fixed number — as model complexity increases, more data is required.&lt;/p&gt;

&lt;p&gt;A common rule of thumb is:&lt;br&gt;
&lt;code&gt;Have 10 to 20 times as many data points as features (independent variables)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;We currently have one feature (sqm), so I used 10 records to train the model — the bare minimum to keep things simple.&lt;/p&gt;
&lt;h3&gt;
  
  
  How much data do I need for testing?
&lt;/h3&gt;

&lt;p&gt;There are several approaches, but a simple one is to split your dataset using an &lt;strong&gt;80:20&lt;/strong&gt; ratio:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;80% for training&lt;/li&gt;
&lt;li&gt;20% for testing&lt;/li&gt;
&lt;/ul&gt;
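&lt;p&gt;An 80:20 split can be sketched in plain Python. This is a minimal illustration, not the script from the repository: &lt;code&gt;split_dataset&lt;/code&gt;, its parameters, and the sample rows are assumptions made up for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import random


def split_dataset(rows, train_ratio=0.8, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A fixed seed makes the shuffle, and therefore the split, reproducible
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (sqm, price) rows, for illustration only
rows = [(50, 140), (65, 210), (80, 180), (95, 260), (110, 240),
        (130, 330), (150, 310), (170, 420), (190, 390), (210, 470)]

train_rows, test_rows = split_dataset(rows)
print(len(train_rows), len(test_rows))  # 8 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Shuffling before cutting matters when the file is sorted (for example by size); otherwise the test set would only contain the largest houses.&lt;/p&gt;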
&lt;h2&gt;
  
  
  Step 2: Training the model
&lt;/h2&gt;

&lt;p&gt;Now that we have our dataset, it's time to train a model.&lt;/p&gt;

&lt;p&gt;Training involves three steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load the training data&lt;/li&gt;
&lt;li&gt;Train the model in memory based on that data&lt;/li&gt;
&lt;li&gt;Serialization — save the trained model to disk so it can be reused without retraining&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here is how it looks in code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;joblib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;

&lt;span class="n"&gt;FEATURE_COLS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sqm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;TARGET_COL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;MODEL_FILENAME&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;house_price_model.joblib&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_training_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;FEATURE_COLS&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TARGET_COL&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dest_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;dest_path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;joblib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dest_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Model saved → &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;dest_path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_training_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;train_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;save_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_dir&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;MODEL_FILENAME&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Model trained on &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; samples.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is it - our first model!&lt;/p&gt;

&lt;h3&gt;
  
  
  Our Dependencies
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pandas&lt;/strong&gt; — A data handling library for working with tabular data. Its core structure, the DataFrame, allows us to easily access and manipulate data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;scikit-learn&lt;/strong&gt; — A machine learning library for Python. LinearRegression is one of its models, used to learn the best linear relationship between input features and a target value.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Joblib&lt;/strong&gt; — A utility library used here for serialization. It allows us to save a trained model to disk and load it later for inference.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Congratulations — you've created your first model!&lt;/p&gt;

&lt;p&gt;However, it's not production-ready yet. Next, we’ll use the test data to evaluate how good our model really is.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Linear Regression: behind the lines</title>
      <dc:creator>the_undefined_architect</dc:creator>
      <pubDate>Sat, 18 Apr 2026 11:10:19 +0000</pubDate>
      <link>https://dev.to/the_undefined_architect/linear-regression-behind-the-lines-4jpd</link>
      <guid>https://dev.to/the_undefined_architect/linear-regression-behind-the-lines-4jpd</guid>
      <description>&lt;p&gt;As I mentioned in my &lt;a href="https://dev.to/the_undefined_architect/ml-unchained-machine-learning-for-developers-5ch3"&gt;introductory post&lt;/a&gt; to this series, I don’t want math to be a hurdle for developers who want to get into AI and machine learning. You can always start coding and come back to the math when you find yourself needing it.&lt;/p&gt;

&lt;p&gt;However, there are some concepts in linear regression that you should definitely know, since they are core to many other algorithms as well.&lt;/p&gt;

&lt;p&gt;That said, I don’t believe you should memorize equations. When developing, we rely on libraries to handle that for us. But I do recommend going through this post and building a mental model of the core concepts that are rooted in math — very simple math that you can absolutely handle.&lt;/p&gt;

&lt;h2&gt;
  
  
  On The Line
&lt;/h2&gt;

&lt;p&gt;In the previous post, we predicted the price of a 250 m² house by drawing a line based on a &lt;strong&gt;dataset&lt;/strong&gt; of 10 houses.&lt;/p&gt;

&lt;p&gt;But why that line? And how did we get a predicted value (ŷ) from an input (x)?&lt;/p&gt;

&lt;p&gt;The answer is this formula:&lt;br&gt;
&lt;code&gt;ŷ = bx + a&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;a - The &lt;strong&gt;intercept&lt;/strong&gt;, where the line crosses the Y axis (when x = 0)&lt;br&gt;
b - The &lt;strong&gt;slope&lt;/strong&gt; - how steep the line is, i.e. how much ŷ changes when x grows by 1&lt;/p&gt;

&lt;p&gt;To find &lt;strong&gt;a&lt;/strong&gt; and &lt;strong&gt;b&lt;/strong&gt;, we use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;a = ((Σy)(Σx^2) - (Σx)(Σxy)) / (n(Σx^2) - (Σx)^2)
b = (n(Σxy) - (Σx)(Σy)) / (n(Σx^2) - (Σx)^2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Don't worry! It's simpler than it seems:&lt;/p&gt;

&lt;p&gt;Σ = Total sum&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Σx = Total sum of all house sizes (50 + 65 + 80 + 95 + 110 + 130 + 150 + 170 + 190 + 210 = 1250)&lt;/li&gt;
&lt;li&gt;Σy = Total sum of all house prices (140 + 210 + 180 + 260 + 240 + 330 + 310 + 420 + 390 + 470 = 2950)&lt;/li&gt;
&lt;li&gt;Σxy = Total sum of (x · y) for each row (7000 + 13650 + 14400 + 24700 + 26400 + 42900 + 46500 + 71400 + 74100 + 98700 = 419,750)&lt;/li&gt;
&lt;li&gt;Σx² = Total sum of (x · x) for each row (2500 + 4225 + 6400 + 9025 + 12100 + 16900 + 22500 + 28900 + 36100 + 44100 = 182,750)&lt;/li&gt;
&lt;li&gt;n = Total number of rows (10 houses)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;a = ((2950)(182750) - (1250)(419750)) / (10(182750) - (1250)^2) ≈ 54.43
b = (10(419750) - (1250)(2950)) / (10(182750) - (1250)^2) ≈ 1.92

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
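&lt;p&gt;The same arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes the ten (size, price) pairs listed in the sums above; &lt;code&gt;fit_line&lt;/code&gt; is a name introduced here for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;sizes = [50, 65, 80, 95, 110, 130, 150, 170, 190, 210]       # x, in m²
prices = [140, 210, 180, 260, 240, 330, 310, 420, 390, 470]  # y


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = b * x + a."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Both formulas share the same denominator
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


a, b = fit_line(sizes, prices)
print(round(a, 2), round(b, 2))  # 54.43 1.92
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;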



&lt;p&gt;Now we can predict the price of a 250 m² house:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ŷ = bx + a
ŷ = 1.92 * 250 + 54.43
ŷ = 534.43 (about 535.57 if a and b are not rounded first)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
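&lt;p&gt;Rounding a and b before predicting shifts the result slightly. The sketch below compares both versions; it assumes the unrounded coefficients are the fractions that fall out of the sums earlier in this post (14,425,000 / 265,000 for a and 510,000 / 265,000 for b), and &lt;code&gt;predict&lt;/code&gt; is a helper name introduced here for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;a = 14425000 / 265000  # intercept, about 54.43
b = 510000 / 265000    # slope, about 1.92


def predict(x, slope, intercept):
    """Apply y_hat = slope * x + intercept."""
    return slope * x + intercept


print(round(predict(250, 1.92, 54.43), 2))  # 534.43 (rounded coefficients)
print(round(predict(250, b, a), 2))         # 535.57 (unrounded coefficients)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;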



&lt;p&gt;Or, as we saw in the previous post:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqn6d37a05idn6pviy5e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqn6d37a05idn6pviy5e.png" alt="Predicted price for 250" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How Good Is Our Line?
&lt;/h2&gt;

&lt;p&gt;We have a model — but is it a good one?&lt;/p&gt;

&lt;p&gt;For each data point, we compare the actual value (y) with the predicted value (ŷ). The difference is called a &lt;strong&gt;residual&lt;/strong&gt;:&lt;br&gt;
&lt;code&gt;residual = y - ŷ&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Let's calculate a few residuals using our formula:&lt;br&gt;
&lt;code&gt;ŷ = 1.92x + 54.43&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cud3uexbhr90k97xmoe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4cud3uexbhr90k97xmoe.png" alt="residuals" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrsdqlcqk3alxqj4bgaw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrsdqlcqk3alxqj4bgaw.png" alt="Scatterplot of residuals" width="680" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some residuals are positive (we under-predicted), some are negative (we over-predicted). If we simply summed them, they would cancel out, so we square each one and average the results:&lt;br&gt;
&lt;code&gt;MSE = (1/n) Σ(yᵢ - ŷᵢ)²&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;Mean Squared Error (MSE)&lt;/strong&gt; — a single number that tells us how wrong the model is, on average.&lt;/p&gt;
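&lt;p&gt;Here is a minimal sketch of that calculation. It assumes the same ten houses and the unrounded a and b that come out of the sums earlier in this post; the variable names are only illustrative.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;sizes = [50, 65, 80, 95, 110, 130, 150, 170, 190, 210]       # x, in m²
prices = [140, 210, 180, 260, 240, 330, 310, 420, 390, 470]  # y

a = 14425000 / 265000  # intercept, about 54.43
b = 510000 / 265000    # slope, about 1.92

predictions = [b * x + a for x in sizes]
# residual = y - y_hat, one per house
residuals = [y - y_hat for y, y_hat in zip(prices, predictions)]

# Plain sum: positives and negatives cancel, leaving only float rounding noise
residual_sum = sum(residuals)

# Squaring removes the sign, then we average
mse = sum(r ** 2 for r in residuals) / len(residuals)
print(round(mse, 2))  # 729.91
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that MSE is expressed in squared price units, so it is most useful for comparing one line against another.&lt;/p&gt;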

&lt;p&gt;Our line should have the &lt;strong&gt;lowest possible MSE&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;How to do that? We already did.&lt;/p&gt;

&lt;p&gt;The formulas for a and b are derived by minimizing the MSE.&lt;/p&gt;

&lt;p&gt;This means the line we found is the best possible line for this data in the least-squares sense: no other combination of a and b gives a lower MSE.&lt;/p&gt;
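&lt;p&gt;One way to convince yourself is to nudge a or b slightly and watch the MSE go up. The sketch below is an informal check rather than a proof; it assumes the same ten houses, and &lt;code&gt;mean_squared_error&lt;/code&gt; and the nudge sizes are choices made for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;sizes = [50, 65, 80, 95, 110, 130, 150, 170, 190, 210]       # x, in m²
prices = [140, 210, 180, 260, 240, 330, 310, 420, 390, 470]  # y

a = 14425000 / 265000  # intercept from the formulas above
b = 510000 / 265000    # slope from the formulas above


def mean_squared_error(intercept, slope):
    """MSE of the line y_hat = slope * x + intercept on the ten houses."""
    errors = [(y - (slope * x + intercept)) ** 2 for x, y in zip(sizes, prices)]
    return sum(errors) / len(errors)


best = mean_squared_error(a, b)

# Try a few nearby lines: shift the intercept by 1 or the slope by 0.05
nudges = [(1, 0), (-1, 0), (0, 0.05), (0, -0.05)]
nudged = [mean_squared_error(a + da, b + db) for da, db in nudges]

print(min([best] + nudged) == best)  # True: every nudged line has a higher MSE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;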

&lt;p&gt;In the next post, we'll code our first model!&lt;/p&gt;

</description>
      <category>mlunchained</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>Linear Regression: Putting things in line</title>
      <dc:creator>the_undefined_architect</dc:creator>
      <pubDate>Tue, 14 Apr 2026 14:27:24 +0000</pubDate>
      <link>https://dev.to/the_undefined_architect/linear-regression-putting-things-in-line-349c</link>
      <guid>https://dev.to/the_undefined_architect/linear-regression-putting-things-in-line-349c</guid>
      <description>&lt;p&gt;As every developer knows, Hello World is famously the first application you build when you start learning how to code. In the same spirit, house price prediction is one of the best beginner examples for understanding how machine learning models are trained and used.&lt;/p&gt;

&lt;p&gt;Say you want to build an app that predicts house prices. How would you do it? There is no simple if-else statement that can accurately predict the price of a house in your neighborhood.&lt;/p&gt;

&lt;p&gt;So let’s simplify the problem.&lt;/p&gt;

&lt;p&gt;Imagine you collected data for 10 houses and wanted to explore whether house size can help us predict house price. Your dataset might look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fac60skzz0dy28zi7twxx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fac60skzz0dy28zi7twxx.png" alt="Dataset of house size-price" width="476" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we plot these points on an X and Y axis, this is what we get:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4v6tizc69fxzwpy5aa2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4v6tizc69fxzwpy5aa2.png" alt="The dataset plotted on a scatterplot" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As we can see, a correlation emerges between the size of the house and its price (duh).&lt;br&gt;
But how can we define this relationship so that, given a real value of X (the house size), we can predict the price?&lt;/p&gt;

&lt;p&gt;This is where Linear Regression comes in.&lt;/p&gt;

&lt;p&gt;Instead of trying to match every point perfectly, linear regression finds a line, called the regression line, that best represents the overall trend in the data.&lt;/p&gt;

&lt;p&gt;Here’s what that looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdb924oks9cnap3bmz0bj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdb924oks9cnap3bmz0bj.png" alt="The regression line represents the overall trend" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that we have our regression line, we can actually use it to make predictions.&lt;/p&gt;

&lt;p&gt;Let’s say we want to estimate the price of a house that is 250 m².&lt;/p&gt;

&lt;p&gt;We simply take that value (X = 250), project it onto our regression line, and get the predicted price:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqn6d37a05idn6pviy5e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqn6d37a05idn6pviy5e.png" alt="Predicted price for 250" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And that’s it — we’ve trained our first model for predicting house prices.&lt;/p&gt;

&lt;p&gt;I know, I know… we didn’t go into how the model actually finds this line, or how to implement it in code. We’ll get there in the next post.&lt;/p&gt;

&lt;p&gt;For now, the goal was to give you an intuition for how Machine Learning works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how models learn from data&lt;/li&gt;
&lt;li&gt;how training shapes their behavior&lt;/li&gt;
&lt;li&gt;and most importantly — that there is no certainty, only &lt;strong&gt;probability&lt;/strong&gt; and &lt;strong&gt;prediction&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>mlunchained</category>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
    <item>
      <title>ML Unchained: Machine Learning for Developers</title>
      <dc:creator>the_undefined_architect</dc:creator>
      <pubDate>Sun, 12 Apr 2026 19:01:53 +0000</pubDate>
      <link>https://dev.to/the_undefined_architect/ml-unchained-machine-learning-for-developers-5ch3</link>
      <guid>https://dev.to/the_undefined_architect/ml-unchained-machine-learning-for-developers-5ch3</guid>
      <description>&lt;p&gt;If you’re an experienced developer, you’ve probably felt it already.&lt;/p&gt;

&lt;p&gt;Machine Learning is everywhere.&lt;br&gt;
Recommendations, pricing, search, fraud detection, and copilots.&lt;/p&gt;

&lt;p&gt;And yet — it still feels… separate from what you do.&lt;/p&gt;

&lt;p&gt;Like a different world.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why You Should Care
&lt;/h2&gt;

&lt;p&gt;Not because it’s hype.&lt;br&gt;
Because it changes how you build.&lt;/p&gt;

&lt;p&gt;As an experienced developer, you’re used to writing deterministic systems.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Given X → return Y&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You structure it with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Functions&lt;/li&gt;
&lt;li&gt;Classes&lt;/li&gt;
&lt;li&gt;Tests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Machine Learning doesn’t replace that.&lt;br&gt;
But it introduces a component that doesn’t follow explicit rules — it predicts.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Instead of writing logic, you train it.&lt;br&gt;
Instead of exact outputs, you get probabilities.&lt;br&gt;
Instead of debugging code, you analyze behavior.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Transition Is So Hard
&lt;/h2&gt;

&lt;p&gt;Traditional developers struggle to get started with ML because the AI world feels intimidating. There’s no simple “hello world” to ease you in.&lt;/p&gt;

&lt;p&gt;It’s not that simple — but it’s not that hard either.&lt;/p&gt;

&lt;p&gt;I’ll say it clearly: you don’t need to be a math wizard to become an ML engineer. I’m not. Yes, math is involved, but as a developer you’ve already dealt with harder problems.&lt;/p&gt;

&lt;p&gt;The real issue is how people approach learning it.&lt;/p&gt;

&lt;p&gt;They think they need to:&lt;/p&gt;

&lt;p&gt;Learn the math first&lt;br&gt;
Then the theory&lt;br&gt;
And only then start coding&lt;/p&gt;

&lt;p&gt;That’s backwards.&lt;/p&gt;

&lt;p&gt;We’re engineers.&lt;/p&gt;

&lt;p&gt;We don’t learn by reading first —&lt;br&gt;
we learn by building cool shit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;That’s exactly what we’re going to do together.&lt;/p&gt;

&lt;p&gt;We’re going to dive into Machine Learning — but not through theory-first learning.&lt;/p&gt;

&lt;p&gt;We’ll go project first.&lt;/p&gt;

&lt;p&gt;We’ll build things.&lt;br&gt;
We’ll break them.&lt;br&gt;
We’ll understand how they behave.&lt;/p&gt;

&lt;p&gt;And along the way, the concepts will start to make sense — naturally.&lt;/p&gt;

&lt;p&gt;No unnecessary complexity.&lt;br&gt;
No waiting until you’re “ready.”&lt;/p&gt;

&lt;p&gt;We'll do it the engineering way, and we'll do it 500 words at a time.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
