<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yash</title>
    <description>The latest articles on DEV Community by Yash (@yash_30may05ur).</description>
    <link>https://dev.to/yash_30may05ur</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3731492%2F7e581823-73cf-4b89-91b0-bc5f6c3d28dd.png</url>
      <title>DEV Community: Yash</title>
      <link>https://dev.to/yash_30may05ur</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yash_30may05ur"/>
    <language>en</language>
    <item>
      <title>β or w? for weight...</title>
      <dc:creator>Yash</dc:creator>
      <pubDate>Sun, 25 Jan 2026 19:00:07 +0000</pubDate>
      <link>https://dev.to/yash_30may05ur/b-or-w-for-weight-3kcl</link>
      <guid>https://dev.to/yash_30may05ur/b-or-w-for-weight-3kcl</guid>
      <description>&lt;h2&gt;
  
  
  Which one do you use for the weight, &lt;u&gt;β&lt;/u&gt; or &lt;u&gt;w&lt;/u&gt;, in ŷ = x * &lt;strong&gt;(?)&lt;/strong&gt; + &lt;strong&gt;(?)&lt;/strong&gt;?
&lt;/h2&gt;




&lt;p&gt;There are many ways to write the &lt;strong&gt;&lt;em&gt;linear regression&lt;/em&gt;&lt;/strong&gt; equation. Let's look at &lt;u&gt;two&lt;/u&gt; of them:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. ŷ = x * w + b&lt;/strong&gt; &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;where,
   ŷ -&amp;gt; predicted value
   x -&amp;gt; input (feature)
   w -&amp;gt; weight
   b -&amp;gt; bias
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5eiu76yqj61c8e41dbh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5eiu76yqj61c8e41dbh.png" alt="Image of Scikit-Learn linear regression page" width="800" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. ŷ = x * β + β₀&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;where,
   ŷ  -&amp;gt; predicted value
   x  -&amp;gt; input (feature)
   β  -&amp;gt; weight (slope coefficient)
   β₀ -&amp;gt; bias (intercept)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrrvzur7edlhypx1xil1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrrvzur7edlhypx1xil1.png" alt="Image of linear regression research paper" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;
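
&lt;p&gt;The two equations above describe the same model; only the symbol names differ. Here is a minimal sketch of that idea. The function names and the numeric values are made up for illustration and do not come from any library.&lt;/p&gt;

```python
# Two spellings of the same one-feature linear model.
# The function names and numbers are hypothetical, chosen only for illustration.

def predict_cs(x, w, b):
    """Machine-learning style: y_hat = x * w + b."""
    return x * w + b


def predict_stats(x, beta, beta_0):
    """Statistics style: y_hat = x * beta + beta_0."""
    return x * beta + beta_0


x = 3.0
y_hat_cs = predict_cs(x, w=2.0, b=0.5)
y_hat_stats = predict_stats(x, beta=2.0, beta_0=0.5)

# Both calls compute 3.0 * 2.0 + 0.5.
print(y_hat_cs, y_hat_stats)  # 6.5 6.5
```

&lt;p&gt;Renaming w to β and b to β₀ changes nothing about the arithmetic, so a paper written with β can be read exactly like a tutorial written with w.&lt;/p&gt;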




&lt;p&gt;Whenever a teacher introduces machine learning basics, they most likely use the first naming convention, which is fine. But when you stumble upon a research paper, the story changes: you may feel that your knowledge is holding you back :( and be put off reading it.&lt;/p&gt;

&lt;p&gt;Is there a specific reason why this happens? &lt;br&gt;
The answer is &lt;strong&gt;Yes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The naming convention changes from field to field. When you first enter the field of AI, you usually start with machine learning, and more specifically with linear regression. Here, each input (feature/attribute/column) helps to predict ŷ (y-hat, the prediction), but not every input has the same &lt;strong&gt;priority&lt;/strong&gt;. The weight decides how strongly an input influences ŷ. That is the usual explanation of linear regression from a Computer Science perspective, which is why the name &lt;u&gt;w&lt;/u&gt; feels meaningful to us. In maths, however, &lt;u&gt;β&lt;/u&gt; (beta) is used for the weight, or, in more mathematical terms, the &lt;strong&gt;Slope Coefficient&lt;/strong&gt;.&lt;br&gt;
If you look closely at research papers and books, and at the earlier papers, books, and research they refer to, many of them come from a mathematics background rather than a CS one. That is why the equations in them often use &lt;u&gt;β&lt;/u&gt; rather than &lt;u&gt;w&lt;/u&gt;.&lt;/p&gt;

&lt;p&gt;The table below should help reduce this confusion:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt; 
    &lt;tr&gt;  
        &lt;th rowspan="2"&gt;Name&lt;/th&gt;  
        &lt;th colspan="2"&gt;Perspective&lt;/th&gt;  
    &lt;/tr&gt;  
    &lt;tr&gt;  
        &lt;th&gt;Computer Science  
        &lt;/th&gt;
&lt;th&gt;Mathematics&lt;/th&gt;
    &lt;/tr&gt; 
    &lt;tr&gt;
        &lt;td&gt;Weights&lt;/td&gt;
        &lt;td&gt;w (Weights)&lt;/td&gt;
        &lt;td&gt;β (Beta)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Bias&lt;/td&gt;
        &lt;td&gt;b (bias)&lt;/td&gt;
        &lt;td&gt;β₀ (intercept)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Learning Rate&lt;/td&gt;
        &lt;td&gt;α (Alpha)&lt;/td&gt;
        &lt;td&gt;η (Eta)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Loss/Error&lt;/td&gt;
        &lt;td&gt;L (Loss)&lt;/td&gt;
        &lt;td&gt;E (Error)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Cost Function&lt;/td&gt;
        &lt;td&gt;J (Cost Function)&lt;/td&gt;
        &lt;td&gt;C (Cost)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Input Data&lt;/td&gt;
        &lt;td&gt;x (Features)&lt;/td&gt;
        &lt;td&gt;X (Design Matrix), I (Input)&lt;/td&gt;
    &lt;/tr&gt;
        &lt;tr&gt;
        &lt;td&gt;Output / Label&lt;/td&gt;
        &lt;td&gt;y (Target)&lt;/td&gt;
        &lt;td&gt;d (Desired)&lt;/td&gt;
    &lt;/tr&gt;
    
        &lt;tr&gt;
        &lt;td rowspan="2"&gt;Prediction&lt;/td&gt;
        &lt;td&gt;ŷ (Y-hat)&lt;/td&gt;
        &lt;td&gt;hθ(x) (Hypothesis)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;a (Activation)&lt;/td&gt;
        &lt;td&gt;f(x)&lt;/td&gt;
    &lt;/tr&gt;
    
    &lt;tr&gt;
        &lt;td&gt;Regularization&lt;/td&gt;
        &lt;td&gt;λ (Lambda)&lt;/td&gt;
        &lt;td&gt;α (Alpha)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;&lt;/td&gt;
        &lt;td colspan="2"&gt;
&lt;b&gt;Be careful:&lt;/b&gt; Scikit-Learn uses &lt;u&gt;α (alpha)&lt;/u&gt; for &lt;b&gt;regularisation&lt;/b&gt;, but Deep Learning uses α for &lt;b&gt;learning rate&lt;/b&gt;!&lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;&lt;/div&gt;
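
&lt;p&gt;To make the α warning concrete, here is a rough sketch of a single gradient-descent step with an L2 penalty, written with descriptive names instead of Greek letters. The function &lt;code&gt;ridge_step&lt;/code&gt;, its parameter names, and the numbers are assumptions made for this example, not part of any library.&lt;/p&gt;

```python
# One gradient-descent step for y_hat = x * w + b with an L2 penalty on w.
# ridge_step and every value below are hypothetical, for illustration only.

def ridge_step(x, y, w, b, learning_rate, l2_strength):
    """Return the updated (w, b) after one step on a single sample.

    Loss being minimised: (y_hat - y) ** 2 + l2_strength * w ** 2
    """
    error = (x * w + b) - y
    grad_w = 2 * error * x + 2 * l2_strength * w  # data term + penalty term
    grad_b = 2 * error
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Deep-learning texts often write learning_rate as alpha and l2_strength as lambda.
# scikit-learn's Ridge instead uses the name alpha for the L2 strength.
w, b = ridge_step(x=1.0, y=2.0, w=0.0, b=0.0, learning_rate=0.1, l2_strength=0.5)
print(w, b)
```

&lt;p&gt;Spelling the names out as &lt;code&gt;learning_rate&lt;/code&gt; and &lt;code&gt;l2_strength&lt;/code&gt; sidesteps the clash: the same letter α can mean either one, depending on which source you are reading.&lt;/p&gt;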

</description>
      <category>ai</category>
      <category>research</category>
      <category>education</category>
    </item>
  </channel>
</rss>
