<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pranshu Tiwari</title>
    <description>The latest articles on DEV Community by Pranshu Tiwari (@pranshu_tiwari_2886e14e9c).</description>
    <link>https://dev.to/pranshu_tiwari_2886e14e9c</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3692558%2F60b8374d-1056-4614-8aa2-bf924d188999.png</url>
      <title>DEV Community: Pranshu Tiwari</title>
      <link>https://dev.to/pranshu_tiwari_2886e14e9c</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pranshu_tiwari_2886e14e9c"/>
    <language>en</language>
    <item>
      <title>TRANSFORMER BASICS</title>
      <dc:creator>Pranshu Tiwari</dc:creator>
      <pubDate>Sun, 04 Jan 2026 11:54:27 +0000</pubDate>
      <link>https://dev.to/pranshu_tiwari_2886e14e9c/transformer-basics-2jlc</link>
      <guid>https://dev.to/pranshu_tiwari_2886e14e9c/transformer-basics-2jlc</guid>
      <description>&lt;p&gt;Analogy Setup:&lt;/p&gt;

&lt;p&gt;Imagine the Transformer is a Bollywood director making a blockbuster film.&lt;br&gt;
Input sentence = script idea&lt;br&gt;
Encoder = team understanding the script&lt;br&gt;
Decoder = actors performing dialogues&lt;br&gt;
Self-attention = each actor checking others’ lines to stay in context&lt;br&gt;
Cross-attention = actors looking at director’s guidance&lt;br&gt;
Final output = movie dialogue or scene&lt;br&gt;
1️⃣ Input Tokenization&lt;br&gt;
Script broken into scenes or lines&lt;br&gt;
Example:&lt;br&gt;
“I love AI” → ["I", "love", "AI"]&lt;br&gt;
Analogy: Director splits script into dialogues for actors.&lt;br&gt;
2️⃣ Input Embedding&lt;br&gt;
Each word → vector of numbers&lt;br&gt;
Captures meaning&lt;br&gt;
Analogy: Actor memorizes character traits and emotions for each dialogue line.&lt;br&gt;
3️⃣ Positional Encoding&lt;br&gt;
Adds word order info&lt;br&gt;
Analogy: Director marks scene order: First scene, second scene… to maintain storyline.&lt;br&gt;
4️⃣ Self-Attention&lt;br&gt;
Each word weighs how relevant every other word is to it&lt;br&gt;
Analogy: Actors listen to other actors’ dialogues to maintain chemistry &amp;amp; context.&lt;br&gt;
“Bank” attends to “deposit” to know it’s a financial bank, not a riverbank.&lt;br&gt;
5️⃣ Multi-Head Attention&lt;br&gt;
Multiple “attention heads” look at different aspects&lt;br&gt;
Analogy: Multiple camera angles filming: close-up, wide-shot, overhead → complete scene understanding.&lt;br&gt;
6️⃣ Add &amp;amp; Normalize (Residuals)&lt;br&gt;
Residual: original input added back to attention output, then normalized → stable training&lt;br&gt;
Analogy: Actors keep original character traits while adding director’s inputs.&lt;br&gt;
7️⃣ Feed Forward Network&lt;br&gt;
Each word refined individually&lt;br&gt;
Analogy: Actors rehearse solo to perfect expressions before final scene.&lt;br&gt;
8️⃣ Decoder Input (Shifted Right)&lt;br&gt;
Decoder sees previous words only&lt;br&gt;
Analogy: Actors deliver next line based on previous dialogue, not future scenes.&lt;br&gt;
9️⃣ Masked Self-Attention&lt;br&gt;
Future words hidden&lt;br&gt;
Analogy: Actor doesn’t know upcoming twist in the movie.&lt;br&gt;
10️⃣ Encoder–Decoder Attention&lt;br&gt;
Decoder focuses on relevant encoder output&lt;br&gt;
Analogy: Actor looks at director’s notes to align with story context.&lt;br&gt;
11️⃣ Decoder FFN + Add &amp;amp; Normalize&lt;br&gt;
Refines token and stabilizes&lt;br&gt;
Analogy: Actor practices solo again, keeping director’s guidance in mind.&lt;br&gt;
12️⃣ Linear + Softmax&lt;/p&gt;

&lt;p&gt;Converts decoder output → word probabilities → final word chosen&lt;br&gt;
Analogy: Actor picks best dialogue delivery for scene.&lt;/p&gt;
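&lt;p&gt;Step 12 in code: the linear layer produces one score (logit) per vocabulary word, softmax turns the scores into probabilities, and the highest-probability word is chosen. The numbers and the tiny vocabulary here are illustrative, not from a real model:&lt;/p&gt;

```python
# Toy linear + softmax step: pick the most likely next word.
import numpy as np

vocab = ["I", "love", "AI", "movies"]
logits = np.array([0.2, 1.5, 0.3, 0.1])   # pretend output of the linear layer

e = np.exp(logits - logits.max())
probs = e / e.sum()                        # softmax: non-negative, sums to 1

next_word = vocab[int(np.argmax(probs))]
print(next_word)                           # 'love' has the highest logit
```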

&lt;p&gt;🎬 Complete Flow&lt;/p&gt;

&lt;p&gt;Script → Tokenization&lt;/p&gt;

&lt;p&gt;Actors understand script → Embedding + Positional Info&lt;/p&gt;

&lt;p&gt;Actors rehearse → Self-Attention + Multi-Head Attention&lt;/p&gt;

&lt;p&gt;Director guides them → Cross-Attention&lt;/p&gt;

&lt;p&gt;Final performance → Linear + Softmax → Scene delivered&lt;/p&gt;

&lt;p&gt;Repeat for next scene → Complete movie&lt;/p&gt;
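&lt;p&gt;That repeat-until-done loop is autoregressive decoding: the model emits one token at a time and feeds each choice back in as input until an end marker appears. The &lt;code&gt;next_token&lt;/code&gt; function below is a hypothetical stand-in for a full transformer forward pass plus linear + softmax:&lt;/p&gt;

```python
# Sketch of greedy autoregressive decoding. The lookup table stands in
# for a real model; a transformer would compute the next token instead.
def next_token(prefix):
    script = {
        (): "I",
        ("I",): "love",
        ("I", "love"): "AI",
        ("I", "love", "AI"): "END",
    }
    return script[tuple(prefix)]

generated = []
while True:
    token = next_token(generated)
    if token == "END":                 # end marker: the movie is complete
        break
    generated.append(token)            # feed the choice back in

print(" ".join(generated))             # I love AI
```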

</description>
      <category>transformer</category>
      <category>genai</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
