<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Raju C</title>
    <description>The latest articles on DEV Community by Raju C (@raju_ch_0f28d).</description>
    <link>https://dev.to/raju_ch_0f28d</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1540506%2F2ef6bd86-5137-4684-b7f9-6c275018ab46.jpg</url>
      <title>DEV Community: Raju C</title>
      <link>https://dev.to/raju_ch_0f28d</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/raju_ch_0f28d"/>
    <language>en</language>
    <item>
      <title>Week 5: RAG Systems and AI Agents - Where Distributed Systems Meet LLMs</title>
      <dc:creator>Raju C</dc:creator>
      <pubDate>Sun, 12 Apr 2026 22:25:02 +0000</pubDate>
      <link>https://dev.to/raju_ch_0f28d/week-5-rag-systems-and-ai-agents-where-distributed-systems-meet-llms-4mfa</link>
      <guid>https://dev.to/raju_ch_0f28d/week-5-rag-systems-and-ai-agents-where-distributed-systems-meet-llms-4mfa</guid>
      <description>&lt;p&gt;Week 5 done.&lt;/p&gt;

&lt;p&gt;This week: RAG systems and AI agents - making LLMs actually useful with real data.&lt;/p&gt;

&lt;p&gt;This week was about &lt;strong&gt;building systems around LLMs, not just calling APIs.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Shift: From Models to Systems
&lt;/h2&gt;

&lt;p&gt;Week 4 was about training models. Week 5 was about building systems around them.&lt;/p&gt;

&lt;p&gt;Neural networks are the engine. RAG and agents are the architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Actually Built
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. ArXiv Research Assistant (RAG)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; I want to ask natural-language questions about the latest AI/ML research papers from ArXiv. Those papers postdate the LLM's training data, so answers from its memory alone would be stale or hallucinated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt; Build a RAG system that grounds answers in actual papers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ArXiv RSS Feed (cs.AI + cs.LG)
    ↓
Fetch 30 recent papers → chunk into 300-word segments
    ↓
Embed with sentence-transformers → ChromaDB
    ↓
Question → Semantic Search → Top 5 chunks → GPT-4o-mini → Answer + Citations
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The architecture:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Embed the question
&lt;/span&gt;    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Search vector database
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chroma_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;query_embeddings&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;query_embedding&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Build context from retrieved chunks
&lt;/span&gt;    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;documents&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Augment LLM prompt with context
&lt;/span&gt;    &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Based on these papers:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Answer: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# 5. Generate grounded answer
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In distributed systems, we cache expensive operations: &lt;code&gt;Request → Cache → Database → Compute&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;RAG is the same: &lt;code&gt;Query → Vector DB → Document Store → LLM&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key decisions:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chunk size: 300 words&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Smaller than my earlier experiments (1000 tokens)&lt;/li&gt;
&lt;li&gt;Research papers need precise citations&lt;/li&gt;
&lt;li&gt;300 words = one key finding per chunk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Overlap strategy:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No overlap initially → context cuts mid-sentence&lt;/li&gt;
&lt;li&gt;Added semantic overlap → better coherence&lt;/li&gt;
&lt;/ul&gt;
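&lt;p&gt;The chunking strategy above can be sketched as a plain word-based splitter. This is a minimal illustration, not the exact code I shipped; the function name is mine, and the defaults mirror the numbers above (the 50-word overlap is an assumed value):&lt;/p&gt;

```python
def chunk_words(text: str, size: int = 300, overlap: int = 50) -> list[str]:
    """Split text into roughly size-word chunks; the last `overlap`
    words of each chunk are repeated at the start of the next one."""
    words = text.split()
    chunks = []
    step = size - overlap  # advance less than a full chunk to create overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the final chunk already covers the tail
    return chunks
```

&lt;p&gt;Because each chunk reopens with the tail of the previous one, a sentence cut at a boundary still appears whole in at least one chunk.&lt;/p&gt;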

&lt;p&gt;&lt;strong&gt;Results:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Indexed 30 recent papers in ~1 minute&lt;/li&gt;
&lt;li&gt;Query latency: ~500ms end-to-end&lt;/li&gt;
&lt;li&gt;Answers cite actual paper sections, not hallucinations&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Multi-Phase Task Assistant (AI Agents)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; LLMs can't take actions, maintain state across tasks, or orchestrate multi-step workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt; Build an agent system with tool calling and phase management.&lt;/p&gt;

&lt;p&gt;Built a task assistant that manages multi-phase workflows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User conversation → LLM decides which tools to call → Execute tools → Update state → Next phase
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The architecture:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TaskAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;get_weather_tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;# External API
&lt;/span&gt;            &lt;span class="n"&gt;decrypt_message_tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Utility function
&lt;/span&gt;            &lt;span class="n"&gt;generate_cipher_tool&lt;/span&gt;   &lt;span class="c1"&gt;# Utility function
&lt;/span&gt;        &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_state&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Phase tracking
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Build system prompt with current state
&lt;/span&gt;        &lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_phase&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# LLM decides which tools to call
&lt;/span&gt;        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conversation_history&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Execute tool calls
&lt;/span&gt;        &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;conversation_history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Check if phase should advance
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;should_advance_phase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;advance&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tool calling in action:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "I'm heading to Berlin for a task. What should I prepare?"

Agent thinking:
1. Detects location mention → calls get_weather("Berlin")
2. Weather API returns: 4°C, overcast, wind 15km/h
3. LLM uses real data: "Berlin is 4°C and overcast. Pack warm layers..."

User: "I received this encrypted note: KHOOR DJHQW with shift 3"

Agent thinking:
1. Detects cipher pattern → calls decrypt_caesar("KHOOR DJHQW", 3)
2. Decrypt tool returns: "HELLO AGENT"
3. LLM responds: "Decoded message: HELLO AGENT. Proceed to checkpoint."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
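&lt;p&gt;For reference, the cipher utility that transcript relies on is only a few lines of Python. A minimal sketch (the real tool additionally exposes a schema so the LLM can call it):&lt;/p&gt;

```python
def decrypt_caesar(text: str, shift: int) -> str:
    """Shift each letter back by `shift` positions, preserving case;
    non-letters (spaces, digits) pass through unchanged."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base - shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)
```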



&lt;p&gt;&lt;strong&gt;Phase management:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The agent tracks multi-step workflows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# 4-phase workflow example
&lt;/span&gt;&lt;span class="n"&gt;phases&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;travel&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;preparation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;execution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Each phase has validation logic
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_phase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;travel&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;check_destination_confirmed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;preparation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;check_equipment_selected&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ... etc
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;State persistence:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"session_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"abc123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"current_phase"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"preparation"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"destination"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Berlin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"weather"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"4°C, overcast"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"equipment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"winter coat"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"encrypted device"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"conversation_history"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In microservices: services maintain state in distributed databases.&lt;br&gt;
In AI agents: agents maintain state in JSON/database, rebuild context each turn.&lt;/p&gt;
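&lt;p&gt;A sketch of that pattern, assuming state lives in a local JSON file; the file path and default fields here are illustrative, not the exact ones I used:&lt;/p&gt;

```python
import json
from pathlib import Path

STATE_FILE = Path("agent_state.json")  # illustrative location

def load_state() -> dict:
    """Load persisted session state, or start a fresh session."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"current_phase": "travel", "context": {}, "conversation_history": []}

def save_state(state: dict) -> None:
    """Persist state after every turn so a server restart loses nothing."""
    STATE_FILE.write_text(json.dumps(state, indent=2))
```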

&lt;p&gt;&lt;strong&gt;Real-time architecture:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Flask app + Socket.IO
    ↓
User sends message via websocket
    ↓
Agent processes with tool calling loop
    ↓
Tools execute (API calls, computations)
    ↓
State updates (phase advancement, context)
    ↓
Response streamed back via websocket
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What surprised me:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The LLM is remarkably good at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deciding when to call tools (no explicit instructions needed)&lt;/li&gt;
&lt;li&gt;Extracting parameters from natural language&lt;/li&gt;
&lt;li&gt;Chaining multiple tool calls to solve complex requests&lt;/li&gt;
&lt;li&gt;Understanding phase context and constraints&lt;/li&gt;
&lt;/ul&gt;
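&lt;p&gt;Worth spelling out why that works: per tool, the model only ever sees a name, a description, and a parameter schema. Here is roughly what the weather tool's declaration looks like in the OpenAI function-calling format (the exact fields are my assumption):&lt;/p&gt;

```python
# Tool declaration passed in the `tools` list of a chat completion request.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Berlin"}
            },
            "required": ["city"],
        },
    },
}
```

&lt;p&gt;Good descriptions are what make tool selection work without explicit instructions: the model matches the user's intent against these strings.&lt;/p&gt;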

&lt;h3&gt;
  
  
  3. MCP Integration
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; Every integration needs custom code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The solution:&lt;/strong&gt; MCP (Model Context Protocol), an open standard for connecting AI applications to tools and data sources.&lt;/p&gt;

&lt;p&gt;Built an MCP server exposing REST API endpoints as tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp_server.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_papers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;vector_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MCP is like a service mesh for AI - standard protocol, any tool plugs in.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Frustrated Me Most
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;State management across tool calls.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When an agent makes multiple tool calls in one turn:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Weather API call → wait for response&lt;/li&gt;
&lt;li&gt;Use weather data to decide next tool&lt;/li&gt;
&lt;li&gt;Call cipher tool with extracted params&lt;/li&gt;
&lt;li&gt;Build final response with both results&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Managing this flow, tracking conversation history, and rebuilding context each turn is complex.&lt;/p&gt;

&lt;p&gt;Solution: Treat each turn as a transaction - load state, process, update, persist.&lt;/p&gt;
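&lt;p&gt;One way to make a turn transactional, sketched with an injected &lt;code&gt;process&lt;/code&gt; callback standing in for the agent loop; every name here is illustrative:&lt;/p&gt;

```python
import copy

def handle_turn(state: dict, user_input: str, process) -> dict:
    """One turn as a transaction: work on a copy, commit only on success.

    `process` stands in for the agent loop (LLM call plus tool execution);
    it receives the working state and returns the assistant's reply.
    """
    working = copy.deepcopy(state)  # "load": never mutate committed state
    working["conversation_history"].append({"role": "user", "content": user_input})
    reply = process(working)        # "process": may call tools, advance phase
    working["conversation_history"].append({"role": "assistant", "content": reply})
    return working                  # "commit": caller persists this atomically
```

&lt;p&gt;If &lt;code&gt;process&lt;/code&gt; raises halfway through its tool calls, the committed state is untouched and the turn can simply be retried.&lt;/p&gt;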

&lt;h2&gt;
  
  
  The Debugging Moment
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Agent called tools repeatedly in a loop, making 10+ API calls for one user message.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I tried:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Added rate limiting → Still looped&lt;/li&gt;
&lt;li&gt;Changed prompt to discourage multiple calls → Still looped&lt;/li&gt;
&lt;li&gt;Inspected tool call sequence → Found the issue ✓&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Root cause:&lt;/strong&gt;&lt;br&gt;
The conversation history included tool results, but the LLM kept "forgetting" it had already called the tool. The messages weren't formatted correctly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Wrong - LLM doesn't recognize tool result
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tool result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Right - Proper tool result format
&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool_call_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After fixing the message format, the agent called each tool exactly once.&lt;/p&gt;

&lt;p&gt;Like debugging distributed systems: inspect the protocol, not just the logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mistakes I Made
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Didn't handle tool execution failures&lt;/strong&gt;&lt;br&gt;
What if the weather API is down? What if the cipher has invalid input?&lt;/p&gt;

&lt;p&gt;Should have: Wrapped tool calls in try/except and returned graceful error messages to the LLM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. No token tracking&lt;/strong&gt;&lt;br&gt;
Multi-turn conversations + tool results = token count explosion.&lt;br&gt;
Hit 16K context limit after 15 turns.&lt;/p&gt;

&lt;p&gt;Fix: Implement conversation pruning - keep recent messages, summarize old ones.&lt;/p&gt;
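&lt;p&gt;A minimal pruning pass might keep the system prompt plus the newest messages and collapse everything in between into a single summary message. The summarizer below is a stub; in practice you would ask the LLM to write the summary:&lt;/p&gt;

```python
def prune_history(messages: list[dict], keep_recent: int = 10) -> list[dict]:
    """Keep system messages and the newest `keep_recent` messages;
    collapse the middle of the conversation into one summary message."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if keep_recent >= len(rest):
        return system + rest  # nothing to prune yet
    old, recent = rest[:-keep_recent], rest[-keep_recent:]
    summary = {"role": "system",
               "content": f"Summary of {len(old)} earlier messages: (stub)"}
    return system + [summary] + recent
```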

&lt;p&gt;&lt;strong&gt;3. Built tools without testing independently&lt;/strong&gt;&lt;br&gt;
Integrated everything at once, couldn't tell if bugs were in tools or orchestration.&lt;/p&gt;

&lt;p&gt;Fix: Unit test each tool, then integration test the agent loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Forgot to persist state between server restarts&lt;/strong&gt;&lt;br&gt;
Agent lost all context when Flask restarted during development.&lt;/p&gt;

&lt;p&gt;Fix: Save state to JSON after each turn, load on startup.&lt;/p&gt;
&lt;h2&gt;
  
  
  Connection to Distributed Systems
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agents = Microservice orchestration:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tool calling     = Service-to-service RPC
State management = Distributed transactions
Retry logic      = Fault tolerance
Phase tracking   = Workflow orchestration (Temporal, Airflow)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;RAG = Multi-tier caching:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Vector DB    = L1 cache (~10ms)
Doc Store    = L2 cache (~50ms)  
LLM          = Compute (~500ms)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Conversation history = Event sourcing:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every message = Event&lt;/li&gt;
&lt;li&gt;Rebuild state by replaying events&lt;/li&gt;
&lt;li&gt;Can replay conversation from any point&lt;/li&gt;
&lt;/ul&gt;
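&lt;p&gt;The replay idea in miniature: rebuild derived state purely by folding over the message log. The fields tracked here are invented for illustration:&lt;/p&gt;

```python
def rebuild_state(events: list[dict]) -> dict:
    """Fold over the conversation log to reconstruct derived state."""
    state = {"turns": 0, "tools_used": []}
    for event in events:
        if event["role"] == "user":
            state["turns"] += 1
        elif event["role"] == "tool":
            state["tools_used"].append(event["name"])
    return state
```

&lt;p&gt;Replaying a prefix of the log gives you the state at any earlier point in the conversation.&lt;/p&gt;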

&lt;p&gt;&lt;strong&gt;MCP = Service mesh:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standard protocol&lt;/li&gt;
&lt;li&gt;Service discovery&lt;/li&gt;
&lt;li&gt;Centralized monitoring&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Time Spent This Week
&lt;/h2&gt;

&lt;p&gt;~12 hours&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm Taking Forward
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Agents are about orchestration, not intelligence.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The LLM provides reasoning. System design matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which tools to expose&lt;/li&gt;
&lt;li&gt;How to handle tool failures&lt;/li&gt;
&lt;li&gt;When to advance workflow phases&lt;/li&gt;
&lt;li&gt;How to maintain conversation context&lt;/li&gt;
&lt;li&gt;When to prune history vs persist state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same principles as designing microservices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool calling is RPC with natural language.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Microservices: service_a.call(service_b.get_data(params))
AI Agents:     llm.call(weather_tool.get_data(city="Berlin"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The LLM is the orchestrator, deciding which services to call and when.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;State management is the hard part.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not the LLM, not the tools - managing state across:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple turns&lt;/li&gt;
&lt;li&gt;Multiple tool calls per turn&lt;/li&gt;
&lt;li&gt;Phase transitions&lt;/li&gt;
&lt;li&gt;Server restarts&lt;/li&gt;
&lt;li&gt;Concurrent sessions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is distributed systems 101.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RAG is a data pipeline, not magic.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Pipeline stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ingest&lt;/li&gt;
&lt;li&gt;Chunk&lt;/li&gt;
&lt;li&gt;Embed&lt;/li&gt;
&lt;li&gt;Index&lt;/li&gt;
&lt;li&gt;Retrieve&lt;/li&gt;
&lt;li&gt;Generate&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each stage needs tuning. Same as any distributed system.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Still Hard
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Token management (conversation + tool results + context)&lt;/li&gt;
&lt;li&gt;Multi-hop reasoning (questions needing multiple retrievals)&lt;/li&gt;
&lt;li&gt;Tool selection (when to call vs when to answer directly)&lt;/li&gt;
&lt;li&gt;Error propagation (how to surface tool failures to LLM)&lt;/li&gt;
&lt;li&gt;Cost optimization (each tool call + LLM call costs money)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Approach That Works
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Start simple:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Single tool, single turn&lt;/li&gt;
&lt;li&gt;Add multi-turn conversation&lt;/li&gt;
&lt;li&gt;Add multiple tools&lt;/li&gt;
&lt;li&gt;Add state management&lt;/li&gt;
&lt;li&gt;Add phase orchestration&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each step works before adding complexity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test components independently:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool execution (unit tests)&lt;/li&gt;
&lt;li&gt;LLM tool calling (integration tests)&lt;/li&gt;
&lt;li&gt;State persistence (end-to-end tests)&lt;/li&gt;
&lt;li&gt;Then full agent loop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same debugging process I use for microservices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Log everything:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-12T10:32:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"user_input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Weather in Berlin?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tools_called"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"tool"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_weather"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Berlin"&lt;/span&gt;&lt;span class="p"&gt;}}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tool_results"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"4°C, overcast"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agent_response"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Berlin is 4°C..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"phase"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"preparation"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When debugging, replay the conversation. Just like distributed tracing.&lt;/p&gt;
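&lt;p&gt;That replay idea fits in a few lines. A minimal sketch, assuming log entries shaped like the JSON above (the field names come from that example entry, not from any particular framework):&lt;/p&gt;

```python
import json

# One log entry per agent turn, shaped like the example log above.
log_lines = [
    '{"user_input": "Weather in Berlin?", '
    '"tools_called": [{"tool": "get_weather", "args": {"city": "Berlin"}}], '
    '"tool_results": ["4C, overcast"], '
    '"agent_response": "Berlin is 4C..."}'
]

def replay(lines):
    """Flatten logged turns into (role, content) steps, oldest first."""
    steps = []
    for raw in lines:
        entry = json.loads(raw)
        steps.append(("user", entry["user_input"]))
        for call, result in zip(entry["tools_called"], entry["tool_results"]):
            steps.append(("tool", f'{call["tool"]}({call["args"]}) -> {result}'))
        steps.append(("agent", entry["agent_response"]))
    return steps

for role, content in replay(log_lines):
    print(f"{role:>5}: {content}")
```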

&lt;p&gt;Week 5 down. Built RAG systems and agents that work with real data and real workflows.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building AI systems? What distributed systems patterns have you applied?&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>agenticsystems</category>
    </item>
    <item>
      <title>Week 4: From Theory to Training - My First Neural Networks</title>
      <dc:creator>Raju C</dc:creator>
      <pubDate>Sun, 29 Mar 2026 03:47:31 +0000</pubDate>
      <link>https://dev.to/raju_ch_0f28d/week-4-from-theory-to-training-my-first-neural-networks-1mk3</link>
      <guid>https://dev.to/raju_ch_0f28d/week-4-from-theory-to-training-my-first-neural-networks-1mk3</guid>
      <description>&lt;p&gt;Week 4 done.&lt;/p&gt;

&lt;p&gt;Last week: Shallow algorithms (Linear Regression, Logistic Regression).&lt;br&gt;
This week: Neural networks - actually building and training them.&lt;/p&gt;

&lt;p&gt;Still not LLMs. Still not ChatGPT integrations. Still "boring" ML.&lt;/p&gt;

&lt;p&gt;But here's why: &lt;strong&gt;I want to understand what's actually happening, not just call APIs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The difference? Last week I learned &lt;em&gt;what&lt;/em&gt; models predict.&lt;br&gt;
This week I learned &lt;em&gt;how&lt;/em&gt; they learn.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Shift: From Equations to Architectures
&lt;/h2&gt;

&lt;p&gt;This week was about understanding when complexity is worth it.&lt;/p&gt;
&lt;h2&gt;
  
  
  What I Actually Built
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Handwritten Digit Recognition (MNIST)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; Recognize handwritten digits (0-9) from 28×28 pixel images.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;torch.nn&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DigitClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Module&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;layers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Flatten&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;784&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ReLU&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Dropout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;nn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Linear&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;forward&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Result: 97% accuracy
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In distributed systems, we build pipelines: &lt;code&gt;Data → Transform → Store → Serve&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Neural networks are the same: &lt;code&gt;Input → Hidden Layers → Output&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The transformation happens through learning, not hardcoding rules.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Experimenting with Layer Architectures
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Dropout Layers&lt;/strong&gt; - Forces redundant representations. Like fault-tolerant systems - if one node fails, others handle the load.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Convolutional Layers&lt;/strong&gt; - Respects spatial structure. Same filter slides across the image (parameter sharing). Like using the same load balancing algorithm across all services.&lt;/p&gt;

&lt;p&gt;Dense layers: 97% accuracy → Conv layers: 99.2% accuracy&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BatchNorm&lt;/strong&gt; - Stabilizes training by normalizing the inputs to each layer. Like circuit breakers in microservices - it stops instability in one component from cascading through the rest.&lt;/p&gt;

&lt;p&gt;Without BatchNorm: Stuck at 85% → With BatchNorm: 97%&lt;/p&gt;
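&lt;p&gt;A minimal PyTorch sketch of what a Conv + BatchNorm stack looks like for MNIST; the layer sizes here are illustrative, not the exact architecture I trained:&lt;/p&gt;

```python
import torch
import torch.nn as nn

# Conv + BatchNorm block: the same 3x3 filters slide across the whole
# image (parameter sharing), and BatchNorm normalizes each layer's
# inputs so training stays stable.
conv_net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1 input channel (grayscale)
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.BatchNorm2d(32),
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                    # 10 digit classes
)

x = torch.randn(8, 1, 28, 28)   # batch of 8 fake MNIST images
print(conv_net(x).shape)        # torch.Size([8, 10])
```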

&lt;h3&gt;
  
  
  3. Image Denoising with Autoencoders
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The architecture:&lt;/strong&gt; &lt;code&gt;Noisy Image → [Encoder] → Bottleneck → [Decoder] → Clean Image&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The bottleneck (32 dimensions) is key:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Too large (128): Memorizes noise&lt;/li&gt;
&lt;li&gt;Too small (8): Loses detail&lt;/li&gt;
&lt;li&gt;Just right (32): Learns what matters ✓&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is lossy compression with learned parameters. Like designing a caching layer - except the model learns the patterns.&lt;/p&gt;
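&lt;p&gt;A minimal sketch of that architecture in PyTorch, with the 32-dimension bottleneck (the other layer sizes are illustrative):&lt;/p&gt;

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Noisy image -> 32-dim bottleneck -> reconstructed image."""
    def __init__(self, bottleneck=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(784, 128), nn.ReLU(),
            nn.Linear(128, bottleneck),           # compression happens here
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 128), nn.ReLU(),
            nn.Linear(128, 784), nn.Sigmoid(),    # pixels back in [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x)).view(-1, 1, 28, 28)

model = DenoisingAutoencoder()
noisy = torch.rand(4, 1, 28, 28)
print(model(noisy).shape)   # torch.Size([4, 1, 28, 28])
```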

&lt;p&gt;Extra: in-painting (reconstructing obscured regions). Simple digits: 90% success; complex digits: 60%.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Frustrated Me Most
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Hyperparameter hell.&lt;/strong&gt; Every decision affects everything - learning rate, layers, dropout, bottleneck size. I've spent years tuning JVM heaps and thread pools. This feels similar but with 10x more knobs.&lt;/p&gt;

&lt;p&gt;Solution: Start with known-good defaults. Change one thing at a time. Keep notes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Debugging Moment
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; Autoencoder producing blurry reconstructions despite loss decreasing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Changed optimizer from SGD to Adam.&lt;/p&gt;

&lt;p&gt;Adam adapts learning rates per parameter. SGD uses same rate for everything. Like auto-scaling different services based on their individual load patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mistakes I Made
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Tested on training data&lt;/strong&gt; (again!)&lt;br&gt;
I should know this by now: always test on unseen data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Forgot &lt;code&gt;model.eval()&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
Dropout was randomly disabling neurons during testing!&lt;br&gt;
Training mode: 82% → Eval mode: 97%&lt;/p&gt;
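&lt;p&gt;Worth verifying the fix: in eval mode, Dropout becomes a no-op, so repeated forward passes on the same input are identical (toy layer, purely for illustration):&lt;/p&gt;

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(10, 10), nn.Dropout(0.5))
x = torch.randn(1, 10)

net.train()               # training mode: dropout zeroes neurons at random
noisy_out = net(x)        # varies from call to call

net.eval()                # eval mode: dropout is a no-op
c = net(x)
d = net(x)
print(torch.equal(c, d))  # True - deterministic once eval() is set
```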

&lt;p&gt;&lt;strong&gt;3. Picked architecture randomly&lt;/strong&gt;&lt;br&gt;
5 layers? 256 neurons? Dropout 0.8? Model barely learned.&lt;br&gt;
Fix: Started with proven architectures (LeNet), modified incrementally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Didn't normalize input data&lt;/strong&gt;&lt;br&gt;
Raw pixels (0-255): unstable, loss exploding&lt;br&gt;
Normalized (0-1): stable, converging ✓&lt;/p&gt;
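&lt;p&gt;The fix is a one-liner. A minimal NumPy sketch (random pixels standing in for real images):&lt;/p&gt;

```python
import numpy as np

# Raw MNIST-style pixels come in as integers 0-255.
raw = np.random.randint(0, 256, size=(4, 28, 28)).astype(np.float32)

# Scale to [0, 1] before training - gradients stay in a sane range.
normalized = raw / 255.0

print(normalized.min() >= 0.0 and normalized.max() <= 1.0)  # True
```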

&lt;h2&gt;
  
  
  The Pattern Recognition
&lt;/h2&gt;

&lt;p&gt;What I've learned leading teams applies here:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start simple, add complexity only when needed&lt;/strong&gt;&lt;br&gt;
Basic: 92% → +dropout: 95% → +BatchNorm: 97% → +Conv: 99%&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Understand trade-offs&lt;/strong&gt;&lt;br&gt;
More layers = more capacity = slower training = higher overfitting risk&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experiment systematically&lt;/strong&gt;&lt;br&gt;
Bottleneck sizes: 8 (blurry) → 16 (better) → 32 (clean ✓) → 64 (memorizing) → 128 (overfitting)&lt;/p&gt;

&lt;h2&gt;
  
  
  Connection to Distributed Systems
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Encoder-Decoder = Data Pipeline&lt;/strong&gt;&lt;br&gt;
The bottleneck is like network bandwidth - compress to fit through it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parameter Sharing = Code Reuse&lt;/strong&gt;&lt;br&gt;
One "edge detector" works everywhere. Efficient and effective.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Overfitting = Over-optimization&lt;/strong&gt;&lt;br&gt;
I've seen systems optimized for one traffic pattern that broke when patterns changed.&lt;br&gt;
Solution: regularization / graceful degradation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Time Spent This Week
&lt;/h2&gt;

&lt;p&gt;About 8-10 hours this week.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm Taking Forward
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Neural networks aren't a stepping stone to "real" AI. They ARE real AI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most production ML uses these techniques. LLMs get the hype. But understanding backpropagation, gradients, and optimization matters.&lt;/p&gt;

&lt;p&gt;I could be building LLM wrappers right now. But I wouldn't understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How training actually works&lt;/li&gt;
&lt;li&gt;Why models fail in specific ways&lt;/li&gt;
&lt;li&gt;When to use what architecture&lt;/li&gt;
&lt;li&gt;How to debug learning problems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Starting with fundamentals means I can build real intuition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The approach that works:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with fundamentals. Build something small (like MNIST). Don't start with "I'm going to build ChatGPT." Master the basics. Understand debugging. Build intuition. Then scale up.&lt;/p&gt;

&lt;p&gt;Same advice I give for learning any new tech stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Still Hard
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Choosing architecture (Conv vs Dense? How many layers?)&lt;/li&gt;
&lt;li&gt;Hyperparameter tuning (still trial and error)&lt;/li&gt;
&lt;li&gt;Knowing when to stop (97% vs 99%?)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These feel like architectural decisions I make daily - but with less intuition.&lt;/p&gt;

&lt;p&gt;Week 4 down. Built neural networks that actually learn.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Learning deep learning as a senior engineer? What surprised you most about the transition?&lt;/em&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>ai</category>
      <category>learning</category>
    </item>
    <item>
      <title>Week 3: Why I'm Learning 'Boring' ML Before Building with LLMs</title>
      <dc:creator>Raju C</dc:creator>
      <pubDate>Sun, 22 Mar 2026 14:33:15 +0000</pubDate>
      <link>https://dev.to/raju_ch_0f28d/week-3-why-im-learning-boring-ml-before-building-with-llms-pml</link>
      <guid>https://dev.to/raju_ch_0f28d/week-3-why-im-learning-boring-ml-before-building-with-llms-pml</guid>
      <description>&lt;p&gt;Week 3 done.&lt;/p&gt;

&lt;p&gt;This week I learned shallow algorithms - Linear Regression, Logistic Regression, DBSCAN, PCA.&lt;/p&gt;

&lt;p&gt;Not LLMs. Not ChatGPT integrations. Not the AI applications everyone's building.&lt;/p&gt;

&lt;p&gt;Basic machine learning algorithms from decades ago.&lt;/p&gt;

&lt;p&gt;And I kept asking myself: why am I doing this?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Question I Keep Getting
&lt;/h2&gt;

&lt;p&gt;"You're learning AI, right? When are you building something with GPT or Claude?"&lt;/p&gt;

&lt;p&gt;Fair question. I could skip straight to LLM applications. Plenty of people do.&lt;/p&gt;

&lt;p&gt;But here's what I realized this week: &lt;strong&gt;I want to understand what's actually happening, not just call APIs.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Shallow Algorithms First
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. They're what's actually running in production&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most companies aren't running massive neural networks. They're running:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Logistic regression for fraud detection&lt;/li&gt;
&lt;li&gt;Linear regression for demand forecasting
&lt;/li&gt;
&lt;li&gt;Clustering for customer segmentation&lt;/li&gt;
&lt;li&gt;PCA for feature reduction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "boring" algorithms power real systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. They teach you how ML actually works&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When I call an LLM API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm using ML. I'm not understanding ML.&lt;/p&gt;

&lt;p&gt;When I implement Linear Regression:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn.linear_model&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LinearRegression&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LinearRegression&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I see: training data → learning patterns → making predictions.&lt;/p&gt;

&lt;p&gt;It's the same process neural networks use, just simpler.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. I can actually debug them&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If my Linear Regression model performs poorly, I can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check the features&lt;/li&gt;
&lt;li&gt;Look at the coefficients&lt;/li&gt;
&lt;li&gt;Understand what's being weighted&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If my LLM call gives weird results? I have no idea what's happening inside.&lt;/p&gt;

&lt;p&gt;Starting simple means I can build intuition before jumping to black boxes.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Actually Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Linear Regression&lt;/strong&gt; - Predicting house prices from features (square feet, bedrooms, age)&lt;/p&gt;

&lt;p&gt;The model learns: &lt;code&gt;price = w1×sqft + w2×bedrooms + w3×age + bias&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;It finds the best weights (w1, w2, w3) by minimizing prediction error.&lt;/p&gt;

&lt;p&gt;This clicked because I've spent years optimizing systems. Same concept - iteratively adjust parameters to minimize error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logistic Regression&lt;/strong&gt; - Classifying patients as healthy vs disease&lt;/p&gt;

&lt;p&gt;Despite the name, it's for classification, not regression. This confused me for days.&lt;/p&gt;

&lt;p&gt;It outputs probabilities (0 to 1). Above 0.5 → disease, below → healthy.&lt;/p&gt;
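&lt;p&gt;The probability-then-threshold behavior is easy to see in scikit-learn. A sketch with toy data (one made-up feature, not a real medical dataset):&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: one feature, label 1 ("disease") when the value is high.
X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

probs = clf.predict_proba([[9.5]])   # [P(healthy), P(disease)]
print(probs[0, 1] > 0.5)             # True
print(clf.predict([[9.5]]))          # [1] - above 0.5, so "disease"
```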

&lt;p&gt;&lt;strong&gt;DBSCAN&lt;/strong&gt; - Grouping similar pixels in images&lt;/p&gt;

&lt;p&gt;Clusters dense regions automatically. No need to specify number of clusters upfront.&lt;/p&gt;

&lt;p&gt;Reminded me of finding hot spots in distributed systems - same density-based grouping concept.&lt;/p&gt;
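&lt;p&gt;A minimal scikit-learn sketch; &lt;code&gt;eps&lt;/code&gt; ("how close counts as similar") and &lt;code&gt;min_samples&lt;/code&gt; ("how many points make a cluster") are the two knobs to turn:&lt;/p&gt;

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense blobs plus one isolated point.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],     # blob A
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],     # blob B
    [20.0, 20.0],                           # outlier
])

# No cluster count up front - density decides.
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(points)
print(labels)   # two clusters, and the outlier labeled -1 (noise)
```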

&lt;p&gt;&lt;strong&gt;PCA&lt;/strong&gt; - Reducing 100 features down to 10&lt;/p&gt;

&lt;p&gt;Keeps the most important information, throws away the noise.&lt;/p&gt;

&lt;p&gt;Like compressing data in a pipeline - lose some detail but keep what matters.&lt;/p&gt;
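&lt;p&gt;A minimal scikit-learn sketch of that 100-to-10 reduction (random data standing in for real features):&lt;/p&gt;

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))   # 200 samples, 100 features

pca = PCA(n_components=10)        # keep the 10 strongest directions
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)            # (200, 10)
# Fraction of the original variance the 10 components retain:
print(pca.explained_variance_ratio_.sum())
```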

&lt;h2&gt;
  
  
  The Part That Frustrated Me
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Hyperparameter tuning.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every algorithm has knobs to turn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DBSCAN: How close is "similar"? How many points make a cluster?&lt;/li&gt;
&lt;li&gt;PCA: How many components to keep?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The examples work fine. My own experiments? Trial and error.&lt;/p&gt;

&lt;p&gt;I tried clustering an image and got either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Everything in one giant cluster (threshold too loose)&lt;/li&gt;
&lt;li&gt;Everything labeled as noise (threshold too strict)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Still figuring out the intuition here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mistakes I Made
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Tested on training data&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Train on all data
&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Test on same data
# Score: 98%! Amazing!
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Except the model had already seen the answers. Not a real test.&lt;/p&gt;

&lt;p&gt;Should have split train/test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Score: 73%. More honest.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Coming from software engineering where we have staging environments, I should have known better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Mixed up regression and classification&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I kept using Linear Regression when I should've used Logistic Regression.&lt;/p&gt;

&lt;p&gt;Finally internalized:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predicting a number (price, temperature, age) → Regression&lt;/li&gt;
&lt;li&gt;Predicting a category (yes/no, cat/dog, disease) → Classification&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Took more failed experiments than I'd like to admit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Forgot to scale features&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Features with wildly different scales
&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;...]&lt;/span&gt;  &lt;span class="c1"&gt;# square_feet=2000, bedrooms=3
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Square footage dominates because the numbers are bigger. Had to normalize everything to the same scale first.&lt;/p&gt;
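&lt;p&gt;The fix, sketched with scikit-learn's &lt;code&gt;StandardScaler&lt;/code&gt; (toy housing rows, made up for illustration):&lt;/p&gt;

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# square_feet dwarfs bedrooms numerically, so it dominates the fit.
X = np.array([[2000.0, 3], [1500.0, 2], [2500.0, 4], [1800.0, 3]])

X_scaled = StandardScaler().fit_transform(X)

# After scaling, each column has mean ~0 and standard deviation ~1,
# so both features get an equal say.
print(np.allclose(X_scaled.mean(axis=0), 0.0))   # True
print(np.allclose(X_scaled.std(axis=0), 1.0))    # True
```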

&lt;h2&gt;
  
  
  The Pattern That Helped
&lt;/h2&gt;

&lt;p&gt;Every scikit-learn algorithm follows the same structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SomeAlgorithm&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_train&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Learn from data
&lt;/span&gt;&lt;span class="n"&gt;predictions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;predict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Make predictions
&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X_test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_test&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Evaluate
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once I saw this pattern, experimenting with new algorithms got easier.&lt;/p&gt;

&lt;p&gt;Want to try a different classifier? Swap the algorithm. Same interface.&lt;/p&gt;

&lt;p&gt;Reminded me of how Kafka, Flink, and other stream-processing tools have different internals but similar APIs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connection to Distributed Systems
&lt;/h2&gt;

&lt;p&gt;Gradient descent (how these models learn) works like load balancer tuning:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Load balancing:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Try a configuration&lt;/li&gt;
&lt;li&gt;Measure performance&lt;/li&gt;
&lt;li&gt;Adjust based on results&lt;/li&gt;
&lt;li&gt;Repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Machine learning:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Make predictions&lt;/li&gt;
&lt;li&gt;Measure errors&lt;/li&gt;
&lt;li&gt;Adjust weights based on errors&lt;/li&gt;
&lt;li&gt;Repeat&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Same iterative optimization. Different domain.&lt;/p&gt;

&lt;p&gt;This mental model helped when ML concepts felt foreign.&lt;/p&gt;
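&lt;p&gt;The two loops map one-to-one onto a minimal gradient descent implementation. A plain-Python sketch, fitting y = w * x to toy data where the true weight is 3:&lt;/p&gt;

```python
# Fit y = w * x by the loop above:
# predict -> measure error -> adjust -> repeat.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # generated with w = 3

w = 0.0     # initial guess ("try a configuration")
lr = 0.05   # learning rate ("how big an adjustment to make")

for step in range(200):
    # 1-2. Make predictions and measure errors
    #      (gradient of mean squared error with respect to w)
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    # 3-4. Adjust the weight based on the errors, then repeat
    w -= lr * grad

print(round(w, 3))   # 3.0
```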

&lt;h2&gt;
  
  
  Time Spent This Week
&lt;/h2&gt;

&lt;p&gt;About 8-10 hours this week.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm Taking Forward
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Shallow algorithms aren't a stepping stone to "real" ML. They ARE real ML.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most production systems use these techniques. Neural networks get the hype. Logistic regression gets deployed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Understanding fundamentals before jumping to LLMs makes sense.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I could be building GPT wrappers right now. But I wouldn't understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How training works&lt;/li&gt;
&lt;li&gt;Why models fail&lt;/li&gt;
&lt;li&gt;When to use what approach&lt;/li&gt;
&lt;li&gt;How to debug problems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Starting simple means I can build intuition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You can be productive without understanding everything.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I can use these algorithms effectively even if I don't fully grasp every mathematical detail.&lt;/p&gt;

&lt;p&gt;Understanding deepens with practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Still Unclear
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Picking the right algorithm for a new problem (I Google this constantly)&lt;/li&gt;
&lt;li&gt;Tuning hyperparameters systematically (still trial and error)&lt;/li&gt;
&lt;li&gt;Knowing when a model is "good enough"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'm three weeks in, not three years. Still learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;In a few weeks, I'll start building LLM applications. RAG systems, agents, whatever.&lt;/p&gt;

&lt;p&gt;But I'll understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What "training" means&lt;/li&gt;
&lt;li&gt;How models learn patterns&lt;/li&gt;
&lt;li&gt;Why evaluation matters&lt;/li&gt;
&lt;li&gt;When simpler approaches work better&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I won't just be calling APIs. I'll understand what those APIs are doing under the hood.&lt;/p&gt;

&lt;p&gt;That's worth spending time on "boring" algorithms.&lt;/p&gt;

&lt;p&gt;Week 3 down. Built and broke ML models this week.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Learning ML fundamentals before diving into LLMs? Or went straight to GPT APIs? Curious what path others are taking.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>ai</category>
      <category>learning</category>
    </item>
    <item>
      <title>Week 2: Python Essentials and My First AI/ML Concepts</title>
      <dc:creator>Raju C</dc:creator>
      <pubDate>Sun, 15 Mar 2026 15:28:02 +0000</pubDate>
      <link>https://dev.to/raju_ch_0f28d/week-2-python-essentials-and-my-first-aiml-concepts-4l2c</link>
      <guid>https://dev.to/raju_ch_0f28d/week-2-python-essentials-and-my-first-aiml-concepts-4l2c</guid>
      <description>&lt;p&gt;Week 2 done.&lt;/p&gt;

&lt;p&gt;This week wasn't about fancy ML models or neural networks.&lt;/p&gt;

&lt;p&gt;It was about something more fundamental: &lt;strong&gt;building the Python muscle I'll need to debug AI code, even when Claude Code writes most of it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;New here? Read &lt;a href="https://dev.to/raju_ch_0f28d/why-im-leaving-my-comfort-zone-staff-engineer-ai-1h70"&gt;Week 1: Why I'm making this transition&lt;/a&gt; first.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Actually Did
&lt;/h2&gt;

&lt;p&gt;This week I focused on the tools, not the theory:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Python fundamentals:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;NumPy for numerical operations&lt;/li&gt;
&lt;li&gt;Pandas for data manipulation&lt;/li&gt;
&lt;li&gt;Matplotlib for visualizations&lt;/li&gt;
&lt;li&gt;Jupyter notebooks as my workspace&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;AI/ML concepts I started exploring:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Embeddings (turning text into numbers)&lt;/li&gt;
&lt;li&gt;Prompt engineering basics&lt;/li&gt;
&lt;li&gt;Tool calling with LLMs&lt;/li&gt;
&lt;li&gt;Basic LLM API calls through notebooks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not glamorous. But necessary.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Realization: I Need to Read AI Code, Not Just Generate It
&lt;/h2&gt;

&lt;p&gt;Here's what hit me this week.&lt;/p&gt;

&lt;p&gt;I've been using Claude Code to build POCs 2-3x faster. That's great.&lt;/p&gt;

&lt;p&gt;But when something breaks? When the generated code doesn't do what I expect? When I need to understand WHY it works?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I need to read Python.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not just generate it. Not just copy-paste it. Actually understand what's happening.&lt;/p&gt;

&lt;p&gt;Coming from 18 years in other languages, I could've skipped Python basics. "It's just another language, I'll figure it out."&lt;/p&gt;

&lt;p&gt;Bad idea.&lt;/p&gt;

&lt;h2&gt;
  
  
  Jupyter Notebooks: I Was Wrong
&lt;/h2&gt;

&lt;p&gt;I was skeptical.&lt;/p&gt;

&lt;p&gt;"Why not just use .py files like a normal engineer?"&lt;/p&gt;

&lt;p&gt;Then I actually used notebooks for a week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What clicked:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Write code → Run cell → See output immediately.&lt;/p&gt;

&lt;p&gt;No "run entire script and wait."&lt;br&gt;
No "add print statements everywhere to debug."&lt;br&gt;
No "recompile everything for one change."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Cell 1: Load data
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# See it immediately
&lt;/span&gt;
&lt;span class="c1"&gt;# Cell 2: Try something
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;column&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Instant feedback
&lt;/span&gt;
&lt;span class="c1"&gt;# Cell 3: Visualize
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;column&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;hist&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Plot appears inline
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For &lt;strong&gt;exploration&lt;/strong&gt;, this is perfect.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;production code&lt;/strong&gt;, I'll still use .py files.&lt;/p&gt;

&lt;p&gt;But for learning ML? Notebooks make sense.&lt;/p&gt;

&lt;h2&gt;
  
  
  NumPy, Pandas, Matplotlib: The Foundation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;NumPy&lt;/strong&gt; - operations on arrays of numbers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# Instead of loops
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;doubled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;  &lt;span class="c1"&gt;# [2, 4, 6, 8, 10]
&lt;/span&gt;
&lt;span class="c1"&gt;# Fast. Clean. No explicit loops.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the foundation. Nearly every Python ML library builds on NumPy arrays under the hood.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pandas&lt;/strong&gt; - data manipulation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="c1"&gt;# Like SQL, but in Python
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;age&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;salary&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;70000&lt;/span&gt;&lt;span class="p"&gt;]})&lt;/span&gt;
&lt;span class="n"&gt;high_earners&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;salary&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;55000&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After years of SQL and data pipelines, Pandas clicked fast.&lt;/p&gt;

&lt;p&gt;It's what feeds data into ML models.&lt;/p&gt;
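&lt;p&gt;To make the SQL parallel concrete, here's a quick sketch (column names are made up for illustration). The groupby below is the &lt;code&gt;GROUP BY&lt;/code&gt; I would have written in SQL:&lt;/p&gt;

```python
import pandas as pd

# SQL I'd have written: SELECT dept, AVG(salary) FROM df GROUP BY dept
df = pd.DataFrame({
    "dept": ["eng", "eng", "sales"],
    "salary": [50000, 70000, 60000],
})

# The Pandas equivalent: group rows by dept, average each group's salary
avg_by_dept = df.groupby("dept")["salary"].mean()
print(avg_by_dept["eng"])    # 60000.0
print(avg_by_dept["sales"])  # 60000.0
```

&lt;p&gt;Same mental model, different syntax. That's why it clicked fast.&lt;/p&gt;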

&lt;p&gt;&lt;strong&gt;Matplotlib&lt;/strong&gt; - seeing what the data looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scatter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;age&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;salary&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xlabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Age&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ylabel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Salary&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Turns out &lt;strong&gt;visualizing data isn't optional&lt;/strong&gt; in ML. It's how you understand what's actually happening.&lt;/p&gt;

&lt;h2&gt;
  
  
  Starting to Understand AI Concepts
&lt;/h2&gt;

&lt;p&gt;This week I dipped into actual AI/ML concepts:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Embeddings:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The idea: convert text into numbers (vectors) that capture meaning.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sentence_transformers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentenceTransformer&lt;/span&gt;

&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;texts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;distributed systems&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;microservices&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;machine learning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;texts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Each text is now a 384-dimensional vector
# Similar meanings = similar vectors
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I can USE this now. Do I understand HOW the model creates these vectors?&lt;/p&gt;

&lt;p&gt;Not yet. That's future weeks.&lt;/p&gt;
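&lt;p&gt;But I can already verify the "similar meanings = similar vectors" claim with a little math. This sketch uses tiny made-up 3-dimensional vectors (real embeddings are 384-dimensional), just to show how cosine similarity compares them:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 = same direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for real embeddings
distributed = [0.9, 0.1, 0.0]
microservices = [0.8, 0.2, 0.1]
ml = [0.1, 0.1, 0.9]

print(round(cosine_similarity(distributed, microservices), 3))  # 0.984 - similar
print(round(cosine_similarity(distributed, ml), 3))             # 0.121 - not similar
```

&lt;p&gt;Related concepts point in similar directions; unrelated ones don't. That's the whole trick behind semantic search.&lt;/p&gt;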

&lt;p&gt;&lt;strong&gt;Prompt Engineering &amp;amp; Tool Calling:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Started experimenting with LLM APIs in notebooks.&lt;/p&gt;

&lt;p&gt;Followed OpenAI's documentation examples to make basic API calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-3.5-turbo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explain embeddings simply&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not building anything production-ready yet. Just understanding the mechanics.&lt;/p&gt;

&lt;p&gt;But this shift - from using AI through a chat interface to calling it programmatically - matters.&lt;/p&gt;

&lt;p&gt;This is the foundation of &lt;strong&gt;building&lt;/strong&gt; AI applications, not just using them.&lt;/p&gt;
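&lt;p&gt;Tool calling is the part that felt most like backend work. Roughly: you describe your functions to the model in a JSON schema, the model replies with the name and arguments of the one it wants, and your code dispatches the call. Here's a minimal sketch of that dispatch step. The &lt;code&gt;get_weather&lt;/code&gt; function and its data are made up, and the model response is simulated rather than fetched from the API:&lt;/p&gt;

```python
import json

# A hypothetical local function we want the LLM to be able to call
def get_weather(city):
    fake_data = {"Austin": "31C, sunny"}  # toy stand-in for a real API
    return fake_data.get(city, "unknown")

# Describe the function to the model in the Chat Completions tool schema
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# The model would respond with a tool call like this; simulated here
tool_call = {"name": "get_weather", "arguments": json.dumps({"city": "Austin"})}

# Our code dispatches: look up the function by name, parse args, call it
available = {"get_weather": get_weather}
args = json.loads(tool_call["arguments"])
result = available[tool_call["name"]](**args)
print(result)  # 31C, sunny
```

&lt;p&gt;The model never runs code. It just asks; your system executes and sends the result back. That boundary is what makes these systems debuggable.&lt;/p&gt;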

&lt;h2&gt;
  
  
  What's Still Fuzzy
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How embeddings actually work:&lt;/strong&gt; I can use them. Don't understand the training process yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mathematical foundations:&lt;/strong&gt; Linear algebra, probability - I know I need these. Haven't dived deep yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to use what:&lt;/strong&gt; Lots of ML concepts flying around. Don't have mental models yet for "when would I use X vs Y?"&lt;/p&gt;

&lt;p&gt;That's okay. It's week 2.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Claude Code Insight
&lt;/h2&gt;

&lt;p&gt;Here's why this week mattered.&lt;/p&gt;

&lt;p&gt;Claude Code generates Python. Fast.&lt;/p&gt;

&lt;p&gt;But when that code uses NumPy array slicing I don't understand? Or Pandas operations I've never seen? Or tries to fix a bug I can't diagnose?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I'm stuck.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This week was about building enough Python muscle to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read what Claude generates&lt;/li&gt;
&lt;li&gt;Understand what it's doing&lt;/li&gt;
&lt;li&gt;Debug when it's wrong&lt;/li&gt;
&lt;li&gt;Modify when I need something different&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not about becoming a Python expert.&lt;/p&gt;

&lt;p&gt;About being &lt;strong&gt;competent enough to work WITH the AI tools&lt;/strong&gt;, not just be dependent on them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connection to My Background
&lt;/h2&gt;

&lt;p&gt;Something I noticed: &lt;strong&gt;ML data preparation is just ETL.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Extract, Transform, Load - but for models instead of databases.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Load data (Extract)
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_csv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data.csv&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Clean and transform (Transform)
&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dropna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;new_feature&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;col1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;col2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Feed to model (Load)
&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I've spent 18 years building large-scale distributed systems, streaming APIs, backend infrastructure.&lt;/p&gt;

&lt;p&gt;The last few years leading teams building data pipelines.&lt;/p&gt;

&lt;p&gt;ML preprocessing? It's the same ETL pattern I know. Different destination.&lt;/p&gt;

&lt;p&gt;That helped it click.&lt;/p&gt;

&lt;h2&gt;
  
  
  Time Investment
&lt;/h2&gt;

&lt;p&gt;This week: ~12 hours&lt;/p&gt;

&lt;p&gt;Mostly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Practicing Python in notebooks&lt;/li&gt;
&lt;li&gt;Working through NumPy/Pandas basics&lt;/li&gt;
&lt;li&gt;Playing with embeddings&lt;/li&gt;
&lt;li&gt;First LLM API calls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More hands-on than Week 1. Less "watching tutorials," more "writing code and breaking things."&lt;/p&gt;

&lt;p&gt;Week 2 down. Building the muscle I'll actually need.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're also learning Python for AI/ML - what tripped you up coming from other languages?&lt;/em&gt;&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;


---
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>ai</category>
      <category>learning</category>
    </item>
    <item>
      <title>Why I'm Leaving My Comfort Zone: From Engineering Leadership to AI-First Engineering</title>
      <dc:creator>Raju C</dc:creator>
      <pubDate>Sat, 07 Mar 2026 17:28:29 +0000</pubDate>
      <link>https://dev.to/raju_ch_0f28d/why-im-leaving-my-comfort-zone-staff-engineer-ai-1h70</link>
      <guid>https://dev.to/raju_ch_0f28d/why-im-leaving-my-comfort-zone-staff-engineer-ai-1h70</guid>
      <description>&lt;p&gt;I've spent my career architecting distributed systems — designing fault-tolerant pipelines, making trade-offs between consistency and availability, and owning systems end-to-end from design through production.&lt;/p&gt;

&lt;p&gt;Data pipelines. Streaming infrastructure. Backend at scale. I've built systems processing terabytes of data, architected platforms handling millions of requests.&lt;/p&gt;

&lt;p&gt;I'm good at what I do.&lt;/p&gt;

&lt;p&gt;So why am I starting over as a complete beginner in AI/ML?&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happened
&lt;/h2&gt;

&lt;p&gt;Two things made this impossible to ignore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First, my company started hiring AI/ML engineers.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Suddenly there were people in meetings talking about RAG, agentic systems, MCP, and I'd just nod along.&lt;/p&gt;

&lt;p&gt;I had no idea what they were actually building.&lt;/p&gt;

&lt;p&gt;All my experience, and I couldn't contribute to the most important projects at my company.&lt;/p&gt;

&lt;p&gt;That hurt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Second, I started using Claude Code.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Game changer. POCs that took me days? Done in hours. New features? 2-3x faster. Going from 0 to 1 on projects became almost effortless.&lt;/p&gt;

&lt;p&gt;But here's the problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I didn't understand how it worked.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I was using AI. I wasn't building it. Couldn't explain it. Couldn't tell if solutions were actually good or just looked convincing.&lt;/p&gt;

&lt;p&gt;Someone asked me: "How does Claude Code actually work?"&lt;/p&gt;

&lt;p&gt;No answer.&lt;/p&gt;

&lt;p&gt;That's when it hit me.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters to Me
&lt;/h2&gt;

&lt;p&gt;I've always believed &lt;strong&gt;understanding fundamentals lets you solve complex problems.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When I learned distributed systems - really learned them, not just used Kafka but understood partitioning, replication, consensus - that's when I stopped using tools and started building systems.&lt;/p&gt;

&lt;p&gt;That's when I became valuable.&lt;/p&gt;

&lt;p&gt;I need to do the same with AI.&lt;/p&gt;

&lt;p&gt;Not just use it. Understand it. Build it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Scares Me
&lt;/h2&gt;

&lt;p&gt;Transitioning from expert to beginner after leading distributed systems teams.&lt;br&gt;
I'm back to asking foundational questions.&lt;/p&gt;

&lt;p&gt;I should've started this in 2024. Every month I waited, AI moved faster.&lt;br&gt;
But waiting for the "perfect time" would mean never starting.&lt;/p&gt;

&lt;p&gt;Here I am.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Pulls Me Forward
&lt;/h2&gt;

&lt;p&gt;Two things make this worth the fear:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I want to be part of the conversation.&lt;/strong&gt; Not the person nodding along while AI/ML engineers talk. I want to understand what they're building.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I want to build AI systems myself.&lt;/strong&gt; Not just use Claude Code. Build things like it. Understand models, architectures, trade-offs.&lt;/p&gt;

&lt;p&gt;I want to become an AI-first engineer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Now and Not Later
&lt;/h2&gt;

&lt;p&gt;Because I can't afford to fall further behind.&lt;/p&gt;

&lt;p&gt;AI is accelerating too fast. Every day I wait, the gap widens.&lt;/p&gt;

&lt;p&gt;I already feel late. Another year won't make this easier.&lt;/p&gt;

&lt;p&gt;So I'm starting today.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Plan
&lt;/h2&gt;

&lt;p&gt;I'm learning AI/ML fundamentals from the ground up.&lt;/p&gt;

&lt;p&gt;No shortcuts. No just-use-the-framework-without-understanding approach.&lt;/p&gt;

&lt;p&gt;I want to understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How models like Claude actually work&lt;/li&gt;
&lt;li&gt;How to build AI systems from scratch&lt;/li&gt;
&lt;li&gt;How to architect production AI solutions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Timeline?&lt;/strong&gt; Don't have one. Just committed to the journey.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Goal?&lt;/strong&gt; However long it takes - I want to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lead or contribute to AI projects&lt;/li&gt;
&lt;li&gt;Understand how these models actually work&lt;/li&gt;
&lt;li&gt;Become a true AI-first engineer&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I'm documenting this journey here. Not polished tutorials. Real learning in public.&lt;/p&gt;

&lt;p&gt;Wins. Struggles. Confusion. Breakthroughs.&lt;/p&gt;

&lt;p&gt;Week 1 starts now.&lt;/p&gt;

&lt;p&gt;I'm nervous. I'm excited. I'm deep into my career and starting over.&lt;/p&gt;

&lt;p&gt;Let's see where this goes.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're also making a career transition into AI/ML, I'd love to hear about it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>career</category>
      <category>ai</category>
      <category>learning</category>
    </item>
  </channel>
</rss>
