<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Suraj Bera</title>
    <description>The latest articles on DEV Community by Suraj Bera (@suraj_bera).</description>
    <link>https://dev.to/suraj_bera</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3742312%2F50e1edc8-9951-4dd3-bdbe-e803686f646d.png</url>
      <title>DEV Community: Suraj Bera</title>
      <link>https://dev.to/suraj_bera</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/suraj_bera"/>
    <language>en</language>
    <item>
      <title># Day 5 of learning AI Engineering: built a small RAG app over a PDF</title>
      <dc:creator>Suraj Bera</dc:creator>
      <pubDate>Tue, 19 May 2026 18:47:31 +0000</pubDate>
      <link>https://dev.to/suraj_bera/-day-5-of-learning-ai-engineering-built-a-small-rag-app-over-a-pdf-j7p</link>
      <guid>https://dev.to/suraj_bera/-day-5-of-learning-ai-engineering-built-a-small-rag-app-over-a-pdf-j7p</guid>
      <description>&lt;p&gt;I built a small RAG (Retrieval Augmented Generation) project where a user can ask questions from a PDF, and the LLM answers from that PDF along with the page number to look at. The stack is LangChain, OpenAI embeddings, and Qdrant running in Docker.&lt;/p&gt;

&lt;p&gt;A small note before we start: this exact same pipeline is what powers web-apps like an "AI Tutor in Educative", an "AI web page builder". The only thing that changes between those products and my PDF Q&amp;amp;A is the &lt;strong&gt;data source&lt;/strong&gt;. That is the key idea to take away.&lt;/p&gt;

&lt;h2&gt;
  
  
  What RAG is, in one line
&lt;/h2&gt;

&lt;p&gt;Take a document → break it into small chunks → turn each chunk into a vector (a list of numbers) → store those vectors in a database. Later, when the user asks a question, turn the question into a vector too, find the closest chunks, and feed them to an LLM as context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INDEXING (run once)
───────────────────

   PDF file
      │
      ▼
   PyPDFLoader
      │  one Document per page (text + metadata)
      ▼
   RecursiveCharacterTextSplitter
      │  chunk_size = 1000, overlap = 200
      ▼
   ~200 small text chunks
      │
      ▼
   OpenAIEmbeddings (text-embedding-3-small)
      │  each chunk becomes a 1536-dim vector
      ▼
   Qdrant (running in Docker)
      │  vector + chunk text + page metadata stored


QUERY (every user question)
───────────────────────────

   "explain me about variables"
      │
      ▼
   OpenAIEmbeddings
      │  same model, same vector space
      ▼
   Query vector (1536-dim)
      │
      ▼
   Qdrant similarity_search (cosine, top k = 4)
      │  closest 4 chunks come back
      ▼
   Build a prompt with those chunks as context
      │
      ▼
   OpenAI Chat Completion (gpt-5-nano)
      │
      ▼
   Final answer + page citations
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The stack and why I picked it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LangChain&lt;/strong&gt; is the glue. It gives me one interface to talk to many providers. If I want to swap OpenAI for Cohere, or Qdrant for Pinecone, I change one line. It also has loaders for PDFs, websites, Notion, Google Docs, CSVs, image files, and a lot more. Plus tools for chains, agents, memory, prompts, and output parsing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qdrant&lt;/strong&gt; is the vector database. I ran it locally in Docker so I don't have to pay for a managed service while learning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI &lt;code&gt;text-embedding-3-small&lt;/code&gt;&lt;/strong&gt; is the embedding model. More on why this one and not the large one further down.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;uv&lt;/strong&gt; instead of pip. Faster, lockfile-based, and the modern Python experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Some things worth pointing out:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The embedding model is just a client — no API call happens when you create it. The actual call to OpenAI happens inside &lt;code&gt;from_documents&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;from_documents&lt;/code&gt; is the convenience method. It connects to Qdrant, creates the collection, embeds every chunk, and inserts it. One call, three jobs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Now the real question — how big a PDF can I actually ingest?
&lt;/h2&gt;

&lt;p&gt;This is the part I couldn't find a straight answer to in any tutorial. The pipeline above worked beautifully for my 123 KB, 71-page PDF. But what about a 100 MB book? A 1 GB legal document dump?&lt;/p&gt;

&lt;p&gt;The answer has many layers. There is no single PDF size limit.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Hard limit?&lt;/th&gt;
&lt;th&gt;What actually matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PyPDFLoader&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;RAM. The whole PDF gets loaded into memory before chunking.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RecursiveCharacterTextSplitter&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;None. It just splits whatever you give it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI embeddings API&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Tokens per input + rate limits per minute + your wallet.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Qdrant&lt;/td&gt;
&lt;td&gt;Practically no&lt;/td&gt;
&lt;td&gt;Designed for millions of vectors.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Your laptop&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;RAM, disk, and patience.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The actual bottleneck is almost always OpenAI, not Python or Qdrant.&lt;/p&gt;

&lt;h3&gt;
  
  
  What OpenAI actually limits
&lt;/h3&gt;

&lt;p&gt;There are three things to watch on the embeddings API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Max tokens per single input — 8,191&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each chunk you send to be embedded can be at most 8,191 tokens, which is roughly 32,000 characters. My &lt;code&gt;chunk_size=1000&lt;/code&gt; is way below that, so this is never hit in normal RAG. Official source: &lt;a href="https://platform.openai.com/docs/guides/embeddings" rel="noopener noreferrer"&gt;https://platform.openai.com/docs/guides/embeddings&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Rate limits — RPM (requests per minute) and TPM (tokens per minute)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These depend on your account tier. Tier 1, where most new accounts start, looks like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;RPM&lt;/th&gt;
&lt;th&gt;TPM&lt;/th&gt;
&lt;th&gt;Batch queue limit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;40,000&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tier 1&lt;/td&gt;
&lt;td&gt;3,000&lt;/td&gt;
&lt;td&gt;1,000,000&lt;/td&gt;
&lt;td&gt;3,000,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tier 2&lt;/td&gt;
&lt;td&gt;5,000&lt;/td&gt;
&lt;td&gt;1,000,000&lt;/td&gt;
&lt;td&gt;20,000,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tier 3&lt;/td&gt;
&lt;td&gt;5,000&lt;/td&gt;
&lt;td&gt;5,000,000&lt;/td&gt;
&lt;td&gt;100,000,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tier 4&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;5,000,000&lt;/td&gt;
&lt;td&gt;500,000,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tier 5&lt;/td&gt;
&lt;td&gt;10,000&lt;/td&gt;
&lt;td&gt;10,000,000&lt;/td&gt;
&lt;td&gt;4,000,000,000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Your tier moves up automatically as you spend more on the platform.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Official rate limit docs: &lt;a href="https://platform.openai.com/docs/guides/rate-limits/usage-tiers" rel="noopener noreferrer"&gt;https://platform.openai.com/docs/guides/rate-limits/usage-tiers&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Your own account's limits: &lt;a href="https://platform.openai.com/account/limits" rel="noopener noreferrer"&gt;https://platform.openai.com/account/limits&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you hit the TPM, LangChain backs off and retries automatically. So your script doesn't crash — it just takes longer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The cost — $0.02 per 1M tokens for &lt;code&gt;text-embedding-3-small&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the most underrated number. A 100 MB text PDF is roughly 20–50 million tokens. That works out to about $0.40 to $1.00 in embedding cost. Real-world cheap. Pricing page: &lt;a href="https://platform.openai.com/docs/pricing" rel="noopener noreferrer"&gt;https://platform.openai.com/docs/pricing&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>rag</category>
      <category>backenddevelopment</category>
    </item>
    <item>
      <title>Day 3: Prompting Techniques in AI</title>
      <dc:creator>Suraj Bera</dc:creator>
      <pubDate>Tue, 19 May 2026 18:25:53 +0000</pubDate>
      <link>https://dev.to/suraj_bera/day-3-prompting-techniques-in-ai-43ac</link>
      <guid>https://dev.to/suraj_bera/day-3-prompting-techniques-in-ai-43ac</guid>
      <description>&lt;p&gt;AI Prompting techniques: Zero-shot, One-shot, Few-shot&lt;/p&gt;

&lt;p&gt;After using the ChatGPT and other AI tools, I used to think prompts were just simple text inputs that AI models magically processed. But as mentioned in my Day 1 post: AI models are just &lt;strong&gt;next-word predictors&lt;/strong&gt;, not thinkers. They predict based on training data (though modern ones now use real-time search and tool calling for better results).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Basics&lt;/strong&gt;&lt;br&gt;
Simplest python code snippet for getting response from AI&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;responses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-5-nano&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How do I reverse a list in Python?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;most important&lt;/strong&gt; point to note: role key has 3 values: &lt;code&gt;system&lt;/code&gt;, &lt;code&gt;user&lt;/code&gt;, &lt;code&gt;assistant&lt;/code&gt;. The system prompt sets the behaviour.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Zero-shot prompting&lt;/strong&gt;&lt;br&gt;
No examples -- just instructions in the system prompt&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a helpful assistant that only answers Python programming questions.
If the user asks about anything else, politely decline.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. One-shot prompting&lt;/strong&gt;&lt;br&gt;
Provide one example how model should respond.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the capital of France?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m sorry, I can only help with Python programming questions.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How do I reverse a list in Python?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Few-shot prompting&lt;/strong&gt;&lt;br&gt;
Provide multiple examples so the model responds more precisely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How to code a binary tree in Python?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sure, here&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s an implementation...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the weather today?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m sorry, I can only help with Python programming questions.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is the capital of France?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m sorry, I can only help with Python programming questions.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Why is 75% attendance required for the exam?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So as a beginner trying to get into AI engineering, you need to shift your mindset from just chatting with AI mindlessly to &lt;code&gt;designing system prompt for your own AI product&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Complete code for &lt;a href="https://gist.github.com/surajbera/cd0e8a9b0d5bbf36c48a1eb8af132662" rel="noopener noreferrer"&gt;few-shot-prompting&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Links to &lt;a href="https://www.linkedin.com/posts/surajbera_tiktokenizer-activity-7456386292990304256-HgzV?utm_source=social_share_send&amp;amp;utm_medium=member_desktop_web&amp;amp;rcm=ACoAACqk0GoBZnxif8bFvArYATVbMOZugoVL0Ms" rel="noopener noreferrer"&gt;day 1&lt;/a&gt; and &lt;a href="https://dev.to/suraj_bera/part-2-vector-embeddings-in-simplest-terms-2j35"&gt;day 2&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Part 1: Learning Docker for beginners</title>
      <dc:creator>Suraj Bera</dc:creator>
      <pubDate>Mon, 11 May 2026 11:11:11 +0000</pubDate>
      <link>https://dev.to/suraj_bera/part-1-learning-docker-for-beginners-53nb</link>
      <guid>https://dev.to/suraj_bera/part-1-learning-docker-for-beginners-53nb</guid>
      <description>&lt;h2&gt;
  
  
  What is a Dockerfile?
&lt;/h2&gt;

&lt;p&gt;It contains instruction that docker uses to package up an application into an image.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Containers&lt;/strong&gt; and &lt;strong&gt;Images&lt;/strong&gt;, What are these?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Images&lt;/strong&gt;: An image includes everything an application needs to run. It includes application files, environment variables and so on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Container&lt;/strong&gt;: Once we have an image, we can start a container from it. A container is like a VM in a sense that it provides an isolated environment for executing an application. Similar to VM we can stop and restart a container.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Simplified Analogy: Think of images as a class(blueprint) and containers as running instance(objects) for that blueprint. &lt;/p&gt;

&lt;h2&gt;
  
  
  Steps to create a Dockerfile:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Choose a correct base image&lt;/li&gt;
&lt;li&gt;Setup the working directory&lt;/li&gt;
&lt;li&gt;Copy application files into the image. Example: &lt;code&gt;COPY &amp;lt;source&amp;gt; &amp;lt;destination&amp;gt;&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Exclude files &amp;amp; folders(node_modules, venv, etc)&lt;/li&gt;
&lt;li&gt;Adding environment variables&lt;/li&gt;
&lt;li&gt;Exposing ports&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;By default docker runs an application with the root user that has the highest privileges.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Diff. between &lt;strong&gt;CMD&lt;/strong&gt; and &lt;strong&gt;RUN&lt;/strong&gt; instructions:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;With both these we can execute instructions. The RUN instruction is a build time instruction. This is executed at the time of building an image. Example: &lt;code&gt;npm install&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The CMD instruction is a runtime instruction, it's executed when starting a container.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>devops</category>
      <category>docker</category>
      <category>softwaredevelopment</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Day 4: ReAct - Reasoning + Acting upon(Prompting Technique)</title>
      <dc:creator>Suraj Bera</dc:creator>
      <pubDate>Mon, 04 May 2026 15:04:12 +0000</pubDate>
      <link>https://dev.to/suraj_bera/day-4-react-reasoning-acting-uponprompting-technique-3c2j</link>
      <guid>https://dev.to/suraj_bera/day-4-react-reasoning-acting-uponprompting-technique-3c2j</guid>
      <description>&lt;h2&gt;
  
  
  Big picture of ReAct:
&lt;/h2&gt;

&lt;p&gt;AI doesn't just answer immediately, it goes through stages.&lt;br&gt;
think -&amp;gt; plan -&amp;gt; action -&amp;gt; observe -&amp;gt; output(total 5 steps)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ReAct&lt;/strong&gt; = Forcing the AI to show its work step by step before arriving at a conclusion, just like how a student arrives to the solution of a math problem step by step.&lt;/p&gt;

&lt;p&gt;Here is the simplest mental model for ReAct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
┌──────────────────────────────┐   ┌──────────────────────────────┐   ┌──────────────────────────────┐
│ Current State / User Question│ → │ Reason about current state   │ → │ Need more action?            │
└──────────────────────────────┘   └──────────────────────────────┘   └──────────────┬───────────────┘
                                                                                       │
                                                                           ┌───────────┴───────────┐
                                                                           │                       │
                                                                          Yes                      No
                                                                           │                       │
                                                                           v                       v
                                                ┌──────────────────────────────┐   ┌──────────────────────────────┐
                                                │ Choose tool / perform action │   │ Return final answer          │
                                                └──────────────┬───────────────┘   └──────────────────────────────┘
                                                               │
                                                               v
                                                ┌──────────────────────────────┐
                                                │ Observe result               │
                                                └──────────────┬───────────────┘
                                                               │
                                                               └──── back to "Reason about current state"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Comparing Chain of Thought(another prompting technique) vs ReAct:&lt;br&gt;
&lt;strong&gt;CoT&lt;/strong&gt; - In this technique, the LLM is instructed to generate its intermediate reasoning steps as part of the output, instead of jumping straight to the final answer.&lt;br&gt;
&lt;strong&gt;ReAct&lt;/strong&gt; - CoT only thinks, ReAct thinks and does stuffs.&lt;/p&gt;

&lt;p&gt;Here is the &lt;a href="https://gist.github.com/surajbera/9dbea0b77413f5d213791591b7297727" rel="noopener noreferrer"&gt;code&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Links to &lt;a href="https://dev.to/suraj_bera/day-3-prompting-techniques-in-ai-43ac"&gt;Day 3&lt;/a&gt; and &lt;a href="https://dev.to/suraj_bera/part-2-vector-embeddings-in-simplest-terms-2j35"&gt;Day 2&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>llm</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Day 2: Vector Embeddings in simplest terms</title>
      <dc:creator>Suraj Bera</dc:creator>
      <pubDate>Sun, 03 May 2026 05:03:53 +0000</pubDate>
      <link>https://dev.to/suraj_bera/part-2-vector-embeddings-in-simplest-terms-2j35</link>
      <guid>https://dev.to/suraj_bera/part-2-vector-embeddings-in-simplest-terms-2j35</guid>
      <description>&lt;p&gt;This is my Day 2 of learning AI fundamentals where I will be covering  the following concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vector Embeddings&lt;/li&gt;
&lt;li&gt;How Tokenisation and Vector Embeddings relate to each other&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Vector embeddings:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vector embeddings is the process of turning each token id(generated during tokenisation) into high dimensional vector where semantic similarity results into geometric closeness. Think of it like this: dog is closer to puppy, also closer to dog food. But dog is not closer to car or petrol.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  When we use embeddings?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Recommendations: Suggest similar songs, videos, movies, products&lt;/li&gt;
&lt;li&gt;Search: Get search results when keywords don't match&lt;/li&gt;
&lt;li&gt;Cluster: Grouping related concepts together.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A beginner might be confused in terms like: Vector, High Dimensional.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This is an example of a vector: [0.9, 0.8, 0.1]. Array/List/Vector all mean the same thing. 'List' is just a plain english, 'array' is the programming term, vector is the math/ml term.&lt;/li&gt;
&lt;li&gt;High Dimensional: Multi-dimensional just means more than 1 - could be 2D, 3D, 10D,... too vague. But high dimensional specifically means  &lt;strong&gt;hundreds of thousands of dimensions&lt;/strong&gt;(Open AI's text-embedding-3-small = 1536 dimensions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are 2 types of Search:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lexical Search: Exact words/characters search&lt;/li&gt;
&lt;li&gt;Semantic Search: Meaning/Intent search&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;Vector embedding enables semantic search&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How tokenisation &amp;amp; vector embeddings are connected together?
&lt;/h2&gt;

&lt;p&gt;text-tokenisation --&amp;gt; token Ids-embedding lookup --&amp;gt; vectors --&amp;gt; transformers&lt;br&gt;
"hello"             --&amp;gt; [221728]               --&amp;gt; [0.21,-0.44,...]&lt;/p&gt;

&lt;p&gt;&lt;a href="https://projector.tensorflow.org/" rel="noopener noreferrer"&gt;Vector Embedding Visualizer&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is the link to &lt;a href="https://www.linkedin.com/posts/surajbera_tiktokenizer-activity-7456386292990304256-HgzV?utm_source=share&amp;amp;utm_medium=member_desktop&amp;amp;rcm=ACoAACqk0GoBZnxif8bFvArYATVbMOZugoVL0Ms" rel="noopener noreferrer"&gt;part 1&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>beginners</category>
      <category>machinelearning</category>
      <category>nlp</category>
    </item>
  </channel>
</rss>
