<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sanjay Mishra</title>
    <description>The latest articles on DEV Community by Sanjay Mishra (@sanmish4).</description>
    <link>https://dev.to/sanmish4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3785485%2Fa64f7db6-1311-4e43-8e9e-4bd72ac52a13.png</url>
      <title>DEV Community: Sanjay Mishra</title>
      <link>https://dev.to/sanmish4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sanmish4"/>
    <language>en</language>
    <item>
      <title>Every AI Buzzword Explained With the Analogy of Learning to Drive</title>
      <dc:creator>Sanjay Mishra</dc:creator>
      <pubDate>Mon, 02 Mar 2026 06:33:38 +0000</pubDate>
      <link>https://dev.to/sanmish4/every-ai-buzzword-explained-like-youre-learning-to-drive-3809</link>
      <guid>https://dev.to/sanmish4/every-ai-buzzword-explained-like-youre-learning-to-drive-3809</guid>
      <description>&lt;p&gt;You’ve seen the words everywhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM. RAG. Fine-Tuning. Agentic AI. MCP. Embeddings. Temperature.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every article assumes you already know what they mean. Every explainer swaps one jargon word for three others.&lt;/p&gt;

&lt;p&gt;So let’s try something different. One analogy. Stick with it all the way through. By the end, you’ll have a clean mental model for every major AI buzzword — and you’ll actually remember it.&lt;/p&gt;

&lt;p&gt;The analogy: &lt;strong&gt;a person learning to drive a car.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is AI?
&lt;/h2&gt;

&lt;p&gt;AI is like a person learning to drive.&lt;/p&gt;

&lt;p&gt;At first, they know absolutely nothing. Put them behind the wheel and they’ll stare blankly.&lt;/p&gt;

&lt;p&gt;But after watching and practicing — thousands of times — they start learning &lt;em&gt;patterns&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔴 Red light → Stop&lt;/li&gt;
&lt;li&gt;🟢 Green light → Go&lt;/li&gt;
&lt;li&gt;↩️ Turn signal on the car ahead → It may be turning&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI works the same way. Feed it enough examples, and it learns to recognise patterns. Then it uses those patterns to predict what comes next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The critical thing to understand:&lt;/strong&gt; AI does &lt;em&gt;not&lt;/em&gt; think like a human. It doesn’t understand. It predicts — based on what it’s seen before. The more examples, the better the predictions.&lt;/p&gt;
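&lt;p&gt;To make “learn patterns, then predict” concrete, here is a toy sketch in Python. It is nowhere near a real model: it only counts which word follows which in a few made-up sentences, then predicts the most frequent follower. The &lt;code&gt;examples&lt;/code&gt; list and the &lt;code&gt;predict_next&lt;/code&gt; helper are illustrative assumptions.&lt;/p&gt;

```python
from collections import Counter, defaultdict

# Made-up training examples (illustrative only).
examples = [
    "red light means stop",
    "green light means go",
    "red light means stop",
    "red sign means stop",
]

# "Training": count which word follows each word.
followers = defaultdict(Counter)
for sentence in examples:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1

def predict_next(word):
    """Return the follower seen most often after `word`, or None if unseen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

print(predict_next("means"))  # stop (seen 3 times, versus "go" once)
print(predict_next("red"))    # light (seen twice, versus "sign" once)
```

&lt;p&gt;The predictor has no idea what a red light is. It only knows what tended to come next in the examples it saw.&lt;/p&gt;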




&lt;h2&gt;
  
  
  Small Model vs LLM
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Small Model = a new driver fresh out of driving school.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They can handle the basics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They know the pedals&lt;/li&gt;
&lt;li&gt;They can follow clear road signs&lt;/li&gt;
&lt;li&gt;They drive slowly and carefully&lt;/li&gt;
&lt;li&gt;Complex situations? Forget it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;LLM (Large Language Model) = a driver who’s been on the road for years.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This person has driven through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Busy cities at rush hour&lt;/li&gt;
&lt;li&gt;Motorways at 3am&lt;/li&gt;
&lt;li&gt;Torrential rain&lt;/li&gt;
&lt;li&gt;Road rage incidents, construction zones, every edge case imaginable&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They can handle almost anything you throw at them.&lt;/p&gt;

&lt;p&gt;“Large” refers to scale: the model has billions of internal parameters and learned from a &lt;em&gt;massive&lt;/em&gt; number of examples, billions of texts rather than thousands.&lt;/p&gt;

&lt;p&gt;The big LLMs you hear about? Each company has trained their own experienced driver:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Company&lt;/th&gt;
&lt;th&gt;Their “Driver”&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;GPT-4, o3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Meta&lt;/td&gt;
&lt;td&gt;Llama&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;They compete on one question: &lt;em&gt;whose driver handles novel situations better?&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Prompt Engineering
&lt;/h2&gt;

&lt;p&gt;You get in the car and say:&lt;/p&gt;

&lt;p&gt;“Drive.”&lt;/p&gt;

&lt;p&gt;The driver stares at you. Drive &lt;em&gt;where&lt;/em&gt;? How fast? Any stops?&lt;/p&gt;

&lt;p&gt;Now you say:&lt;/p&gt;

&lt;p&gt;“Take me to the airport. Fastest route. Avoid the motorway if there are delays.”&lt;/p&gt;

&lt;p&gt;Now the driver knows exactly what to do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A prompt is the instruction you give the AI.&lt;/strong&gt; The model doesn’t change. Your instruction changes — and it changes everything.&lt;/p&gt;

&lt;p&gt;Prompt engineering is the skill of giving great driving directions. Being specific. Giving context. Saying what you &lt;em&gt;don’t&lt;/em&gt; want as well as what you do.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; Bad prompt:  "Write something about our product."

 Good prompt: "Write a 3-paragraph product description for a B2B SaaS tool
               targeting HR managers. Tone: professional but warm. Focus on
               time savings. End with a soft CTA."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same model. Wildly different outputs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Training
&lt;/h2&gt;

&lt;p&gt;Before your driver became experienced, they had to practice. A lot.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;City driving&lt;/li&gt;
&lt;li&gt;Motorway driving&lt;/li&gt;
&lt;li&gt;Night driving&lt;/li&gt;
&lt;li&gt;Parking&lt;/li&gt;
&lt;li&gt;Roundabouts (the nemesis of every new driver)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thousands of hours across thousands of different situations.&lt;/p&gt;

&lt;p&gt;That practice is called &lt;strong&gt;training&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;AI training works much the same way: you feed the model an enormous number of examples, it adjusts its internal parameters slightly after each one (in practice, after each small batch), and slowly, over months of compute time, it gets better.&lt;/p&gt;

&lt;p&gt;Training a frontier model costs tens of millions of dollars. It’s done &lt;em&gt;once&lt;/em&gt; (per model version) by the big labs. You and I don’t do training. We use the trained model.&lt;/p&gt;
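&lt;p&gt;The “adjust the parameters after each example” loop can be sketched in a few lines. This is a deliberately tiny stand-in, assuming a model with a single parameter &lt;code&gt;w&lt;/code&gt; and made-up data that follows y = 3x. Real models have billions of parameters, but the nudge-and-repeat idea is the same.&lt;/p&gt;

```python
# Toy "training": learn w so that prediction = w * x matches the examples.
# The made-up data follows y = 3 * x, so a well-trained w should end up near 3.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]

w = 0.0              # the model's single internal parameter
learning_rate = 0.01

for epoch in range(200):
    for x, y in examples:
        error = w * x - y               # how wrong the current prediction is
        gradient = 2 * error * x        # slope of the squared error with respect to w
        w -= learning_rate * gradient   # nudge w to reduce the error

print(round(w, 3))  # very close to 3.0
```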




&lt;h2&gt;
  
  
  Fine-Tuning
&lt;/h2&gt;

&lt;p&gt;Your driver is already great at general driving.&lt;/p&gt;

&lt;p&gt;But now you need something specific. You need a &lt;strong&gt;race car driver&lt;/strong&gt;. Or a &lt;strong&gt;lorry driver&lt;/strong&gt; who handles 40-tonne vehicles. Or an &lt;strong&gt;off-road specialist&lt;/strong&gt; for rough terrain.&lt;/p&gt;

&lt;p&gt;You send them on a specialist course. They come back permanently improved — not just in general driving, but in that &lt;em&gt;specific&lt;/em&gt; skill.&lt;/p&gt;

&lt;p&gt;That’s &lt;strong&gt;fine-tuning&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You take an existing model (the experienced driver) and train it further on your specific data — your company’s tone, your industry’s terminology, your support tickets, your legal documents.&lt;/p&gt;

&lt;p&gt;The knowledge is baked in permanently. It’s not looking anything up. It &lt;em&gt;is&lt;/em&gt; the specialised driver now.&lt;/p&gt;

&lt;p&gt;Fine-tuning costs more upfront than alternatives such as prompting or RAG, but the specialisation is deep.&lt;/p&gt;




&lt;h2&gt;
  
  
  RAG — Retrieval-Augmented Generation
&lt;/h2&gt;

&lt;p&gt;Your passenger asks:&lt;/p&gt;

&lt;p&gt;“How long will it take to get to the airport?”&lt;/p&gt;

&lt;p&gt;A bad driver guesses. Maybe confidently. Maybe wrongly.&lt;/p&gt;

&lt;p&gt;A smart driver does this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Opens Google Maps&lt;/li&gt;
&lt;li&gt;Checks live traffic&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Then&lt;/em&gt; answers&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That’s &lt;strong&gt;RAG&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of relying purely on what the AI learned during training, RAG first &lt;em&gt;retrieves&lt;/em&gt; relevant documents — your company wiki, your database, today’s news, a knowledge base — then feeds them to the model as context before it answers.&lt;/p&gt;

&lt;p&gt;The result: answers grounded in real, current information. Not guesses.&lt;/p&gt;
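&lt;p&gt;The retrieve-then-answer flow can be sketched as below. Two assumptions to flag: the three documents are invented, and retrieval here uses naive word overlap as a stand-in for the embedding search a real RAG system would normally use. The sketch stops at building the prompt and does not call a model.&lt;/p&gt;

```python
# Hypothetical mini knowledge base; a real system would use a document store.
documents = [
    "Airport shuttle runs every 20 minutes from the central station.",
    "Road works on the motorway add about 15 minutes to airport trips today.",
    "The office canteen serves lunch from 12 until 2.",
]

def retrieve(question, docs, top_k=2):
    """Rank docs by how many words they share with the question (naive scoring)."""
    question_words = set(question.lower().replace("?", "").split())

    def overlap(doc):
        return len(question_words.intersection(doc.lower().replace(".", "").split()))

    return sorted(docs, key=overlap, reverse=True)[:top_k]

def build_prompt(question, docs):
    """Put the retrieved text in front of the question, as RAG does."""
    context = "\n".join("- " + d for d in retrieve(question, docs))
    return "Answer using only this context:\n" + context + "\n\nQuestion: " + question

print(build_prompt("How long will it take to get to the airport?", documents))
```

&lt;p&gt;The canteen document never reaches the model, so it cannot distract from the answer.&lt;/p&gt;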

&lt;p&gt;&lt;strong&gt;Fine-Tuning vs RAG — a common confusion:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Fine-Tuning&lt;/th&gt;
&lt;th&gt;RAG&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Permanent specialist training&lt;/td&gt;
&lt;td&gt;Checking GPS each time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Knowledge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Baked in&lt;/td&gt;
&lt;td&gt;Fetched live&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Good for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Style, tone, domain expertise&lt;/td&gt;
&lt;td&gt;Up-to-date facts, large document stores&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Higher upfront&lt;/td&gt;
&lt;td&gt;Per-query retrieval cost&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Many production systems use &lt;em&gt;both&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Embeddings &amp;amp; Vector Databases
&lt;/h2&gt;

&lt;p&gt;An experienced driver has a mental map of the city.&lt;/p&gt;

&lt;p&gt;They know:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The airport road connects to the motorway&lt;/li&gt;
&lt;li&gt;The school zone is in the residential area near the park&lt;/li&gt;
&lt;li&gt;Two roads that look different on a map actually connect the same neighbourhoods&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn’t alphabetical knowledge (“A roads, B roads…”). It’s &lt;em&gt;relational&lt;/em&gt; knowledge — understanding which places are &lt;em&gt;semantically close&lt;/em&gt; to each other.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Embeddings&lt;/strong&gt; are exactly this for AI.&lt;/p&gt;

&lt;p&gt;Text is converted into lists of numbers (vectors) that capture &lt;em&gt;meaning&lt;/em&gt;. “Dog” and “puppy” end up with similar numbers. “Paris” and “Eiffel Tower” are close. “Invoice” and “bill” cluster together.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;vector database&lt;/strong&gt; stores all these number-maps so the AI can instantly find &lt;em&gt;semantically similar&lt;/em&gt; content — not just keyword matches. This is the engine that powers RAG. When you ask a question, your question is turned into a vector, and the database finds documents with similar vectors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Search query: "how do I cancel my subscription"
Finds docs about: "ending your plan", "account termination", "billing cancellation"
Even though none of them used the exact words "cancel my subscription"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
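&lt;p&gt;A rough sketch of what “finding similar vectors” means, assuming hand-made three-number vectors in place of real embeddings (which come from a model and have hundreds or thousands of dimensions). The &lt;code&gt;vectors&lt;/code&gt; dictionary plays the role of a very small vector database, and cosine similarity is one common way to measure closeness.&lt;/p&gt;

```python
import math

# Hand-made "embeddings" (assumed values, for illustration only).
vectors = {
    "ending your plan":     [0.9, 0.1, 0.0],
    "billing cancellation": [0.8, 0.3, 0.1],
    "reset your password":  [0.1, 0.9, 0.2],
    "office opening hours": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Close to 1.0 means similar direction (similar meaning); near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vector, top_k=2):
    """Return the top_k stored texts whose vectors are closest to the query."""
    ranked = sorted(
        vectors,
        key=lambda text: cosine_similarity(query_vector, vectors[text]),
        reverse=True,
    )
    return ranked[:top_k]

# Pretend this is the embedding of "how do I cancel my subscription".
query = [0.85, 0.2, 0.05]
print(search(query))
```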






&lt;h2&gt;
  
  
  Temperature
&lt;/h2&gt;

&lt;p&gt;Same driver. Same route. Two very different personalities depending on the day.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Low temperature (0.1):&lt;/strong&gt; Safe, predictable, follows the rules strictly. Always takes the same well-known route. Never improvises.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High temperature (1.0):&lt;/strong&gt; Takes creative shortcuts. Tries new routes you’ve never heard of. Sometimes brilliant. Occasionally ends up in someone’s garden.&lt;/p&gt;

&lt;p&gt;This is literally a parameter you can set when calling an AI model. It controls how &lt;em&gt;random&lt;/em&gt; the model’s outputs are.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Reads ANTHROPIC_API_KEY from the environment
&lt;/span&gt;
&lt;span class="c1"&gt;# Precise, factual tasks: use low temperature
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Required by the Messages API
&lt;/span&gt;   &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Predictable, consistent
&lt;/span&gt;   &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s the capital of France?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Creative writing: use higher temperature
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-5&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   &lt;span class="n"&gt;max_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;256&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Required by the Messages API
&lt;/span&gt;   &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# More varied, creative
&lt;/span&gt;   &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write me a poem about autumn.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; Low temperature for facts. High temperature for creativity.&lt;/p&gt;
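&lt;p&gt;For the curious, here is a sketch of what the temperature number does under the hood, using the common textbook formulation: the model’s raw scores are divided by the temperature before being turned into probabilities. The three candidate words and their scores are invented, and individual providers may sample differently in detail.&lt;/p&gt;

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    biggest = max(scaled)                            # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign to three candidate next words.
words = ["Paris", "Lyon", "banana"]
scores = [4.0, 2.0, 0.5]

for t in (0.1, 1.0):
    probs = softmax_with_temperature(scores, t)
    print(t, dict(zip(words, [round(p, 3) for p in probs])))
```

&lt;p&gt;At 0.1 almost all of the probability lands on the top-scoring word. At 1.0 the other candidates keep a real chance of being picked, which is where the variety comes from.&lt;/p&gt;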




&lt;h2&gt;
  
  
  Agentic AI
&lt;/h2&gt;

&lt;p&gt;Normal AI: you tell the driver “turn left,” “now right,” “stop here.” Every single step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agentic AI:&lt;/strong&gt; you say “take care of my entire journey this week” and go back to reading your book.&lt;/p&gt;

&lt;p&gt;The agent driver:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Plans the routes themselves&lt;/li&gt;
&lt;li&gt;Checks fuel and refuels when needed&lt;/li&gt;
&lt;li&gt;Adjusts dynamically for live traffic&lt;/li&gt;
&lt;li&gt;Finds and pays for parking&lt;/li&gt;
&lt;li&gt;Notifies you when each stage is done&lt;/li&gt;
&lt;li&gt;Handles unexpected problems without calling you&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You gave it a &lt;em&gt;goal&lt;/em&gt;. Not a list of instructions.&lt;/p&gt;

&lt;p&gt;This is the direction the whole industry is racing towards. Instead of answering a single question, agentic AI systems can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Browse the web to research a topic&lt;/li&gt;
&lt;li&gt;Write and execute code&lt;/li&gt;
&lt;li&gt;Send emails and book meetings&lt;/li&gt;
&lt;li&gt;Read and update databases&lt;/li&gt;
&lt;li&gt;Complete multi-step workflows end-to-end&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key difference: &lt;strong&gt;normal AI responds. Agentic AI acts.&lt;/strong&gt;&lt;/p&gt;
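&lt;p&gt;The “give it a goal, let it act” loop can be sketched as follows. This is a toy under loud assumptions: &lt;code&gt;choose_next_action&lt;/code&gt; is a hard-coded stand-in for the decision a real agent would ask an LLM to make, and the actions only update a dictionary instead of touching the real world.&lt;/p&gt;

```python
def choose_next_action(state):
    """Stand-in for the model's decision; a real agent would ask an LLM here."""
    if state["fuel"] == "low":
        return "refuel"
    if state["stops_left"]:
        return "drive"
    return "finish"

def run_agent(stops, max_steps=10):
    """Loop: decide, act, update state, repeat until the goal is reached."""
    state = {"fuel": "low", "stops_left": list(stops), "log": []}
    for _ in range(max_steps):              # hard cap so the loop always ends
        action = choose_next_action(state)
        if action == "refuel":
            state["fuel"] = "ok"
            state["log"].append("refuelled")
        elif action == "drive":
            state["log"].append("drove to " + state["stops_left"].pop(0))
        else:
            state["log"].append("journey complete")
            break
    return state["log"]

print(run_agent(["office", "airport"]))
# ['refuelled', 'drove to office', 'drove to airport', 'journey complete']
```

&lt;p&gt;The step cap is a small but deliberate design choice: an agent that decides its own next move needs a limit on how long it may keep going.&lt;/p&gt;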




&lt;h2&gt;
  
  
  Tools
&lt;/h2&gt;

&lt;p&gt;Even the best driver is limited without equipment.&lt;/p&gt;

&lt;p&gt;With just their skills: good, but constrained.&lt;/p&gt;

&lt;p&gt;With tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; GPS — knows exactly where to go&lt;/li&gt;
&lt;li&gt; Fuel gauge — knows when to stop&lt;/li&gt;
&lt;li&gt; Traffic alerts — knows what to avoid&lt;/li&gt;
&lt;li&gt; Payment app — can pay tolls and parking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tools multiply what the driver can do.&lt;/p&gt;

&lt;p&gt;AI tools work identically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Web search&lt;/strong&gt; — access real-time information&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code interpreter&lt;/strong&gt; — write and run actual code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database access&lt;/strong&gt; — read and write data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;APIs&lt;/strong&gt; — interact with external services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A model &lt;em&gt;with&lt;/em&gt; tools can do things a model &lt;em&gt;without&lt;/em&gt; tools simply cannot. It’s not smarter — it’s better equipped.&lt;/p&gt;
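&lt;p&gt;In code, a “tool” is usually just a function the model is allowed to ask for by name. A minimal sketch, assuming two made-up tools and a hard-coded request in place of real model output:&lt;/p&gt;

```python
# Hypothetical tools: plain Python functions standing in for real integrations.
def get_fuel_level():
    return "Fuel at 40 percent"

def get_traffic(route):
    return "Heavy traffic on " + route

TOOLS = {"get_fuel_level": get_fuel_level, "get_traffic": get_traffic}

def run_tool_call(call):
    """Execute a tool request shaped like {'name': ..., 'arguments': {...}}."""
    tool = TOOLS.get(call["name"])
    if tool is None:
        return "Unknown tool: " + call["name"]
    return tool(**call.get("arguments", {}))

# In a real system the model would produce this request; here it is hard-coded.
model_request = {"name": "get_traffic", "arguments": {"route": "M25"}}
print(run_tool_call(model_request))  # Heavy traffic on M25
```

&lt;p&gt;The model never runs anything itself. It names the tool and the arguments, the surrounding application executes the call, and the result is handed back as text.&lt;/p&gt;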




&lt;h2&gt;
  
  
  MCP — Model Context Protocol
&lt;/h2&gt;

&lt;p&gt;Your driver needs to use GPS, the fuel system, the traffic app, and the payment system.&lt;/p&gt;

&lt;p&gt;Old approach: custom wiring for every combination. GPS talks to the car one way. Traffic app talks a different way. Payment system yet another way. A mess of proprietary connections.&lt;/p&gt;

&lt;p&gt;New approach: &lt;strong&gt;one standard connection port&lt;/strong&gt;. Every system plugs in the same way. Add a new tool? Plug it in. Works instantly.&lt;/p&gt;

&lt;p&gt;That’s &lt;strong&gt;MCP&lt;/strong&gt; (Model Context Protocol) — an open standard developed by Anthropic for how AI models connect to external tools and data sources.&lt;/p&gt;

&lt;p&gt;Before MCP: every developer built custom integrations between their AI and their tools. Hundreds of different “wiring” approaches, incompatible with each other.&lt;/p&gt;

&lt;p&gt;After MCP: build one MCP server, and &lt;em&gt;any&lt;/em&gt; compatible AI can use it. Standardised. Composable. Interoperable.&lt;/p&gt;

&lt;p&gt;It’s the USB-C of AI integrations.&lt;/p&gt;
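&lt;p&gt;To give a feel for what “one standard connection port” looks like on the wire: MCP messages are JSON-RPC 2.0, and a tool invocation uses the &lt;code&gt;tools/call&lt;/code&gt; method. The sketch below only builds such a message. The tool name &lt;code&gt;get_traffic&lt;/code&gt; and its &lt;code&gt;route&lt;/code&gt; argument are hypothetical, and the specification remains the authority on the exact fields.&lt;/p&gt;

```python
import json

def make_tool_call_request(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request of the kind MCP uses to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# "get_traffic" is a made-up tool, not one offered by any real MCP server.
request = make_tool_call_request(1, "get_traffic", {"route": "M25"})
print(json.dumps(request, indent=2))
```

&lt;p&gt;Because every server accepts the same message shape, swapping the traffic tool for a payment tool changes the name and arguments, not the wiring.&lt;/p&gt;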




&lt;h2&gt;
  
  
  Hallucination
&lt;/h2&gt;

&lt;p&gt;The driver confidently tells you:&lt;/p&gt;

&lt;p&gt;“Oh yes, I know this road well. My grandfather drove this route for 40 years. There’s a great shortcut just ahead.”&lt;/p&gt;

&lt;p&gt;There is no grandfather. There is no shortcut. The driver made it up — with complete confidence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hallucination&lt;/strong&gt; is when AI generates false information that sounds completely plausible.&lt;/p&gt;

&lt;p&gt;It’s not lying. It has no concept of lying. It’s generating the most &lt;em&gt;statistically likely&lt;/em&gt; continuation of the conversation — and sometimes that means inventing facts that sound right but aren’t.&lt;/p&gt;

&lt;p&gt;RAG and grounding techniques reduce hallucinations by anchoring the model to real documents. But they don’t eliminate the problem entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The rule:&lt;/strong&gt; always verify important AI outputs. Especially dates, statistics, citations, and anything that would be embarrassing if wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tokens
&lt;/h2&gt;

&lt;p&gt;The taxi meter doesn’t charge by journey. It charges by distance — every fraction of a mile.&lt;/p&gt;

&lt;p&gt;AI doesn’t process words. It processes &lt;strong&gt;tokens&lt;/strong&gt; — chunks of characters, roughly ¾ of a word each.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"Hello, world!"   → ~4 tokens
"The quick brown fox" → ~5 tokens
"Retrieval-Augmented Generation" → ~5 tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why does it matter?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pricing&lt;/strong&gt; is per token (both input and output)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context limits&lt;/strong&gt; are measured in tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt; depends on token count&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every API call to an AI model is metered. The more you send, and the more the model generates back, the more tokens you consume — and the more you pay.&lt;/p&gt;
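&lt;p&gt;The meter arithmetic is simple enough to sketch. Two assumptions are baked in: the token estimate uses the rough ¾-of-a-word rule from above rather than a real tokenizer, and the prices are placeholders, not any provider’s actual rates.&lt;/p&gt;

```python
def estimate_tokens(text):
    """Rough estimate from the 3/4-word rule of thumb; real tokenizers differ."""
    words = len(text.split())
    return round(words / 0.75)

def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, with both prices quoted per one million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Placeholder prices per million tokens (assumed values for illustration).
INPUT_PRICE = 3.0
OUTPUT_PRICE = 15.0

prompt = "Summarise the attached quarterly report in three bullet points"
tokens_in = estimate_tokens(prompt)   # 9 words, so roughly 12 tokens
print(tokens_in)
print(estimate_cost(tokens_in, 200, INPUT_PRICE, OUTPUT_PRICE))
```

&lt;p&gt;Input and output are priced separately here because, as noted above, both directions are metered.&lt;/p&gt;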




&lt;h2&gt;
  
  
  Context Window
&lt;/h2&gt;

&lt;p&gt;The driver can only hold so many things in their working memory at once.&lt;/p&gt;

&lt;p&gt;If you brief them on 10 things before the journey, they’ll probably remember all 10. Brief them on 200 things and they’ll start forgetting the early ones by the time you hit the road.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;context window&lt;/strong&gt; is how much the AI can “see” and “remember” at once — the conversation history, any documents you’ve provided, the system prompt, your new message. All of it has to fit.&lt;/p&gt;

&lt;p&gt;Modern models have huge context windows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GPT-4o: ~128K tokens (~96,000 words)&lt;/li&gt;
&lt;li&gt;Claude: up to 200K tokens (~150,000 words)&lt;/li&gt;
&lt;li&gt;Gemini 1.5 Pro: up to 1M tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Beyond the window? The model simply can’t see it. In a long conversation, the oldest messages are typically the first to be dropped.&lt;/p&gt;
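&lt;p&gt;One simple way an application might keep a conversation inside the window is to keep the newest messages and drop the oldest. The sketch below assumes a crude word-based token estimate and a tiny made-up budget. Real applications use the model’s own tokenizer and often summarise old messages instead of discarding them.&lt;/p&gt;

```python
def estimate_tokens(text):
    """Very rough: about 4 tokens for every 3 words."""
    return round(len(text.split()) / 0.75)

def fit_to_window(messages, max_tokens):
    """Keep the newest messages whose combined estimate fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):      # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break                           # everything older is dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))             # restore chronological order

history = [
    "My flight leaves at six in the evening",
    "I need to stop for fuel on the way",
    "Avoid the motorway if there are delays",
    "How long will the drive take",
]
print(fit_to_window(history, max_tokens=25))
# ['Avoid the motorway if there are delays', 'How long will the drive take']
```

&lt;p&gt;Notice what was lost: the flight time. The model answering the last question no longer knows when the flight leaves.&lt;/p&gt;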




&lt;h2&gt;
  
  
  The Complete Cheat Sheet
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;The Driving Analogy&lt;/th&gt;
&lt;th&gt;Plain English&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The whole concept of a driver learning&lt;/td&gt;
&lt;td&gt;Machines that learn patterns from examples and predict&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Small Model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;New learner driver&lt;/td&gt;
&lt;td&gt;Basic skills, limited ability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Experienced driver who’s seen everything&lt;/td&gt;
&lt;td&gt;Trained on billions of examples&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Your driving directions&lt;/td&gt;
&lt;td&gt;Instruction you give the AI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt Engineering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Learning to give great directions&lt;/td&gt;
&lt;td&gt;Crafting inputs for better outputs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Training&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Driving school + years of practice&lt;/td&gt;
&lt;td&gt;Teaching the model from examples&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fine-Tuning&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Specialist driving course&lt;/td&gt;
&lt;td&gt;Adapting a model to a specific domain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RAG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Driver checks GPS before answering&lt;/td&gt;
&lt;td&gt;Retrieves real docs before responding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Embeddings&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Road map in the driver’s head&lt;/td&gt;
&lt;td&gt;Text turned into vectors that capture meaning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Vector DB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The map library&lt;/td&gt;
&lt;td&gt;Database searchable by meaning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Temperature&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cautious vs. creative driving style&lt;/td&gt;
&lt;td&gt;Controls randomness in outputs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agentic AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Driver who runs the whole trip solo&lt;/td&gt;
&lt;td&gt;AI that plans and acts autonomously&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tools&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GPS, fuel gauge, payment app&lt;/td&gt;
&lt;td&gt;Web search, APIs, code execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Universal car connection port&lt;/td&gt;
&lt;td&gt;Standard protocol for AI ↔ tools&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hallucination&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Driver invents a non-existent shortcut&lt;/td&gt;
&lt;td&gt;AI confidently states false things&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Token&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Individual meter ticks&lt;/td&gt;
&lt;td&gt;Unit of text AI processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context Window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Driver’s working memory&lt;/td&gt;
&lt;td&gt;How much AI can see at once&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;These aren’t separate things. They stack.&lt;/p&gt;

&lt;p&gt;A production AI system in 2025 typically looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You write a &lt;strong&gt;prompt&lt;/strong&gt; (clear directions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RAG&lt;/strong&gt; retrieves relevant documents (driver checks GPS)&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;fine-tuned LLM&lt;/strong&gt; reads everything (specialist driver applies training)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embeddings&lt;/strong&gt; powered the retrieval (mental map found related content)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic&lt;/strong&gt; behaviour kicks in if action is needed (driver handles the trip)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt; via &lt;strong&gt;MCP&lt;/strong&gt; connect it to the real world (GPS + payment + everything else)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every buzzword is one piece of the same vehicle.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If this helped, share it with whoever is still nodding along blankly when their CTO says “we’re building a RAG pipeline with an agentic layer.” They’ll thank you.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>rag</category>
      <category>agentaichallenge</category>
    </item>
    <item>
      <title>Unbounded Data Fetching: A Silent Performance Anti-Pattern in API and Database Layers</title>
      <dc:creator>Sanjay Mishra</dc:creator>
      <pubDate>Sun, 22 Feb 2026 20:50:40 +0000</pubDate>
      <link>https://dev.to/sanmish4/unbounded-data-fetching-a-silent-performance-anti-pattern-in-api-and-database-layers-1dnk</link>
      <guid>https://dev.to/sanmish4/unbounded-data-fetching-a-silent-performance-anti-pattern-in-api-and-database-layers-1dnk</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;One of the most pervasive and costly performance anti-patterns in backend development is unbounded data fetching — querying the database for an entire result set when only a fraction of that data is needed by the caller. This pattern is deceptively simple to introduce, difficult to detect in development environments with limited data, and expensive in production systems operating at scale.&lt;/p&gt;

&lt;p&gt;This article examines where unbounded fetching occurs, why it degrades performance across the full request lifecycle, and how to eliminate it at each layer of the stack — from SQL queries to ORM abstractions to API contract design.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is Unbounded Fetching?
&lt;/h2&gt;

&lt;p&gt;Unbounded fetching occurs when an application retrieves more data from a data source than it intends to use or return. The most common manifestation is a query with no &lt;code&gt;LIMIT&lt;/code&gt;, &lt;code&gt;TOP&lt;/code&gt;, &lt;code&gt;ROWNUM&lt;/code&gt;, or equivalent constraint — combined with application-side filtering or slicing after the result set has already been transferred.&lt;/p&gt;

&lt;p&gt;Consider the following example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python + cx_Oracle (wrapped in an illustrative function so the return is valid)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;first_ten_orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM orders&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cursor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchall&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Transfers every row in the table
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# Keeps 10, discards the rest&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output of this code — 10 rows — is identical to a query that fetches only 10 rows. The cost is not. The database must evaluate and serialize the full result set, the network must carry it, the driver must deserialize it, and the application server must allocate memory to hold it. Every step in this pipeline scales with the number of rows returned by the query, not the number of rows ultimately used.&lt;/p&gt;

&lt;p&gt;The same pattern appears in other forms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Application-side filtering after full fetch
&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM users WHERE active = true&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;admins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;admin&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# ORM equivalent
&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;recent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Node.js / Sequelize (JavaScript, shown inline; same anti-pattern in another stack)
&lt;/span&gt;&lt;span class="n"&gt;const&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findAll&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In each case, the constraint is applied in application code rather than being pushed down to the database where it belongs.&lt;/p&gt;
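&lt;p&gt;As a rough illustration of pushing the constraint down, the sketch below uses Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module with an in-memory table. The &lt;code&gt;users&lt;/code&gt; schema, the sample data, and the function names are assumptions made up for this example. Both functions return the same rows, but only the first one moves every active user across the driver boundary to get there.&lt;/p&gt;

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory users table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, role TEXT, active INTEGER)"
    )
    # 1,000 active users; every 100th one is an admin.
    rows = [(i, "admin" if i % 100 == 0 else "member", 1) for i in range(1, 1001)]
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    return conn


def admins_filtered_in_app(conn: sqlite3.Connection) -> list:
    # Anti-pattern: all active users are fetched, then filtered in Python.
    users = conn.execute(
        "SELECT id, role FROM users WHERE active = 1 ORDER BY id"
    ).fetchall()
    return [u for u in users if u[1] == "admin"]


def admins_filtered_in_sql(conn: sqlite3.Connection) -> list:
    # Pushed down: the database returns only the matching rows.
    return conn.execute(
        "SELECT id, role FROM users WHERE active = 1 AND role = ? ORDER BY id",
        ("admin",),
    ).fetchall()
```

&lt;p&gt;The two results are identical; the difference is how many rows the database had to hand over to produce them.&lt;/p&gt;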




&lt;h2&gt;
  
  
  The Performance Cost Across Layers
&lt;/h2&gt;

&lt;p&gt;Understanding why unbounded fetching is expensive requires examining what happens at each layer of the data path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Database Layer
&lt;/h3&gt;

&lt;p&gt;Without a row-limiting clause, the database query optimizer cannot use early-exit optimizations. A full table scan or index scan is executed to completion. For a table with 700,000 rows and 9 columns, this means evaluating and preparing all 700,000 rows for transfer regardless of how many the application will use.&lt;/p&gt;

&lt;p&gt;In Oracle, for example, a &lt;code&gt;ROWNUM &amp;lt;= n&lt;/code&gt; predicate lets the execution plan stop evaluating rows as soon as the limit is reached (it shows up in the plan as a &lt;code&gt;COUNT STOPKEY&lt;/code&gt; step). Without it, the plan has no such signal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Network Layer
&lt;/h3&gt;

&lt;p&gt;Serialized row data must travel from the database server to the application server. On a local network, this is fast but not free. In a cloud architecture where the database and application tier are separated — even within the same region — data transfer incurs latency proportional to payload size, and in some configurations, measurable egress costs.&lt;/p&gt;

&lt;p&gt;A table with 700,000 rows of average row size 200 bytes represents approximately 140 MB of raw data per unbounded query. Under moderate concurrency, this becomes a significant bandwidth consumer.&lt;/p&gt;
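&lt;p&gt;The arithmetic behind that figure can be sketched in a few lines of Python. The row count and row size come from the example above; the request rate of 5 per second is purely an assumption to show how quickly the bandwidth requirement grows, and the function names are hypothetical.&lt;/p&gt;

```python
def payload_megabytes(rows: int, avg_row_bytes: int) -> float:
    """Raw payload size in decimal megabytes (1 MB = 1,000,000 bytes)."""
    return rows * avg_row_bytes / 1_000_000


def bandwidth_megabits_per_second(payload_mb: float, requests_per_second: float) -> float:
    """Sustained bandwidth needed to carry the payload at a given request rate."""
    return payload_mb * 8 * requests_per_second


unbounded_mb = payload_megabytes(700_000, 200)  # full table per request
bounded_mb = payload_megabytes(10, 200)         # 10 rows per request

# Assumed load: 5 unbounded requests per second.
unbounded_mbps = bandwidth_megabits_per_second(unbounded_mb, 5)
```

&lt;p&gt;This ignores protocol overhead and compression, so it is a lower-bound estimate rather than a measurement.&lt;/p&gt;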

&lt;h3&gt;
  
  
  Application Layer
&lt;/h3&gt;

&lt;p&gt;The database driver deserializes the wire-format response into language-level objects — Python tuples, Java POJOs, JavaScript objects. Memory is allocated for the full result set before any application logic executes. For Python's &lt;code&gt;cx_Oracle&lt;/code&gt; or &lt;code&gt;psycopg2&lt;/code&gt;, a 700,000-row result set can consume several hundred megabytes of heap memory per request. Under concurrent load, this accelerates memory pressure and can trigger garbage collection cycles that further degrade throughput.&lt;/p&gt;

&lt;p&gt;The combined effect across layers produces response times that are orders of magnitude slower than equivalent bounded queries — not because the application logic is slow, but because the data pipeline is doing unnecessary work at every stage.&lt;/p&gt;
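&lt;p&gt;One way to observe the application-side allocation is to measure peak Python heap usage while a result set is materialized. The sketch below is a minimal experiment, assuming an in-memory SQLite table as a stand-in for a real database and using the standard-library &lt;code&gt;tracemalloc&lt;/code&gt; module; the table layout and helper names are hypothetical, and absolute numbers will differ by driver and platform.&lt;/p&gt;

```python
import sqlite3
import tracemalloc


def build_orders(row_count: int) -> sqlite3.Connection:
    """Create an in-memory orders table with row_count synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, note TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        ((i, "PENDING", "row-%d" % i) for i in range(row_count)),
    )
    return conn


def peak_fetch_bytes(conn: sqlite3.Connection, sql: str) -> int:
    """Peak traced Python memory (bytes) while the full result of sql is fetched."""
    tracemalloc.start()
    try:
        rows = conn.execute(sql).fetchall()  # materializes every returned row
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak
```

&lt;p&gt;Calling &lt;code&gt;peak_fetch_bytes&lt;/code&gt; with &lt;code&gt;SELECT * FROM orders&lt;/code&gt; and then with &lt;code&gt;SELECT * FROM orders LIMIT 10&lt;/code&gt; on the same connection shows the gap directly: the first grows with the table, the second stays roughly constant.&lt;/p&gt;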




&lt;h2&gt;
  
  
  Correcting the Pattern at the SQL Layer
&lt;/h2&gt;

&lt;p&gt;The fix must be applied at the source — in the SQL query itself. Each major database platform provides a mechanism for row limiting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Oracle (legacy)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ROWNUM&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Oracle 12c+ / ANSI SQL standard&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- PostgreSQL / MySQL / SQLite&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- SQL Server&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;TOP&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For filtered queries, the constraint should be combined with appropriate indexing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- With filtering and ordering&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'PENDING'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;FETCH&lt;/span&gt; &lt;span class="k"&gt;FIRST&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;ONLY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach allows the database query optimizer to select an execution plan that respects the row limit. In many cases, this enables index-based access paths that avoid full table scans entirely.&lt;/p&gt;
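&lt;p&gt;Whether a given query actually takes an index-based path can be checked by asking the database for its plan. The following sketch does this with SQLite's &lt;code&gt;EXPLAIN QUERY PLAN&lt;/code&gt; as a stand-in (SQLite uses &lt;code&gt;LIMIT&lt;/code&gt; rather than &lt;code&gt;FETCH FIRST&lt;/code&gt;). The index name &lt;code&gt;idx_orders_status_created&lt;/code&gt; and the table definition are assumptions for the example; other databases expose the same idea through their own &lt;code&gt;EXPLAIN&lt;/code&gt; tooling.&lt;/p&gt;

```python
import sqlite3


def explain(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, created_at TEXT, amount REAL)"
)
# Composite index: the filter column first, then the sort column.
conn.execute("CREATE INDEX idx_orders_status_created ON orders (status, created_at)")

BOUNDED_SQL = (
    "SELECT * FROM orders WHERE status = ? "
    "ORDER BY created_at DESC LIMIT 25"
)

plan = explain(conn, BOUNDED_SQL, ("PENDING",))
# True when the planner reports using the composite index for this query.
uses_index = any("idx_orders_status_created" in step for step in plan)
```

&lt;p&gt;Putting the equality column before the sort column in the index is a design choice: it lets the database locate the matching status and read rows already in &lt;code&gt;created_at&lt;/code&gt; order, so it can stop after 25.&lt;/p&gt;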




&lt;h2&gt;
  
  
  Correcting the Pattern at the ORM Layer
&lt;/h2&gt;

&lt;p&gt;ORMs introduce an additional risk: their abstractions can obscure the SQL being generated, making it easy to write code that looks harmless but produces expensive unbounded queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQLAlchemy (Python):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Unbounded — generates SELECT * FROM orders with no LIMIT
&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Bounded — generates SELECT * FROM orders LIMIT 10
&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# With filtering
&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;PENDING&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;order_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Django ORM (Python):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Unbounded — fetches all records into memory before slicing
&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;())[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# list() forces evaluation
&lt;/span&gt;
&lt;span class="c1"&gt;# Bounded — Django QuerySet slicing translates to LIMIT in SQL
&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;()[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Correct pattern with ordering
&lt;/span&gt;&lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;objects&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;PENDING&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;order_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-created_at&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[:&lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: Django QuerySet slicing generates a &lt;code&gt;LIMIT&lt;/code&gt; clause only when the QuerySet has not been previously evaluated. Calling &lt;code&gt;list()&lt;/code&gt;, &lt;code&gt;len()&lt;/code&gt;, or iterating the QuerySet before slicing evaluates it fully.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sequelize (Node.js):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Unbounded&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findAll&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Bounded&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findAll&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;order&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;createdAt&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DESC&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Spring Data JPA (Java):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Unbounded — loads entire collection&lt;/span&gt;
&lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Order&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;orderRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;subList&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Bounded — uses Pageable to generate LIMIT/OFFSET&lt;/span&gt;
&lt;span class="nc"&gt;Pageable&lt;/span&gt; &lt;span class="n"&gt;pageable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PageRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;by&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"createdAt"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;descending&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
&lt;span class="nc"&gt;Page&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Order&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;orderRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findAll&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pageable&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getContent&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A practical measure during code review is to flag any application-side slicing operation (&lt;code&gt;[:10]&lt;/code&gt;, &lt;code&gt;.subList()&lt;/code&gt;, &lt;code&gt;.slice()&lt;/code&gt;, &lt;code&gt;.take()&lt;/code&gt;) that follows a database query call. These are strong indicators that limiting is happening in the wrong layer.&lt;/p&gt;
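&lt;p&gt;For Python codebases, part of this review step can be automated. The sketch below is a deliberately simple heuristic built on the standard-library &lt;code&gt;ast&lt;/code&gt; module: it reports lines where the result of a fetch-everything call is sliced directly. The set of method names and the function name are assumptions for illustration, not an established lint rule.&lt;/p&gt;

```python
import ast

# Method names treated as "fetch everything" calls (an assumed, non-exhaustive list).
FETCH_METHODS = {"all", "fetchall", "findAll"}


def find_sliced_fetches(source: str) -> list:
    """Return sorted line numbers where a fetch-all call is sliced directly."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only interested in slice subscripts such as value[:10].
        if not isinstance(node, ast.Subscript) or not isinstance(node.slice, ast.Slice):
            continue
        call = node.value
        if (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and call.func.attr in FETCH_METHODS
        ):
            hits.append(node.lineno)
    return sorted(hits)


SAMPLE = (
    "rows = cursor.fetchall()[:10]\n"
    "ok = session.query(Order).limit(10).all()\n"
    "bad = session.query(Order).all()[:10]\n"
)
flagged = find_sliced_fetches(SAMPLE)
```

&lt;p&gt;Because it only looks at syntax, this check also flags lazily evaluated cases such as Django's &lt;code&gt;Order.objects.all()[:10]&lt;/code&gt;, which is bounded. Hits are prompts for a reviewer, not verdicts.&lt;/p&gt;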




&lt;h2&gt;
  
  
  Correcting the Pattern at the API Layer
&lt;/h2&gt;

&lt;p&gt;Unbounded fetching is not only a database concern. An API endpoint that returns an unbounded collection by default has the same problem at a higher level of abstraction — it transfers more data than the client needs and places no constraint on the downstream database query.&lt;/p&gt;

&lt;p&gt;Well-designed APIs enforce explicit, bounded result sets through pagination:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /api/orders?page=1&amp;amp;limit=25
GET /api/orders?cursor=eyJpZCI6MTAwfQ==&amp;amp;limit=25
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two pagination strategies are in common use:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Offset-based pagination&lt;/strong&gt; is straightforward to implement and supports random page access, but it slows down at high offsets because the database must still read and discard every skipped row before returning the page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt; &lt;span class="k"&gt;OFFSET&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
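&lt;p&gt;The offset itself is derived from the page number and page size. A minimal sketch, assuming 1-based page numbers (the helper name is hypothetical):&lt;/p&gt;

```python
def page_to_offset(page: int, limit: int) -> int:
    """Convert a 1-based page number and page size into a SQL OFFSET value."""
    if not (page >= 1 and limit >= 1):
        raise ValueError("page and limit must both be at least 1")
    return (page - 1) * limit


offset_for_page_21 = page_to_offset(21, 25)
```

&lt;p&gt;Page 21 at 25 rows per page gives the &lt;code&gt;OFFSET 500&lt;/code&gt; used in the query above, which means 500 rows are read and thrown away before the first row is returned.&lt;/p&gt;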



&lt;p&gt;&lt;strong&gt;Cursor-based pagination&lt;/strong&gt; uses a stable reference point (typically a primary key or timestamp) to avoid the offset penalty. It is the preferred approach for large datasets and real-time data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;cursor_id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
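&lt;p&gt;The sketch below shows one way the pieces could fit together: an opaque cursor token that wraps the last-seen &lt;code&gt;id&lt;/code&gt;, and a page fetch that uses it as the starting point. It assumes a SQLite connection with an &lt;code&gt;orders(id, status)&lt;/code&gt; table and a base64-encoded JSON token; the function names and token layout are illustrative choices, not a standard.&lt;/p&gt;

```python
import base64
import json
import sqlite3


def encode_cursor(last_id: int) -> str:
    """Pack the last-seen id into an opaque, URL-safe token."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token: str) -> int:
    """Recover the last-seen id from a token produced by encode_cursor."""
    return int(json.loads(base64.urlsafe_b64decode(token))["id"])


def fetch_page(conn: sqlite3.Connection, token=None, limit: int = 25):
    """Return (rows, next_token); next_token is None once a short page signals the end."""
    if token is None:
        sql = "SELECT id, status FROM orders ORDER BY id DESC LIMIT ?"
        params = (limit,)
    else:
        # "? > id" keeps only ids below the cursor, i.e. the rows that
        # follow it in descending order. No rows are skipped and discarded.
        sql = "SELECT id, status FROM orders WHERE ? > id ORDER BY id DESC LIMIT ?"
        params = (decode_cursor(token), limit)
    rows = conn.execute(sql, params).fetchall()
    next_token = encode_cursor(rows[-1][0]) if len(rows) == limit else None
    return rows, next_token
```

&lt;p&gt;With this encoding, &lt;code&gt;encode_cursor(100)&lt;/code&gt; produces &lt;code&gt;eyJpZCI6MTAwfQ==&lt;/code&gt;, the token shown in the example request earlier. Keeping the token opaque leaves room to change what it contains (for example, adding a timestamp tiebreaker) without changing the API contract.&lt;/p&gt;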



&lt;p&gt;Regardless of strategy, two enforcement rules should apply at the API layer:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Default to a safe page size.&lt;/strong&gt; A request that does not specify a limit should receive a bounded default (e.g., 20 or 25 records), not the full table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enforce a server-side maximum.&lt;/strong&gt; Client-supplied limits should be capped:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;DEFAULT_PAGE_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;
&lt;span class="n"&gt;MAX_PAGE_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;DEFAULT_PAGE_SIZE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
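    &lt;span class="n"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_PAGE_SIZE&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# clamp to 1..MAX_PAGE_SIZE&lt;/span&gt;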
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM orders ORDER BY created_at DESC FETCH FIRST :n ROWS ONLY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents a single API call from triggering an unbounded database query regardless of what the client requests.&lt;/p&gt;
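&lt;p&gt;In practice the limit usually arrives as a raw query-string value, so it may be missing, empty, non-numeric, or negative. A small sketch of how both rules could be applied at that boundary follows. Falling back to the default on unparseable input is an assumption made here; returning a 400 error is an equally reasonable policy. The function name is hypothetical.&lt;/p&gt;

```python
DEFAULT_PAGE_SIZE = 25
MAX_PAGE_SIZE = 100


def resolve_limit(raw) -> int:
    """Turn a raw query-string value into a page size between 1 and MAX_PAGE_SIZE."""
    if raw is None or raw == "":
        return DEFAULT_PAGE_SIZE       # rule 1: bounded default
    try:
        requested = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_PAGE_SIZE       # assumed policy: ignore unparseable input
    # rule 2: cap at the server-side maximum, and never go below one row.
    return max(1, min(requested, MAX_PAGE_SIZE))
```

&lt;p&gt;For example, &lt;code&gt;resolve_limit("5000")&lt;/code&gt; is capped at 100, while a missing parameter resolves to 25.&lt;/p&gt;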




&lt;h2&gt;
  
  
  Beyond Pagination: Other Manifestations of the Pattern
&lt;/h2&gt;

&lt;p&gt;Unbounded fetching appears in contexts beyond basic list endpoints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Existence checks:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Unbounded — fetches all matching rows to check if any exist
&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM users WHERE email = ?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="c1"&gt;# Bounded — stops at the first match
&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT 1 FROM users WHERE email = ? FETCH FIRST 1 ROW ONLY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;In-application aggregations:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Unbounded — transfers all rows to compute a sum in Python
&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT amount FROM orders WHERE customer_id = ?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Correct — delegates aggregation to the database
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT SUM(amount) AS total FROM orders WHERE customer_id = ?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cid&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Duplicate detection:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Unbounded
&lt;/span&gt;&lt;span class="n"&gt;all_records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;duplicates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_records&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;seen&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])]&lt;/span&gt;

&lt;span class="c1"&gt;# Correct — delegate to SQL
&lt;/span&gt;&lt;span class="n"&gt;duplicates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    SELECT id, COUNT(*) as cnt FROM events
    GROUP BY id HAVING COUNT(*) &amp;gt; 1
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In each case, the database is better equipped to perform the operation than the application layer. Delegating work to the database reduces data transfer, leverages indexes, and reduces memory consumption in the application tier.&lt;/p&gt;
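&lt;p&gt;The three delegated forms can be tried end to end against a small in-memory table. This sketch uses Python's &lt;code&gt;sqlite3&lt;/code&gt; module, so it uses &lt;code&gt;LIMIT 1&lt;/code&gt; where the Oracle-style examples above use &lt;code&gt;FETCH FIRST 1 ROW ONLY&lt;/code&gt;. The table contents and helper names are assumptions for the example.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 7, 10.0), (2, 7, 15.5), (3, 8, 4.0), (3, 8, 4.0)],  # id 3 is duplicated
)


def customer_has_orders(conn: sqlite3.Connection, customer_id: int) -> bool:
    # Existence check: at most one row is ever returned.
    row = conn.execute(
        "SELECT 1 FROM orders WHERE customer_id = ? LIMIT 1", (customer_id,)
    ).fetchone()
    return row is not None


def customer_total(conn: sqlite3.Connection, customer_id: int) -> float:
    # Aggregation: one row comes back regardless of how many orders exist.
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


def duplicate_ids(conn: sqlite3.Connection) -> list:
    # Duplicate detection: only ids that occur more than once are returned.
    return conn.execute(
        "SELECT id, COUNT(*) FROM orders GROUP BY id HAVING COUNT(*) > 1 ORDER BY id"
    ).fetchall()
```

&lt;p&gt;Each helper returns a result whose size is independent of the table size, which is the property the application-side versions lack.&lt;/p&gt;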




&lt;h2&gt;
  
  
  Quantifying the Impact
&lt;/h2&gt;

&lt;p&gt;To illustrate the scale of the problem, consider a table of 700,000 rows with 9 columns and an average row size of 200 bytes:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Unbounded Fetch&lt;/th&gt;
&lt;th&gt;Bounded Fetch (10 rows)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rows transferred&lt;/td&gt;
&lt;td&gt;700,000&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Approximate data volume&lt;/td&gt;
&lt;td&gt;~140 MB&lt;/td&gt;
&lt;td&gt;~2 KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Driver deserialization cost&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Negligible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory allocated (API server)&lt;/td&gt;
&lt;td&gt;~200–400 MB&lt;/td&gt;
&lt;td&gt;&amp;lt; 1 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Response time (single request)&lt;/td&gt;
&lt;td&gt;8–15 seconds&lt;/td&gt;
&lt;td&gt;&amp;lt; 50ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data reduction&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;99.999%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

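&lt;p&gt;These figures represent a single request. Under concurrent load, the cost multiplies with the number of in-flight requests: 100 simultaneous users hitting the same endpoint at ~140 MB each put roughly 14 GB on the wire, with a comparable or larger amount held in application memory, and CPU, memory, and network contention push response times up further.&lt;/p&gt;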
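&lt;p&gt;A back-of-the-envelope calculation makes the scaling concrete. The sketch below takes the lower memory estimate from the table (200 MB per request) and the concurrency levels mentioned above; it assumes each request holds its full result set at the same time, which is a worst-case simplification. The function names are hypothetical.&lt;/p&gt;

```python
def aggregate_memory_gb(per_request_mb: float, concurrent_requests: int) -> float:
    """Total memory held if every in-flight request keeps its full result set."""
    return per_request_mb * concurrent_requests / 1000


def data_reduction_percent(total_rows: int, rows_needed: int) -> float:
    """Share of rows that never needed to leave the database, as a percentage."""
    return (1 - rows_needed / total_rows) * 100


# 200 MB per unbounded request, at 50, 100 and 500 concurrent requests.
memory_by_concurrency = {n: aggregate_memory_gb(200, n) for n in (50, 100, 500)}

# 10 rows used out of 700,000 fetched.
reduction = round(data_reduction_percent(700_000, 10), 3)
```

&lt;p&gt;Even at the low end of the estimate, 50 concurrent unbounded requests would need about 10 GB of application memory for data that is mostly discarded.&lt;/p&gt;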




&lt;h2&gt;
  
  
  Code Review Checklist
&lt;/h2&gt;

&lt;p&gt;The following signals in code review indicate potential unbounded fetch issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;fetchall()&lt;/code&gt;, &lt;code&gt;.all()&lt;/code&gt;, or &lt;code&gt;findAll()&lt;/code&gt; without an accompanying &lt;code&gt;.limit()&lt;/code&gt; or SQL &lt;code&gt;LIMIT&lt;/code&gt;/&lt;code&gt;FETCH FIRST&lt;/code&gt; clause&lt;/li&gt;
&lt;li&gt;Application-side slicing on query results: &lt;code&gt;rows[:n]&lt;/code&gt;, &lt;code&gt;.subList(0, n)&lt;/code&gt;, &lt;code&gt;.slice(0, n)&lt;/code&gt;, &lt;code&gt;.take(n)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Aggregation, sorting, or filtering logic applied to a variable that holds a full query result&lt;/li&gt;
&lt;li&gt;API endpoints that accept a &lt;code&gt;limit&lt;/code&gt; parameter but do not enforce a server-side maximum&lt;/li&gt;
&lt;li&gt;ORM calls where the QuerySet or criteria object is evaluated (via &lt;code&gt;list()&lt;/code&gt;, &lt;code&gt;len()&lt;/code&gt;, or iteration) before pagination is applied&lt;/li&gt;
&lt;li&gt;Integration tests that run only against small datasets (fewer than 1,000 rows) and have never been load-tested against production-scale data volumes&lt;/li&gt;
&lt;/ul&gt;
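
&lt;p&gt;The first two signals in the checklist can be sketched with Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;orders&lt;/code&gt; table, its data, and both function names are hypothetical and exist only to show the pattern; the same contrast applies to any driver or ORM.&lt;/p&gt;

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the pattern.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(float(i),) for i in range(1, 1001)],
)


def latest_orders_unbounded(n):
    # Flagged in review: every row crosses the driver, then only n are kept.
    rows = conn.execute(
        "SELECT id, amount FROM orders ORDER BY id DESC"
    ).fetchall()
    return rows[:n]


def latest_orders_bounded(n):
    # Preferred: the database returns at most n rows.
    return conn.execute(
        "SELECT id, amount FROM orders ORDER BY id DESC LIMIT ?", (n,)
    ).fetchall()
```

&lt;p&gt;Both functions return the same rows, which is why the unbounded version passes functional tests. The difference is only visible in how much data the database hands to the application.&lt;/p&gt;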




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Unbounded data fetching is a structural inefficiency that operates silently across the database, network, and application layers. It produces correct output in low-data environments, making it resistant to detection through standard testing. At production scale, it degrades response times, increases memory consumption, saturates network bandwidth, and, under concurrent load, can destabilize the entire service.&lt;/p&gt;

&lt;p&gt;The remediation is consistent across all layers: push data constraints as close to the source as possible. Apply &lt;code&gt;LIMIT&lt;/code&gt;, &lt;code&gt;FETCH FIRST&lt;/code&gt;, or &lt;code&gt;TOP&lt;/code&gt; in SQL queries. Use &lt;code&gt;.limit()&lt;/code&gt; at the ORM layer rather than slicing collections in application code. Design API contracts that enforce bounded defaults and server-side maximums. Delegate aggregations, filters, and existence checks to the database rather than performing them on transferred data.&lt;/p&gt;
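
&lt;p&gt;A minimal sketch of two of these remediations follows: enforcing a server-side maximum on a client-supplied limit, and delegating an existence check and an aggregation to the database. The constants &lt;code&gt;DEFAULT_LIMIT&lt;/code&gt; and &lt;code&gt;MAX_LIMIT&lt;/code&gt;, the &lt;code&gt;orders&lt;/code&gt; schema, and all function names are assumptions chosen for illustration, not values from a specific API.&lt;/p&gt;

```python
import sqlite3

DEFAULT_LIMIT = 10   # assumed default page size
MAX_LIMIT = 100      # assumed server-side maximum


def clamp_limit(requested):
    """Map a client-supplied limit onto the range 1..MAX_LIMIT."""
    if requested is None:
        return DEFAULT_LIMIT
    return max(1, min(int(requested), MAX_LIMIT))


def customer_has_orders(conn, customer_id):
    # Existence check evaluated by the database; a single row comes back.
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM orders WHERE customer_id = ?)",
        (customer_id,),
    ).fetchone()
    return bool(row[0])


def customer_total(conn, customer_id):
    # Aggregation evaluated by the database instead of summing fetched rows.
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


# Hypothetical schema and rows for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 10.0), (1, 15.0), (2, 7.5)],
)
```

&lt;p&gt;With this shape, a request for &lt;code&gt;limit=5000&lt;/code&gt; is reduced to the assumed ceiling before it reaches the query, and the existence and total checks each transfer one row regardless of table size.&lt;/p&gt;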

&lt;p&gt;The database exists to evaluate, filter, and limit data efficiently. Every row transferred beyond what the application needs is a cost that scales with your data volume and your user concurrency — and pays nothing in return.&lt;/p&gt;




&lt;p&gt;For more detail on Oracle performance tuning, see my book: &lt;a href="https://a.co/d/0dqlb969" rel="noopener noreferrer"&gt;https://a.co/d/0dqlb969&lt;/a&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>performance</category>
      <category>database</category>
      <category>oracle</category>
    </item>
  </channel>
</rss>
