<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Saurav Jha</title>
    <description>The latest articles on DEV Community by Saurav Jha (@saurav_0302).</description>
    <link>https://dev.to/saurav_0302</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3286117%2F0c3e66b3-b4b8-48c8-8139-c2a07fc9f761.jpg</url>
      <title>DEV Community: Saurav Jha</title>
      <link>https://dev.to/saurav_0302</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/saurav_0302"/>
    <language>en</language>
    <item>
      <title>Python Generator</title>
      <dc:creator>Saurav Jha</dc:creator>
      <pubDate>Sat, 21 Feb 2026 02:49:39 +0000</pubDate>
      <link>https://dev.to/saurav_0302/python-generator-5340</link>
      <guid>https://dev.to/saurav_0302/python-generator-5340</guid>
      <description>&lt;p&gt;Generator functions are a special kind of function that return a lazy iterator. These are objects that you can loop over like a list. However, unlike lists, lazy iterators do not store their contents in memory.&lt;/p&gt;

&lt;p&gt;A generator expression (also called a generator comprehension) looks almost identical to a list comprehension - but instead of creating a full list in memory, it creates a generator object that produces values lazily (one at a time).&lt;/p&gt;

&lt;p&gt;List Comprehension vs Generator Expression&lt;br&gt;
List comprehension (creates full list in memory)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;squares&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[0, 1, 4, 9, 16]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All values are computed immediately&lt;br&gt;
Stored in memory&lt;/p&gt;

&lt;p&gt;Generator expression (lazy evaluation)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;squares&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt; a generator object rather than the values, for example &lt;code&gt;&amp;lt;generator object &amp;lt;genexpr&amp;gt; at 0x...&amp;gt;&lt;/code&gt; (the address varies).&lt;br&gt;
Nothing is computed yet&lt;br&gt;
Values are generated only when needed&lt;/p&gt;

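&lt;p&gt;&lt;strong&gt;How to use a generator expression?&lt;/strong&gt;&lt;br&gt;
You must iterate over it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;squares&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Or, instead of looping, convert it to a list with &lt;code&gt;print(list(squares))&lt;/code&gt;. Pick one or the other: a generator can be consumed only once, so after the loop above &lt;code&gt;list(squares)&lt;/code&gt; would be empty.&lt;/p&gt;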

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;List Comprehension&lt;/th&gt;
&lt;th&gt;Generator Expression&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Syntax&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[x for x in ...]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;(x for x in ...)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Execution&lt;/td&gt;
&lt;td&gt;Immediate&lt;/td&gt;
&lt;td&gt;Lazy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Type&lt;/td&gt;
&lt;td&gt;list&lt;/td&gt;
&lt;td&gt;generator&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Memory Example (Important)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;

&lt;span class="n"&gt;lst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="n"&gt;gen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getsizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# large
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getsizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# small
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



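&lt;p&gt;The generator uses far less memory. On 64-bit CPython the list object is roughly 8 MB (and &lt;code&gt;sys.getsizeof&lt;/code&gt; counts only the list's array of references, not the integers it points to), while the generator object is on the order of 100 to 200 bytes, whatever the size of the range.&lt;/p&gt;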

&lt;p&gt;&lt;strong&gt;Why use a generator expression instead of a generator function?&lt;/strong&gt;&lt;br&gt;
Normally, to create a generator, you'd define a generator function and call it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;my_generator&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But with a generator expression:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;gen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No function definition needed.&lt;/p&gt;

&lt;p&gt;Very Common Use Case&lt;br&gt;
Passing directly into functions like sum():&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



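&lt;p&gt;Notice:&lt;br&gt;
No extra brackets: when a generator expression is the only argument, the call's own parentheses are enough&lt;br&gt;
Memory efficient: no intermediate list is built&lt;br&gt;
Clean syntax&lt;/p&gt;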
&lt;h2&gt;
  
  
  Python Yield Statement
&lt;/h2&gt;

&lt;p&gt;On the whole, yield is a fairly simple statement. Its primary job is to control the flow of a generator function in a way that’s similar to return statements.&lt;br&gt;
When you call a generator function or use a generator expression, you get back a special iterator called a generator. You can assign this generator to a variable in order to use it. When you pass the generator to the built-in next() (or call one of its methods, such as send()), the code within the function runs up to the next yield.&lt;/p&gt;

&lt;p&gt;When the Python yield statement is hit, the program suspends function execution and returns the yielded value to the caller. (In contrast, return stops function execution completely.) When a function is suspended, the state of that function is saved. This includes any variable bindings local to the generator, the instruction pointer, the internal stack, and any exception handling.&lt;/p&gt;

&lt;p&gt;yield can be used in many ways to control your generator’s execution flow. The use of multiple Python yield statements can be leveraged as far as your creativity allows.&lt;/p&gt;
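&lt;p&gt;As a small illustration of several yield statements in one function, here is a minimal sketch. The name &lt;code&gt;countdown_with_markers&lt;/code&gt; is made up for this example; each yield is a separate pause point, and the function resumes from whichever one it last stopped at.&lt;/p&gt;

```python
def countdown_with_markers(start):
    """Hypothetical example: three separate yield sites in one generator."""
    yield "start"                      # first pause point
    for n in range(start, 0, -1):
        yield n                        # second pause point, hit once per number
    yield "done"                       # final pause point


values = list(countdown_with_markers(3))
print(values)  # ['start', 3, 2, 1, 'done']
```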

&lt;p&gt;When to use generator expressions?&lt;br&gt;
✔ Large datasets&lt;br&gt;
✔ Streaming data&lt;br&gt;
✔ When you only iterate once&lt;br&gt;
✔ Memory-sensitive applications&lt;br&gt;
❌ When you need indexing or multiple passes&lt;/p&gt;

&lt;p&gt;How does lazy evaluation work internally?&lt;br&gt;
Lazy evaluation means values are computed only when they are actually needed, not in advance.&lt;br&gt;
In Python, this is implemented mainly through iterators and generators.&lt;/p&gt;

&lt;p&gt;Eager vs Lazy (mental model)&lt;br&gt;
&lt;strong&gt;Eager evaluation&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What happens internally:&lt;/p&gt;

&lt;p&gt;Loop runs immediately&lt;br&gt;
All values computed&lt;br&gt;
Stored in memory as a list&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lazy evaluation&lt;/strong&gt;&lt;/p&gt;

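&lt;p&gt;What happens when the same expression is written with parentheses, &lt;code&gt;data = (x * 2 for x in range(5))&lt;/code&gt;:&lt;/p&gt;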

&lt;ol&gt;
&lt;li&gt;Nothing is computed&lt;/li&gt;
&lt;li&gt;Only a generator object is created&lt;/li&gt;
&lt;li&gt;Values are computed one at a time, on demand&lt;/li&gt;
&lt;/ol&gt;
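&lt;p&gt;One way to see the difference is to record when the work actually happens. The sketch below uses a helper named &lt;code&gt;double&lt;/code&gt; and a &lt;code&gt;calls&lt;/code&gt; list, both invented for this illustration, to log every computation.&lt;/p&gt;

```python
calls = []  # every value that has actually been computed so far


def double(x):
    calls.append(x)  # side effect lets us observe when the work runs
    return x * 2


# Eager: the list comprehension runs double() for every item immediately.
eager = [double(x) for x in range(3)]
calls_after_list = list(calls)

# Lazy: creating the generator expression runs double() zero times.
calls.clear()
lazy = (double(x) for x in range(3))
calls_after_genexp = list(calls)

# Asking for one value computes exactly one value.
first = next(lazy)
calls_after_next = list(calls)

print(calls_after_list, calls_after_genexp, calls_after_next)  # [0, 1, 2] [] [0]
```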

&lt;p&gt;What a generator really is internally&lt;br&gt;
A generator is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A state machine&lt;/li&gt;
&lt;li&gt;With an instruction pointer&lt;/li&gt;
&lt;li&gt;And saved local variables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When Python sees yield, it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pauses execution&lt;/li&gt;
&lt;li&gt;Saves local state&lt;/li&gt;
&lt;li&gt;Returns a value&lt;/li&gt;
&lt;li&gt;Resumes later from the same spot&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Step-by-step execution&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;squares&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



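&lt;p&gt;Internal flow&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;squares&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;   &lt;span class="c1"&gt;# generator created (no execution)
&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;         &lt;span class="c1"&gt;# runs until first yield → returns 0
&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;         &lt;span class="c1"&gt;# resumes → returns 1
&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;         &lt;span class="c1"&gt;# resumes → returns 4
&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;         &lt;span class="c1"&gt;# StopIteration raised
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;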

&lt;p&gt;At each next():&lt;br&gt;
Python resumes from the last saved instruction&lt;br&gt;
Executes until next yield&lt;br&gt;
Saves state again&lt;/p&gt;

&lt;p&gt;Why are generators memory-efficient?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1_000_000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stores start, stop, step only&lt;br&gt;
No list of numbers&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1_000_000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stores:&lt;br&gt;
A reference to the range iterator, which tracks the current position&lt;br&gt;
Execution state (the paused frame and its local variables)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Memory usage is constant, not proportional to size.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lazy evaluation in built-ins&lt;/strong&gt;&lt;br&gt;
Many Python functions are lazy:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Lazy?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;range()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;map()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;filter()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;zip()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sum()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;❌ (returns a single value, but consumes its input one item at a time)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;strong&gt;map(lambda x: x*x, range(10))&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No computation until iterated.&lt;/p&gt;

&lt;p&gt;How does StopIteration end lazy evaluation?&lt;br&gt;
When a generator finishes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python raises StopIteration&lt;/li&gt;
&lt;li&gt;The for loop (or other consumer) catches it&lt;/li&gt;
&lt;li&gt;The loop stops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is how for loops work internally.&lt;/p&gt;
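&lt;p&gt;To make that concrete, here is a rough hand-written equivalent of &lt;code&gt;for value in squares(): collected.append(value)&lt;/code&gt;. It is a sketch of the protocol, not what the interpreter literally executes; &lt;code&gt;collected&lt;/code&gt; is just a name chosen for the example.&lt;/p&gt;

```python
def squares():
    for i in range(3):
        yield i * i


collected = []
iterator = iter(squares())      # a for loop first asks the iterable for an iterator
while True:
    try:
        value = next(iterator)  # resume the generator until its next yield
    except StopIteration:       # raised once the generator function returns
        break                   # the for loop ends quietly here
    collected.append(value)

print(collected)  # [0, 1, 4]
```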

&lt;p&gt;&lt;strong&gt;Why is lazy evaluation single-pass?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once consumed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;gen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# [0, 1, 2]
&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# []
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why?&lt;br&gt;
State machine has reached the end&lt;br&gt;
No reset unless recreated&lt;/p&gt;

&lt;p&gt;“Lazy evaluation in Python works by using iterators and generators, which compute values only when requested. Internally, a generator is a state machine that pauses execution at each yield, saves its local state, and resumes later. This allows Python to process large or infinite data streams efficiently with constant memory usage.”&lt;/p&gt;

&lt;p&gt;Python generators support advanced control methods that let you send data into, raise exceptions inside, and terminate a generator from the outside.&lt;/p&gt;

&lt;p&gt;These are:&lt;/p&gt;

&lt;p&gt;.send(value)&lt;br&gt;
.throw(exception)&lt;br&gt;
.close()&lt;/p&gt;

&lt;p&gt;.send(value) — send data into a generator&lt;/p&gt;

&lt;p&gt;Normally, generators only yield values out.&lt;br&gt;
.send() allows you to send a value back into the generator; that value becomes the result of the yield expression at which the generator is currently paused.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="n"&gt;gen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;       &lt;span class="c1"&gt;# start generator → yields 0
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;    &lt;span class="c1"&gt;# sends 10 into generator → yields 11
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;    &lt;span class="c1"&gt;# yields 21
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



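&lt;p&gt;Key rules&lt;br&gt;
The first call must be next(gen) or gen.send(None); sending any other value to a generator that has not started raises TypeError&lt;br&gt;
send(x) makes x the result of the yield expression where the generator is paused, then runs the generator to its next yield&lt;/p&gt;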
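&lt;p&gt;A slightly more practical sketch of the same rules is a running average that receives numbers through &lt;code&gt;send()&lt;/code&gt;. The function name &lt;code&gt;running_average&lt;/code&gt; and the sample values are illustrative only.&lt;/p&gt;

```python
def running_average():
    """Hypothetical coroutine-style generator: receives numbers, yields their mean."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # pauses here; send(x) makes x the result of this yield
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)                        # prime: run to the first yield
results = [avg.send(v) for v in (10, 20, 60)]
print(results)                   # [10.0, 15.0, 30.0]

# Sending a real value before priming is an error.
fresh = running_average()
try:
    fresh.send(5)
    unprimed_failed = False
except TypeError:
    unprimed_failed = True
print(unprimed_failed)           # True
```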

&lt;p&gt;.throw() — raise an exception inside the generator&lt;br&gt;
.throw() injects an exception at the point where the generator is paused.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;running&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ValueError handled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;gen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;                 &lt;span class="c1"&gt;# running
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;throw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;     &lt;span class="c1"&gt;# ValueError handled
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use cases&lt;/p&gt;

&lt;p&gt;Cancel work&lt;br&gt;
Signal error conditions&lt;br&gt;
Interrupt long-running generators&lt;/p&gt;

&lt;p&gt;If the generator does not catch the exception, it propagates outward to the caller of throw().&lt;/p&gt;
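&lt;p&gt;The uncaught case can be sketched as follows. The generator &lt;code&gt;ticker&lt;/code&gt; is a made-up example with no try/except, so the thrown exception comes straight back to the caller and the generator is finished afterwards.&lt;/p&gt;

```python
def ticker():
    """Hypothetical generator with no exception handling of its own."""
    while True:
        yield "tick"


gen = ticker()
next(gen)                         # advance to the first yield

try:
    gen.throw(KeyError("stop"))   # raised inside ticker() at the paused yield
    propagated = False
except KeyError:
    propagated = True             # nothing caught it inside, so it reached us

leftover = list(gen)              # the generator is now finished
print(propagated, leftover)       # True []
```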

&lt;p&gt;.close() — stop the generator gracefully&lt;br&gt;
.close() raises a GeneratorExit inside the generator.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;working&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cleaning up resources&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;gen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generator&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;gen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;br&gt;
working&lt;br&gt;
Cleaning up resources&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FastAPI dependency that yields a session.&lt;/strong&gt; It commits automatically if no exception occurs, or rolls back if an exception is raised.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections.abc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncGenerator&lt;/span&gt;

&lt;span class="c1"&gt;# Assumption: AsyncSession is SQLAlchemy's, and SessionLocal is an
&lt;/span&gt;&lt;span class="c1"&gt;# async session factory (e.g. async_sessionmaker) defined elsewhere.
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy.ext.asyncio&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncSession&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AsyncGenerator&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AsyncSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;SessionLocal&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;  &lt;span class="c1"&gt;# the request handler runs while paused here
&lt;/span&gt;            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rollback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Central rollback
&lt;/span&gt;            &lt;span class="k"&gt;raise&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



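&lt;p&gt;Lifecycle Summary&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;next()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resume generator&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;send(x)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resume + send value&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;throw(e)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Resume + raise exception&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;close()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Terminate generator&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;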

&lt;h2&gt;
  
  
  To Summarize
&lt;/h2&gt;

&lt;p&gt;A generator function in Python is a function that:&lt;br&gt;
Uses the yield keyword&lt;br&gt;
Produces values one at a time&lt;br&gt;
Remembers its state between executions&lt;/p&gt;

&lt;p&gt;Unlike a normal function, it does not return all results at once.&lt;/p&gt;

&lt;p&gt;When Python sees yield:&lt;br&gt;
It returns the value&lt;br&gt;
Pauses execution&lt;br&gt;
Saves local state&lt;br&gt;
Resumes from that point on next iteration&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>programming</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Dual write problem in distributed systems</title>
      <dc:creator>Saurav Jha</dc:creator>
      <pubDate>Mon, 29 Dec 2025 12:05:30 +0000</pubDate>
      <link>https://dev.to/saurav_0302/dual-write-problem-in-distributed-systems-51o7</link>
      <guid>https://dev.to/saurav_0302/dual-write-problem-in-distributed-systems-51o7</guid>
      <description>&lt;p&gt;The dual write problem is a classic issue that arises in distributed systems when a single logical operation needs to update two (or more) separate systems or data stores — for example, writing to a database and sending an event/message to a message broker (like Kafka).&lt;/p&gt;

&lt;p&gt;Because these systems are independent, ensuring atomicity (all-or-nothing behavior) across them is extremely difficult without a distributed transaction protocol. Let’s break it down clearly:&lt;/p&gt;

&lt;p&gt;Distributed transaction protocol:&lt;br&gt;
A distributed transaction protocol is a mechanism that ensures atomicity (all-or-nothing behavior) for a transaction that spans multiple independent systems, such as multiple databases, services, or message brokers.&lt;br&gt;
In other words, it makes several different systems behave as if they were performing a single, unified transaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consensus-based (Paxos/Raft)&lt;/strong&gt; protocols guarantee strong consistency per replicated state machine and are widely used in distributed databases and configuration stores.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TrueTime + Paxos&lt;/strong&gt; provides global ACID transactions and is used in Google Spanner.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Microservices use:&lt;/strong&gt;&lt;br&gt;
Saga Pattern for business-level distributed workflows&lt;br&gt;
Transactional Outbox Pattern for local atomicity&lt;br&gt;
Idempotency + retries&lt;br&gt;
These approaches trade strict ACID for eventual consistency + resilience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Scenario&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Suppose you have a user service that:&lt;br&gt;
Stores user data in PostgreSQL.&lt;br&gt;
Publishes a “UserCreated” event to Kafka.&lt;/p&gt;

&lt;p&gt;The naive (and common) approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BEGIN
INSERT INTO users (id, name) VALUES (...);
SEND "UserCreated" EVENT TO Kafka;
COMMIT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the database insert succeeds but the Kafka send fails (or vice versa), your systems become inconsistent — one reflects the change, the other doesn’t.&lt;/p&gt;


&lt;p&gt;&lt;strong&gt;The Core Problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the dual write problem — trying to atomically update two systems that don’t share a transaction coordinator.&lt;/p&gt;

&lt;p&gt;There’s no global transaction manager ensuring both operations succeed or fail together.&lt;/p&gt;

&lt;p&gt;Failures (network issues, process crashes, retries) can easily cause partial updates.&lt;/p&gt;

&lt;p&gt;Retrying can lead to duplicates or out-of-order events.&lt;/p&gt;
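&lt;p&gt;The failure mode can be reproduced with a toy simulation. Everything in it is a stand-in invented for illustration: two in-memory lists play the roles of the users table and the Kafka topic, and a &lt;code&gt;broker_up&lt;/code&gt; flag forces the second write to fail.&lt;/p&gt;

```python
class BrokerDown(Exception):
    """Stands in for any publish failure (timeout, crash, network error)."""


database = []    # stand-in for the users table
published = []   # stand-in for the Kafka topic


def publish(event, broker_up):
    if not broker_up:
        raise BrokerDown("broker unreachable")
    published.append(event)


def create_user_naive(user_id, broker_up=True):
    database.append(user_id)                                     # write 1 succeeds
    publish({"type": "UserCreated", "id": user_id}, broker_up)   # write 2 may fail


create_user_naive("u1")
try:
    create_user_naive("u2", broker_up=False)
except BrokerDown:
    pass  # the database write has already happened and is not undone

published_ids = {event["id"] for event in published}
missing_events = [user for user in database if user not in published_ids]
print(missing_events)  # ['u2'] -> stored, but downstream never hears about it
```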

&lt;p&gt;&lt;strong&gt;Consequences&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Failure Scenario&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DB write succeeds, message send fails&lt;/td&gt;
&lt;td&gt;State exists in DB but no event is emitted; downstream systems never learn of it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Message send succeeds, DB write fails&lt;/td&gt;
&lt;td&gt;Event is emitted for data that doesn’t exist; consumers act on invalid state.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Retry logic applied incorrectly&lt;/td&gt;
&lt;td&gt;Duplicate events or multiple DB inserts.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Common Solutions / Patterns&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Transactional Outbox Pattern&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Write the event into the same database as the business data, in the same transaction.&lt;/p&gt;

&lt;p&gt;A background process (or CDC tool like Debezium) later reads the “outbox” table and publishes to Kafka.&lt;/p&gt;

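&lt;p&gt;This guarantees that every committed change has a matching event. The event is delivered at least once, so the broker catches up with the DB eventually rather than instantly.&lt;/p&gt;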

&lt;p&gt;✅ Pros:&lt;/p&gt;

&lt;p&gt;Strong consistency between DB and message.&lt;/p&gt;

&lt;p&gt;Simple to implement if you control both DB and messaging.&lt;/p&gt;

&lt;p&gt;🚫 Cons:&lt;/p&gt;

&lt;p&gt;Adds complexity and operational overhead.&lt;/p&gt;

&lt;p&gt;Requires deduplication on consumer side.&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Change Data Capture (CDC)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Instead of manually writing to Kafka, rely on a CDC tool (e.g., Debezium, Oracle GoldenGate).&lt;/p&gt;

&lt;p&gt;It monitors database changes and automatically publishes events when rows change.&lt;/p&gt;

&lt;p&gt;✅ Pros:&lt;/p&gt;

&lt;p&gt;No dual write logic in app.&lt;/p&gt;

&lt;p&gt;Strong consistency if CDC is reliable.&lt;/p&gt;

&lt;p&gt;🚫 Cons:&lt;/p&gt;

&lt;p&gt;Possible event lag.&lt;/p&gt;

&lt;p&gt;Requires reliable CDC infra and schema stability.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Idempotent &amp;amp; Retry-Safe Design&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Make operations idempotent (safe to retry).&lt;/p&gt;

&lt;p&gt;Use unique request IDs and deduplication to avoid inconsistent states even if writes are repeated.&lt;/p&gt;

&lt;p&gt;✅ Pros:&lt;/p&gt;

&lt;p&gt;Works across many systems.&lt;/p&gt;

&lt;p&gt;🚫 Cons:&lt;/p&gt;

&lt;p&gt;Still requires careful design.&lt;/p&gt;

&lt;p&gt;Doesn’t solve ordering issues.&lt;/p&gt;
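&lt;p&gt;The deduplication idea can be sketched as follows. This is a minimal illustration, not a production design: the IdempotentConsumer class, the handler callable, and the in-memory set of processed IDs are all assumptions made for the example. A real consumer would likely keep the processed IDs in a durable store.&lt;/p&gt;

```python
class IdempotentConsumer:
    """Applies the handler once per event_id and ignores redeliveries."""

    def __init__(self, handler):
        self._handler = handler
        # Assumption: an in-memory set; a real consumer would need a durable store.
        self._processed_ids = set()

    def handle(self, event):
        event_id = event["event_id"]
        if event_id in self._processed_ids:
            return False  # duplicate delivery, skip the side effects
        self._handler(event)
        # Record the ID only after the handler succeeded, so failures can be retried.
        self._processed_ids.add(event_id)
        return True


applied = []
consumer = IdempotentConsumer(applied.append)
event = {"event_id": "evt-1", "type": "OrderCreated", "order_id": "o-1"}

first = consumer.handle(event)   # processed
second = consumer.handle(event)  # redelivery is ignored
```

&lt;p&gt;Because the ID is recorded only after the handler succeeds, a failed attempt can be retried without being mistaken for a duplicate.&lt;/p&gt;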

&lt;p&gt;&lt;strong&gt;The Transactional Outbox Solution&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Instead of writing directly to Kafka, you:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Write both the business data and the event in the same database transaction.&lt;/li&gt;
&lt;li&gt;Let a background Outbox Processor (or CDC tool) read the outbox table later and publish the events to Kafka.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A minimal schema with an orders table and an outbox table:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE orders (
    id UUID PRIMARY KEY,
    customer_id UUID NOT NULL,
    total NUMERIC NOT NULL,
    created_at TIMESTAMP DEFAULT now()
);

CREATE TABLE outbox (
    id UUID PRIMARY KEY,
    event_type TEXT NOT NULL,
    payload JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT now(),
    published BOOLEAN DEFAULT FALSE
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Application Transaction (Atomic Write)&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json
import uuid

from psycopg2 import connect

conn = connect(...)  # connection parameters omitted
cur = conn.cursor()

# UUIDs are passed as strings; psycopg2 does not adapt uuid.UUID
# objects unless psycopg2.extras.register_uuid() has been called.
order_id = str(uuid.uuid4())
customer_id = str(uuid.uuid4())  # placeholder: customer_id is a UUID column

try:
    # Both writes happen in one transaction
    cur.execute(
        """
        INSERT INTO orders (id, customer_id, total)
        VALUES (%s, %s, %s)
        """,
        (order_id, customer_id, 100.0),
    )

    event = {
        "event_id": str(uuid.uuid4()),
        "type": "OrderCreated",
        "order_id": order_id,
    }

    cur.execute(
        """
        INSERT INTO outbox (id, event_type, payload)
        VALUES (%s, %s, %s)
        """,
        (event["event_id"], event["type"], json.dumps(event)),
    )

    conn.commit()  # ✅ both are saved atomically
except Exception:
    conn.rollback()  # neither the order nor the event is saved
    raise
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point:&lt;br&gt;
The order exists in the DB.&lt;br&gt;
The event is stored, but not yet published to Kafka.&lt;br&gt;
No dual write risk, because both were done in one atomic transaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Outbox Processor (Async Publisher)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A small background worker or CDC tool continuously scans for new events. If Kafka or your app crashes mid-publish, the event remains in the outbox and will be retried safely.&lt;/p&gt;

&lt;p&gt;Goal: Prevent inconsistency when writing to a database and publishing events to a message broker (fixes the dual write problem).&lt;/p&gt;
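&lt;p&gt;A polling Outbox Processor could look roughly like the sketch below. It is an illustration under stated assumptions, not a definitive implementation: publish_pending and the publish callable are hypothetical names, sqlite3 stands in for PostgreSQL so the example runs without a server, and the fake broker replaces a real Kafka producer. A row is marked as published only after the broker accepts the event, so a failure leaves it in the outbox for the next run.&lt;/p&gt;

```python
import json
import sqlite3


def publish_pending(conn, publish, batch_size=100):
    """Publish unpublished outbox rows in creation order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE published = 0 ORDER BY created_at, id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for event_id, event_type, payload in rows:
        try:
            publish(event_type, json.loads(payload))
        except Exception:
            # Leave the row unpublished so the next run retries it,
            # and stop here to keep the events in order.
            break
        # Mark the row only after the broker accepted the event.
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
        conn.commit()
        sent += 1
    return sent


# Demo: an in-memory database and a fake broker that fails on the first call.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox (id TEXT PRIMARY KEY, event_type TEXT, payload TEXT, "
    "created_at INTEGER, published INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO outbox (id, event_type, payload, created_at) VALUES (?, ?, ?, ?)",
    ("e1", "OrderCreated", json.dumps({"order_id": "o1"}), 1),
)
conn.commit()

delivered = []
attempts = {"count": 0}


def flaky_publish(event_type, payload):
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise ConnectionError("broker unavailable")
    delivered.append((event_type, payload))


first_run = publish_pending(conn, flaky_publish)   # publish fails, row stays pending
second_run = publish_pending(conn, flaky_publish)  # retry succeeds
```

&lt;p&gt;If the worker crashes after publishing but before the UPDATE commits, the event is sent again on the next run. Delivery is therefore at least once, which is why consumers need deduplication.&lt;/p&gt;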

</description>
      <category>architecture</category>
      <category>microservices</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Design ecommerce platform like Amazon, Flipkart</title>
      <dc:creator>Saurav Jha</dc:creator>
      <pubDate>Fri, 05 Dec 2025 18:52:39 +0000</pubDate>
      <link>https://dev.to/saurav_0302/design-ecommerce-platform-like-amazon-flipkart-1261</link>
      <guid>https://dev.to/saurav_0302/design-ecommerce-platform-like-amazon-flipkart-1261</guid>
      <description>&lt;p&gt;Start with functional requirements with constraint&lt;br&gt;
&lt;strong&gt;User login/registration/manage profiles&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;user should be able to search the product based on product name&lt;br&gt;
user should be able to view the details of products&lt;br&gt;
user should be able to add product to cart&lt;br&gt;
user should be able to do the checkout and payment&lt;br&gt;
System should notify user once order is placed&lt;br&gt;
user should be able to track the order status&lt;br&gt;
manage purchase of item having limited stock&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Non-functional requirements&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Process 100K orders per day.&lt;br&gt;
The platform should stay highly available under increased workload.&lt;br&gt;
The platform should ensure consistency and prevent duplicate orders and payments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Make security a first-class requirement&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Prevent bot and DDoS attacks on the platform.&lt;br&gt;
Only registered, logged-in users can place orders.&lt;/p&gt;

&lt;p&gt;Calculate transactions per second (TPS) roughly.&lt;br&gt;
It’s a critical metric in system design, impacting decisions about server capacity, database design, network infrastructure, and more.&lt;br&gt;
A purchase flow typically triggers many more reads (catalog read, inventory check, pricing, user profile, coupon validation). These should be considered in TPS calculations but can be descoped during an interview discussion.&lt;/p&gt;

&lt;p&gt;Non-functional requirements&lt;br&gt;
10M monthly active users.&lt;br&gt;
The system should be highly available for searching and viewing products,&lt;br&gt;
and highly consistent for placing orders and making payments.&lt;br&gt;
Latency of ~200 ms.&lt;/p&gt;

&lt;p&gt;Calculating TPS&lt;br&gt;
10M daily active users&lt;br&gt;
1 order per user per day&lt;br&gt;
Transactions per purchase: each purchase involves 3 transactions (item selection, payment processing, order confirmation).&lt;br&gt;
10,000,000 users × 1 purchase/user/day × 3 transactions/purchase = 30 million transactions/day.&lt;/p&gt;

&lt;p&gt;Converting to TPS:&lt;br&gt;
There are 86,400 seconds in a day.&lt;br&gt;
TPS = total daily transactions / seconds in a day.&lt;br&gt;
TPS ≈ (30 × 10&lt;sup&gt;6&lt;/sup&gt;) / 10&lt;sup&gt;5&lt;/sup&gt; = 300 TPS&lt;/p&gt;

&lt;p&gt;I'm approximating 86,400 seconds/day as 100K for mental calculation; the exact figure is about 347 TPS.&lt;/p&gt;
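&lt;p&gt;The same estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the function and parameter names are made up for the example.&lt;/p&gt;

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def transactions_per_second(daily_users, purchases_per_user,
                            transactions_per_purchase,
                            seconds_per_day=SECONDS_PER_DAY):
    """Average TPS for a given daily load, assuming traffic is spread evenly."""
    daily_transactions = daily_users * purchases_per_user * transactions_per_purchase
    return daily_transactions / seconds_per_day


exact_tps = transactions_per_second(10_000_000, 1, 3)
rough_tps = transactions_per_second(10_000_000, 1, 3, seconds_per_day=100_000)

print(round(exact_tps))  # 347
print(round(rough_tps))  # 300
```

&lt;p&gt;Both numbers are daily averages. Peak traffic will be higher, so capacity planning usually adds headroom on top of them.&lt;/p&gt;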

&lt;p&gt;Design the architecture to satisfy these constraints&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4d5yquy2a3k9opqebdb8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4d5yquy2a3k9opqebdb8.png" alt=" " width="800" height="1135"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>interview</category>
      <category>security</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Alembic basics to advance</title>
      <dc:creator>Saurav Jha</dc:creator>
      <pubDate>Fri, 08 Aug 2025 10:33:28 +0000</pubDate>
      <link>https://dev.to/saurav_0302/alembic-basics-to-advance-3edm</link>
      <guid>https://dev.to/saurav_0302/alembic-basics-to-advance-3edm</guid>
      <description>&lt;p&gt;Alembic is a Python tool that integrates with the SQLAlchemy ORM to apply model changes to relational databases such as PostgreSQL, MySQL, and Oracle. It supports both online migrations (recommended for development and UAT environments) and offline migrations (recommended for production). However, Alembic does not support NoSQL databases like MongoDB or DynamoDB.&lt;/p&gt;




&lt;p&gt;For example, suppose you have a DB model that you would like to migrate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8awur4r2zd27uj9qhvwa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8awur4r2zd27uj9qhvwa.png" alt=" " width="800" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run the commands below in the terminal:&lt;br&gt;
&lt;strong&gt;pip install alembic&lt;br&gt;
alembic init migrations&lt;/strong&gt;&lt;br&gt;
The init command creates a migrations directory and generates an alembic.ini file.&lt;/p&gt;

&lt;p&gt;You will find the migrations folder and the alembic.ini file created in the project directory.&lt;br&gt;
Next, open the env.py file inside the migrations folder and add the DB_CONNECTION and target_metadata variables as shown in the code snippet below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuciy39sm7k5zzrm8lbpp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuciy39sm7k5zzrm8lbpp.png" alt=" " width="800" height="593"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Please note: In this example, I've used SQLModel, which is a wrapper around SQLAlchemy. If you're working with SQLAlchemy directly, make sure to use SQLAlchemy's own metadata.&lt;/p&gt;

&lt;p&gt;Once done, run the command below to generate the initial revision file.&lt;br&gt;
&lt;strong&gt;alembic revision --autogenerate -m "initial migration"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Open the revision file under migrations &amp;gt; versions and verify the autogenerated code. Resolve any errors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb08wzm66ylka2t93oek.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb08wzm66ylka2t93oek.png" alt=" " width="800" height="667"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run the command below and verify the generated tables in the database:&lt;br&gt;
&lt;strong&gt;alembic upgrade head&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa4nfr7l7xptaauqpzdrm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa4nfr7l7xptaauqpzdrm.png" alt=" " width="457" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In my case, the tables were created successfully.&lt;/p&gt;

&lt;p&gt;Suppose you want to remove a column from a table.&lt;br&gt;
To do this, go to the database model and delete the corresponding field.&lt;br&gt;
In the code snippet below, I’ve removed the gender field as an example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7lnbyf9brzcj3xsbzvvk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7lnbyf9brzcj3xsbzvvk.png" alt=" " width="800" height="487"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Execute the following command. It will generate a revision file, as illustrated below.&lt;br&gt;
&lt;strong&gt;alembic revision --autogenerate -m "remove gender"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfj22b3wyp65skxxnoo4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfj22b3wyp65skxxnoo4.png" alt=" " width="800" height="331"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run the command below to apply the changes.&lt;br&gt;
&lt;strong&gt;alembic upgrade head&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each revision file defines two functions:&lt;br&gt;
def upgrade() -&amp;gt; None: applies the change.&lt;br&gt;
def downgrade() -&amp;gt; None: reverts the change on rollback.&lt;br&gt;
Both functions are needed to upgrade and downgrade the database.&lt;/p&gt;




&lt;p&gt;Scenarios Where You Use upgrade()&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding new tables. Example: introducing a new projects table. upgrade() creates the table.&lt;/li&gt;
&lt;li&gt;Adding a new column. Example: adding a salary column to the employee table. upgrade() uses op.add_column.&lt;/li&gt;
&lt;li&gt;Changing a column type or constraints. Example: increasing the phone column size. upgrade() uses op.alter_column.&lt;/li&gt;
&lt;li&gt;Adding foreign keys or indexes. Example: linking salary.employee_id to employee.id. upgrade() uses op.create_foreign_key.&lt;/li&gt;
&lt;li&gt;Renaming tables or columns. Example: renaming job_title to position. upgrade() uses op.alter_column(new_column_name=...).&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Scenarios Where You Use downgrade()&lt;br&gt;
downgrade() is the exact reverse of upgrade(), used when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rolling back a failed deployment. You deployed a migration that caused an issue. downgrade() safely reverts to the previous state.&lt;/li&gt;
&lt;li&gt;Reverting experimental features. Example: you added a gender column but later decide to remove it. downgrade() drops the column.&lt;/li&gt;
&lt;li&gt;Undoing schema changes during development. Example: you added an index but want to test performance without it. downgrade() drops the index.&lt;/li&gt;
&lt;li&gt;Synchronizing the database with older code versions. You need to roll back to an earlier release of your application. downgrade() brings the DB schema back in sync with that release.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You add a gender column (upgrade).&lt;br&gt;
Then business decides not to store gender.&lt;br&gt;
You write another migration where:&lt;br&gt;
upgrade() → drops gender column.&lt;br&gt;
downgrade() → re-adds it (in case you need to revert).&lt;/p&gt;

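&lt;p&gt;Run the command below to re-add the column:&lt;br&gt;
&lt;strong&gt;alembic downgrade 678f33609bf3&lt;/strong&gt;&lt;br&gt;
Make sure you pass the down_revision value from the revision file, which is the revision you want to return to.&lt;/p&gt;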

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhzrc89jyjipbquody15.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flhzrc89jyjipbquody15.png" alt=" " width="800" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To revert one step:&lt;br&gt;
&lt;strong&gt;alembic downgrade -1&lt;/strong&gt;&lt;br&gt;
To revert to the base (empty schema):&lt;br&gt;
&lt;strong&gt;alembic downgrade base&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Best Practices for Downgrades&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Always reverse changes in the opposite order of upgrade().&lt;/li&gt;
&lt;li&gt;If your upgrade() creates something, downgrade() should drop it.&lt;/li&gt;
&lt;li&gt;If your upgrade() drops something, downgrade() should recreate it.&lt;/li&gt;
&lt;li&gt;Always include foreign key handling when applicable.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>python</category>
      <category>tooling</category>
      <category>database</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Quorum and Consensus in Distributed Systems:</title>
      <dc:creator>Saurav Jha</dc:creator>
      <pubDate>Sun, 13 Jul 2025 11:17:05 +0000</pubDate>
      <link>https://dev.to/saurav_0302/quorum-and-consensus-in-distributed-systems-3obe</link>
      <guid>https://dev.to/saurav_0302/quorum-and-consensus-in-distributed-systems-3obe</guid>
      <description>&lt;p&gt;In distributed systems, quorum and consensus are key to ensuring consistency, availability, and fault tolerance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quorum&lt;/strong&gt;: A quorum refers to the minimum number of nodes required to perform operations reliably. Typically, it is defined as the majority of nodes in a peer group. For a system with N nodes, a quorum is achieved when at least ⌊N/2⌋ + 1 nodes agree, where ⌊N/2⌋ is N/2 rounded down.&lt;/p&gt;

&lt;p&gt;For example, in a 5-node cluster, at least 3 nodes must participate to form a quorum. If this number is not met due to node failures or network issues, the system becomes unavailable, and no new operations, such as log commits, can proceed.&lt;/p&gt;
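&lt;p&gt;The majority rule is easy to check in code. The helper names below are made up for this sketch, which only illustrates the arithmetic and is not part of any consensus library.&lt;/p&gt;

```python
def quorum_size(node_count):
    """Smallest majority of a cluster: floor(N / 2) + 1."""
    return node_count // 2 + 1


def has_quorum(node_count, reachable_nodes):
    """True when enough nodes are reachable to form a majority."""
    return reachable_nodes >= quorum_size(node_count)


def tolerated_failures(node_count):
    """How many nodes can fail while a majority is still reachable."""
    return node_count - quorum_size(node_count)


print(quorum_size(5))         # 3
print(tolerated_failures(5))  # 2
print(has_quorum(5, 2))       # False
```

&lt;p&gt;A 4-node cluster also needs 3 nodes for a quorum and tolerates only 1 failure, which is one reason odd cluster sizes are common.&lt;/p&gt;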

&lt;p&gt;&lt;strong&gt;Consensus:&lt;/strong&gt; Consensus is the process through which distributed nodes agree on a single, consistent state of data. It serves as the backbone for synchronization and coordination in distributed systems, ensuring that all nodes maintain the same view of shared data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Common Use Cases of Quorum:&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Distributed Databases:&lt;/strong&gt; Systems like Cassandra and DynamoDB rely on quorum-based mechanisms for consistency.&lt;/p&gt;

&lt;p&gt;Cassandra uses the Paxos consensus algorithm for lightweight transactions (compare-and-set operations); regular reads and writes rely on tunable consistency levels such as QUORUM.&lt;/p&gt;

&lt;p&gt;DynamoDB implements quorum-based replication to balance availability and consistency.&lt;br&gt;
Learn more about Cassandra's Paxos protocol:&lt;br&gt;
&lt;a href="https://lnkd.in/gh4bnFfw" rel="noopener noreferrer"&gt;https://lnkd.in/gh4bnFfw&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Distributed File Systems:&lt;/strong&gt; Quorum is also crucial in file systems for ensuring consistent file operations across nodes.&lt;br&gt;
Google’s Chubby lock service, which supports the Google File System (GFS), uses the Paxos consensus algorithm to maintain high availability and consistency.&lt;br&gt;
The paper on the Google File System is worth reading:&lt;br&gt;
&lt;a href="https://lnkd.in/gdHwFN9r" rel="noopener noreferrer"&gt;https://lnkd.in/gdHwFN9r&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Popular Consensus Algorithms&lt;/strong&gt;&lt;br&gt;
Raft&lt;br&gt;
Paxos&lt;br&gt;
PBFT (Practical Byzantine Fault Tolerance)&lt;br&gt;
In-depth overview of the Raft algorithm&lt;br&gt;
&lt;a href="https://lnkd.in/gB4u43mi" rel="noopener noreferrer"&gt;https://lnkd.in/gB4u43mi&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
