<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Muhammad Sufiyan Baig</title>
    <description>The latest articles on DEV Community by Muhammad Sufiyan Baig (@muhammadsufiyanbaig).</description>
    <link>https://dev.to/muhammadsufiyanbaig</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1890831%2Ffe22af1a-e380-4c8a-b88f-426be0d0033d.png</url>
      <title>DEV Community: Muhammad Sufiyan Baig</title>
      <link>https://dev.to/muhammadsufiyanbaig</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/muhammadsufiyanbaig"/>
    <language>en</language>
    <item>
      <title>5 Python Tricks Every Backend Dev Should Know</title>
      <dc:creator>Muhammad Sufiyan Baig</dc:creator>
      <pubDate>Thu, 25 Jun 2026 17:30:44 +0000</pubDate>
      <link>https://dev.to/muhammadsufiyanbaig/5-python-tricks-every-backend-dev-should-know-2bki</link>
      <guid>https://dev.to/muhammadsufiyanbaig/5-python-tricks-every-backend-dev-should-know-2bki</guid>
      <description>&lt;p&gt;As backend developers, we're constantly seeking ways to optimize our workflow and improve the quality of our code. One of the most effective methods to achieve this is by leveraging lesser-known &lt;strong&gt;Python&lt;/strong&gt; features that can significantly enhance our development routine. Despite being experienced developers, many of us may not be aware of the various &lt;strong&gt;Python&lt;/strong&gt; tricks that can boost our productivity and streamline our code. In this article, we'll delve into five key &lt;strong&gt;Python&lt;/strong&gt; features that every backend developer should know, exploring how they can be incorporated into daily development to improve efficiency and code quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  Efficient String Formatting with F-Strings
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;F-strings&lt;/strong&gt; were introduced in &lt;strong&gt;Python 3.6&lt;/strong&gt; as a more efficient and readable way of formatting strings. They provide a concise and expressive syntax for embedding expressions inside string literals, using the &lt;code&gt;f&lt;/code&gt; prefix before the string. However, many developers are not utilizing the full potential of &lt;strong&gt;f-strings&lt;/strong&gt; by combining them with &lt;strong&gt;format specs&lt;/strong&gt;. This allows for more precise control over the formatting of values, making the code more readable and maintainable. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; years old&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Basic f-string usage
&lt;/span&gt;&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;02&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; years old&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Using format spec for padding
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the above example, the &lt;code&gt;:02d&lt;/code&gt; &lt;strong&gt;format spec&lt;/strong&gt; zero-pads &lt;code&gt;age&lt;/code&gt; to a minimum width of two digits, so a value below 10 gains a leading zero (&lt;code&gt;7&lt;/code&gt; becomes &lt;code&gt;07&lt;/code&gt;). With &lt;code&gt;age = 30&lt;/code&gt; the output is unchanged, because the value already has two digits. &lt;strong&gt;Format specs&lt;/strong&gt; also control alignment, thousands separators, and decimal precision.&lt;/p&gt;
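&lt;p&gt;To make the range of format specs more concrete, here is a small sketch. The variable names and values are made up for illustration; the specs themselves are standard Python.&lt;/p&gt;

```python
# Illustrative values; the variable names are hypothetical.
requests_served = 1234567
error_rate = 0.04321
endpoint = "/users"

lines = [
    f"{requests_served:,}",   # thousands separator
    f"{error_rate:.2%}",      # percentage with two decimals
    f"{endpoint:>10}",        # right-align in a 10-character field
    f"{7:02d}",               # zero-pad to two digits
    f"{endpoint=}",           # Python 3.8+: prints the name and repr, handy in logs
]
for line in lines:
    print(line)
```

&lt;p&gt;The last form, &lt;code&gt;{endpoint=}&lt;/code&gt;, needs Python 3.8 or later and is mostly useful for quick debugging output.&lt;/p&gt;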

&lt;h2&gt;
  
  
  Simplified Assignments and Exception Handling
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;walrus operator&lt;/strong&gt; (&lt;code&gt;:=&lt;/code&gt;) was introduced in &lt;strong&gt;Python 3.8&lt;/strong&gt; as a way to simplify assignments within conditional statements. This operator allows you to assign a value to a variable as part of a conditional statement, making the code more concise and readable. Another useful feature for exception handling is &lt;strong&gt;contextlib.suppress&lt;/strong&gt;, which provides a context manager that suppresses specific exceptions within a block of code. This can be particularly useful when working with external libraries or APIs that raise exceptions that you want to ignore. Here's an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;contextlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;suppress&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;suppress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;example.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content length: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, the &lt;strong&gt;walrus operator&lt;/strong&gt; is used to assign the length of the &lt;code&gt;content&lt;/code&gt; string to the variable &lt;code&gt;n&lt;/code&gt; within the conditional statement. The &lt;strong&gt;contextlib.suppress&lt;/strong&gt; context manager is used to suppress the &lt;strong&gt;FileNotFoundError&lt;/strong&gt; exception that might be raised when trying to open the file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Optimizing Data Storage with Dataclass Slots
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Dataclasses&lt;/strong&gt; were introduced in &lt;strong&gt;Python 3.7&lt;/strong&gt; as a way to simplify the creation of classes that mainly contain data. One of their lesser-known features is the &lt;code&gt;slots=True&lt;/code&gt; option, added in &lt;strong&gt;Python 3.10&lt;/strong&gt;, which generates &lt;code&gt;__slots__&lt;/code&gt; for the class and reduces memory usage by preventing the creation of a &lt;code&gt;__dict__&lt;/code&gt; attribute for each instance. This can be particularly useful when creating very large numbers of instances or in performance-critical code. Here's an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dataclass&lt;/span&gt;

&lt;span class="nd"&gt;@dataclass&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;slots&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;

&lt;span class="n"&gt;person&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Person&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Accessing the name attribute
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



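&lt;p&gt;In this example, the &lt;strong&gt;dataclass&lt;/strong&gt; decorator is used with the &lt;code&gt;slots=True&lt;/code&gt; argument to prevent the creation of a &lt;code&gt;__dict__&lt;/code&gt; attribute for each instance of the &lt;code&gt;Person&lt;/code&gt; class. This lowers per-instance memory and can make attribute access slightly faster. The trade-off is that instances accept only the declared fields: assigning an attribute that is not a field raises &lt;code&gt;AttributeError&lt;/code&gt;.&lt;/p&gt;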

&lt;h2&gt;
  
  
  Improving Performance with functools.cache
&lt;/h2&gt;

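&lt;p&gt;&lt;strong&gt;functools.cache&lt;/strong&gt; was introduced in &lt;strong&gt;Python 3.9&lt;/strong&gt; as a way to cache the results of function calls, improving performance by avoiding redundant computations. It is an unbounded cache keyed on the call arguments, so the arguments must be hashable and the cache grows for as long as new argument values keep arriving; &lt;code&gt;functools.lru_cache(maxsize=...)&lt;/code&gt; is the bounded alternative. Caching is particularly useful for expensive pure functions and recursive algorithms. Here's an example:&lt;br&gt;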
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;

&lt;span class="nd"&gt;@cache&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fibonacci&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;fibonacci&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;fibonacci&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;fibonacci&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# Calculating the 10th Fibonacci number
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



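&lt;p&gt;In this example, the &lt;strong&gt;functools.cache&lt;/strong&gt; decorator stores the result of each &lt;code&gt;fibonacci(n)&lt;/code&gt; call, so every value is computed only once. That turns the naive recursion, whose number of calls grows exponentially with &lt;code&gt;n&lt;/code&gt;, into one that does a linear amount of work.&lt;/p&gt;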
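&lt;p&gt;The effect is easy to observe by counting how often the function body actually runs. The &lt;code&gt;call_count&lt;/code&gt; counter below is added purely for illustration; &lt;code&gt;cache_info()&lt;/code&gt; and &lt;code&gt;cache_clear()&lt;/code&gt; are part of the standard decorator.&lt;/p&gt;

```python
from functools import cache

call_count = 0  # illustration only: counts real executions of the function body

@cache
def fibonacci(n):
    global call_count
    call_count += 1
    if n in (0, 1):
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

result = fibonacci(30)
print(result)                  # 832040
print(call_count)              # 31: one execution per distinct n from 0 to 30
info = fibonacci.cache_info()  # named tuple with hits, misses, maxsize, currsize
print(info)

fibonacci.cache_clear()        # drop cached results, e.g. between test cases
```

&lt;p&gt;Without the decorator, the same call would execute the body more than a million times.&lt;/p&gt;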

&lt;h2&gt;
  
  
  Conclusion and Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Incorporating these five &lt;strong&gt;Python&lt;/strong&gt; features into your daily development routine can significantly improve your workflow and code quality. By utilizing &lt;strong&gt;f-strings&lt;/strong&gt; with &lt;strong&gt;format specs&lt;/strong&gt;, leveraging the &lt;strong&gt;walrus operator&lt;/strong&gt;, implementing &lt;strong&gt;contextlib.suppress&lt;/strong&gt;, optimizing data storage with &lt;strong&gt;dataclass slots&lt;/strong&gt;, and improving performance with &lt;strong&gt;functools.cache&lt;/strong&gt;, you can write more efficient, readable, and maintainable code. Remember to always keep exploring and learning about new &lt;strong&gt;Python&lt;/strong&gt; features and tricks to stay up-to-date with the latest developments in the language. By doing so, you'll be able to take your backend development skills to the next level and deliver high-quality solutions with ease.&lt;/p&gt;

</description>
      <category>python</category>
      <category>backend</category>
      <category>tips</category>
    </item>
    <item>
      <title>REST vs GraphQL: When to Use Which</title>
      <dc:creator>Muhammad Sufiyan Baig</dc:creator>
      <pubDate>Mon, 22 Jun 2026 09:13:17 +0000</pubDate>
      <link>https://dev.to/muhammadsufiyanbaig/rest-vs-graphql-when-to-use-which-15n2</link>
      <guid>https://dev.to/muhammadsufiyanbaig/rest-vs-graphql-when-to-use-which-15n2</guid>
      <description>&lt;p&gt;When designing a web application, one of the most critical decisions developers face is choosing the right API architecture. With the rise of &lt;strong&gt;REST (Representational State of Resource)&lt;/strong&gt; and &lt;strong&gt;GraphQL&lt;/strong&gt;, two popular approaches have emerged, each with its strengths and weaknesses. However, the choice between these two architectures is not a one-size-fits-all solution, and understanding the trade-offs between them is essential for making an informed decision in production environments. In this article, we will delve into the world of REST and GraphQL, exploring their characteristics, use cases, and limitations, to help developers make informed decisions about which approach to use in their applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction to REST and GraphQL
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;REST&lt;/strong&gt;, introduced by Roy Fielding in 2000, is an architectural style for designing networked applications. It is based on the idea of resources, which are identified by URIs and manipulated using a fixed set of operations. REST is a simple, widely adopted, and well-established approach that has been used in countless web applications. &lt;strong&gt;GraphQL&lt;/strong&gt;, developed at Facebook and publicly released in 2015, is a query language for APIs that allows clients to specify exactly what data they need, reducing the amount of data transferred over the network. GraphQL provides a more flexible way of fetching data, especially in complex and nested data scenarios.&lt;/p&gt;

&lt;h2&gt;
  
  
  REST: Simplicity and Caching
&lt;/h2&gt;

&lt;p&gt;REST's simplicity and caching capabilities make it an attractive choice for simple, read-heavy applications. With REST, each resource is identified by a unique URI, and clients use standard HTTP methods (GET, POST, PUT, PATCH, DELETE) to interact with these resources. This makes it easy to implement and understand, even for developers without extensive experience. Because REST builds on HTTP, &lt;code&gt;GET&lt;/code&gt; responses can be cached by browsers, CDNs, and proxies using standard headers such as &lt;code&gt;Cache-Control&lt;/code&gt; and &lt;code&gt;ETag&lt;/code&gt;, which can significantly improve performance in read-heavy applications. For example, a simple REST request for retrieving a list of users might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;GET&lt;/span&gt; &lt;span class="nn"&gt;/users&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;example.com&lt;/span&gt;
&lt;span class="na"&gt;Accept&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This simplicity and caching capability make REST a great choice for applications with simple data needs, such as a blog or a news website.&lt;/p&gt;
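&lt;p&gt;As a rough sketch of how HTTP caching plays out for the &lt;code&gt;GET /users&lt;/code&gt; request above, the function below models a handler that validates an &lt;code&gt;ETag&lt;/code&gt;. It is framework-free and purely illustrative: the &lt;code&gt;get_users&lt;/code&gt; function, the sample data, and the 60-second lifetime are all assumptions, not part of any particular server.&lt;/p&gt;

```python
import hashlib
import json

USERS = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Linus"}]  # sample data

def get_users(request_headers):
    """Model GET /users: return (status, headers, body) with ETag validation."""
    body = json.dumps(USERS, sort_keys=True)
    # Derive a validator from the response body; any stable fingerprint would do.
    etag = '"' + hashlib.sha256(body.encode()).hexdigest()[:16] + '"'
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, ""  # client copy is still valid, so no body is sent
    return 200, headers, body

# First request: full response with a validator.
status, headers, body = get_users({})
print(status, len(body))

# Revalidation: the client echoes the ETag and receives an empty 304.
status2, _, body2 = get_users({"If-None-Match": headers["ETag"]})
print(status2, len(body2))
```

&lt;p&gt;Within the &lt;code&gt;max-age&lt;/code&gt; window a cache can answer without contacting the server at all; after it expires, the conditional request costs a round trip but no payload.&lt;/p&gt;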

&lt;h2&gt;
  
  
  GraphQL: Flexibility and Complex Data Handling
&lt;/h2&gt;

&lt;p&gt;GraphQL's flexibility and ability to handle complex, nested data queries make it an ideal choice for applications with intricate data needs. With GraphQL, clients can specify exactly what data they need, reducing the amount of data transferred over the network. This is particularly useful in applications with complex, nested data structures, such as social media platforms or e-commerce websites. For example, a GraphQL query for retrieving a user's profile information, including their friends and posts, might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight graphql"&gt;&lt;code&gt;&lt;span class="k"&gt;query&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="n"&gt;friends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



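&lt;p&gt;This single request returns the user's name plus each friend's name and posts, and nothing else. An equivalent REST design would typically need several round trips (one for the user, one for the friends list, and one per friend for posts) or a purpose-built endpoint.&lt;/p&gt;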
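&lt;p&gt;The core idea of field selection can be mimicked in a few lines of plain Python. This is a toy model, not a GraphQL implementation: the &lt;code&gt;select&lt;/code&gt; function, the nested-dict selection format, and the sample record are all invented for illustration.&lt;/p&gt;

```python
def select(record, selection):
    """Return only the fields named in selection (toy stand-in for resolvers)."""
    result = {}
    for field, sub_selection in selection.items():
        value = record[field]
        if sub_selection is None:        # leaf field: copy the value
            result[field] = value
        elif isinstance(value, list):    # list of nested objects
            result[field] = [select(item, sub_selection) for item in value]
        else:                            # single nested object
            result[field] = select(value, sub_selection)
    return result

# Hypothetical stored record with more fields than the client asked for.
user = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "friends": [
        {"id": 2, "name": "Linus", "email": "linus@example.com",
         "posts": [{"title": "Hello", "content": "First post", "views": 10}]},
    ],
}

# Mirrors the shape of the query above: name, friends { name, posts { title, content } }
selection = {
    "name": None,
    "friends": {"name": None, "posts": {"title": None, "content": None}},
}

response = select(user, selection)
print(response)
```

&lt;p&gt;Fields such as &lt;code&gt;email&lt;/code&gt; and &lt;code&gt;views&lt;/code&gt; never leave the server because the client did not ask for them.&lt;/p&gt;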

&lt;h2&gt;
  
  
  Choosing Between REST and GraphQL
&lt;/h2&gt;

&lt;p&gt;The choice between REST and GraphQL depends on the specific requirements of the application and the trade-offs between simplicity, performance, and flexibility. If the application has simple data needs and is read-heavy, REST might be a better choice due to its simplicity and caching capabilities. On the other hand, if the application has complex, nested data needs, GraphQL might be a better choice due to its flexibility and ability to handle complex data queries. Here are some key factors to consider when choosing between REST and GraphQL:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data complexity&lt;/strong&gt;: If the application has simple data needs, REST might be a better choice. If the application has complex, nested data needs, GraphQL might be a better choice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: If the application is read-heavy, REST's caching capabilities might provide better performance. If the application has complex data queries, GraphQL's ability to reduce data transfer might provide better performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Development complexity&lt;/strong&gt;: REST is usually quicker to implement, since it maps directly onto HTTP and needs little tooling. GraphQL requires a schema, resolvers, and extra care around caching and query cost, so the additional effort pays off mainly when clients have varied or deeply nested data needs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In conclusion, the choice between REST and GraphQL depends on the specific needs of the application. REST fits simple, read-heavy workloads where HTTP caching does most of the work; GraphQL fits clients with complex, nested, or varied data needs. Whichever you choose, the key is to understand the use cases and limitations of each approach and to select the one that best fits your application's requirements, because a well-designed API is the first step towards a robust, scalable, and maintainable system.&lt;/p&gt;

</description>
      <category>api</category>
      <category>rest</category>
      <category>graphql</category>
    </item>
    <item>
      <title>Stop Hand-Tuning ETL Batch Sizes. Use PID Control Instead.</title>
      <dc:creator>Muhammad Sufiyan Baig</dc:creator>
      <pubDate>Tue, 31 Mar 2026 11:08:49 +0000</pubDate>
      <link>https://dev.to/muhammadsufiyanbaig/stop-hand-tuning-etl-batch-sizes-use-pid-control-instead-103d</link>
      <guid>https://dev.to/muhammadsufiyanbaig/stop-hand-tuning-etl-batch-sizes-use-pid-control-instead-103d</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzvmikuv9g68vgzrd0d1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzvmikuv9g68vgzrd0d1.png" alt=" "&gt;&lt;/a&gt;&lt;br&gt;
You've done this before. You need to batch-process a large dataset. You pick a chunk size — maybe &lt;code&gt;1000&lt;/code&gt;, maybe &lt;code&gt;10000&lt;/code&gt; — run a quick test, it looks fine, and you ship it.&lt;/p&gt;

&lt;p&gt;Three weeks later, your pipeline is crawling at 15% CPU while you're paying for 8 cores. Or it's randomly OOM-crashing on Tuesday nights when the dataset is slightly wider than usual.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;static batch size problem&lt;/strong&gt;, and it's more expensive than most teams realize.&lt;/p&gt;


&lt;h2&gt;
  
  
  What's actually happening
&lt;/h2&gt;

&lt;p&gt;When you hard-code a batch size, you're making a bet:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This number will be optimal on every run, on every machine, under every memory condition, forever."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's never true. The optimal chunk size is a function of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Current available memory&lt;/li&gt;
&lt;li&gt;How heavy the transformation is for &lt;em&gt;this&lt;/em&gt; batch&lt;/li&gt;
&lt;li&gt;How many other jobs are competing for resources&lt;/li&gt;
&lt;li&gt;Row width variation in the dataset&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No static number wins across all these dimensions. You need &lt;strong&gt;continuous adaptation&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  Enter PID control
&lt;/h2&gt;

&lt;p&gt;PID (Proportional-Integral-Derivative) control is a feedback algorithm used throughout control engineering: thermostats, drone stabilizers, industrial robots, cruise control.&lt;/p&gt;

&lt;p&gt;The idea: measure the current output, compare it to the target, and adjust the control variable to close the gap. Crucially, the adjustment accounts for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;P (Proportional)&lt;/strong&gt;: how far off you are &lt;em&gt;right now&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I (Integral)&lt;/strong&gt;: how long you've been off (accumulated error)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;D (Derivative)&lt;/strong&gt;: whether you're getting closer or further away&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Applied to ETL chunking:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Control Theory&lt;/th&gt;
&lt;th&gt;ETL Pipeline&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Control variable&lt;/td&gt;
&lt;td&gt;chunk size&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Measured variable&lt;/td&gt;
&lt;td&gt;processing latency per chunk&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setpoint&lt;/td&gt;
&lt;td&gt;target latency (e.g., 500ms)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PID output&lt;/td&gt;
&lt;td&gt;adjustment to chunk size&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If chunks are processing in 200ms and your target is 500ms → the system can handle larger chunks → PID increases chunk size. If chunks are taking 900ms → too slow → PID decreases chunk size.&lt;/p&gt;


&lt;h2&gt;
  
  
  StreamChunk: PID control for Python data pipelines
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# https://pypi.org/project/streamchunk/&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;streamchunk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Basic usage
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StreamChunker&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FileSource&lt;/span&gt;

&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FileSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;events.csv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;chunker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StreamChunker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;target_latency_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_memory_pct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;min_chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_chunk_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;chunker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;report_latency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunk_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elapsed_ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;That's the full loop. No tuning. &lt;code&gt;report_latency()&lt;/code&gt; feeds the actual processing time back into the PID controller, which computes the next chunk size. Convergence happens in &lt;strong&gt;5–10 iterations&lt;/strong&gt; — typically the first few seconds of a run.&lt;/p&gt;
&lt;h3&gt;
  
  
  The math (briefly)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;error(t)  = target_latency_ms - actual_latency_ms
integral += error(t)
derivative = error(t) - error(t-1)

adjustment = kp*error(t) + ki*integral + kd*derivative
new_size   = clamp(current_size + adjustment, min_size, max_size)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Default gains: &lt;code&gt;kp=0.3&lt;/code&gt;, &lt;code&gt;ki=0.05&lt;/code&gt;, &lt;code&gt;kd=0.1&lt;/code&gt;. These are conservative by design — fast convergence without overshoot in bursty workloads.&lt;/p&gt;
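&lt;p&gt;To see the formulas in motion, here is a minimal standalone sketch of the same update rule. It is not StreamChunk's actual implementation: the &lt;code&gt;PIDChunkSizer&lt;/code&gt; class, its constructor arguments, and the starting size of 1000 are assumptions made for illustration, using the gains and limits quoted above.&lt;/p&gt;

```python
class PIDChunkSizer:
    """Toy PID controller that maps observed latency to the next chunk size."""

    def __init__(self, target_latency_ms, initial_size=1000,
                 min_size=100, max_size=50_000, kp=0.3, ki=0.05, kd=0.1):
        self.target_latency_ms = target_latency_ms
        self.size = initial_size
        self.min_size = min_size
        self.max_size = max_size
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, actual_latency_ms):
        """Feed back one latency measurement and return the new chunk size."""
        error = self.target_latency_ms - actual_latency_ms
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        adjustment = (self.kp * error
                      + self.ki * self.integral
                      + self.kd * derivative)
        # Clamp to the configured bounds, then round to a whole number of rows.
        clamped = min(self.max_size, max(self.min_size, self.size + adjustment))
        self.size = round(clamped)
        return self.size


sizer = PIDChunkSizer(target_latency_ms=500)
first = sizer.update(200)   # faster than target: error = +300, so size grows
second = sizer.update(900)  # slower than target: error = -400, so size shrinks
print(first, second)        # 1135 940
```

&lt;p&gt;Working the first step by hand: the error is 300, the integral is 300, and the derivative is 300, so the adjustment is 0.3*300 + 0.05*300 + 0.1*300 = 135 rows.&lt;/p&gt;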
&lt;h3&gt;
  
  
  Memory ceiling: the OOM killer
&lt;/h3&gt;

&lt;p&gt;The PID loop is smooth and gradual. But what if memory spikes suddenly? StreamChunk adds a hard ceiling that overrides PID output entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;mem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psutil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;virtual_memory&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;percent&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mem&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;max_memory_pct&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;min_chunk_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;current_size&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# immediate halving
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



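&lt;p&gt;This fires on &lt;em&gt;every call&lt;/em&gt; until memory drops back below the threshold. Set &lt;code&gt;max_memory_pct=80&lt;/code&gt; and OOM crashes caused by oversized chunks become very unlikely. Note that &lt;code&gt;psutil.virtual_memory().percent&lt;/code&gt; reports system-wide usage, so memory held by other processes counts against the threshold too.&lt;/p&gt;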
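&lt;p&gt;To see how quickly repeated halving collapses the chunk size, here is a small worked example. The helper &lt;code&gt;halve_until_floor()&lt;/code&gt; is hypothetical and simply applies the halving rule above on every call, assuming memory stays over the threshold the whole time.&lt;/p&gt;

```python
def halve_until_floor(current_size, min_chunk_size):
    """Apply the halving rule repeatedly, as if memory never recovered.

    Returns the chunk size after each call, stopping once the floor is reached.
    """
    sizes = []
    while current_size != min_chunk_size:
        current_size = max(min_chunk_size, current_size // 2)
        sizes.append(current_size)
    return sizes


sizes = halve_until_floor(50_000, 100)
print(sizes)
```

&lt;p&gt;Starting from 50,000 rows with a floor of 100, the size reaches the floor after nine calls: 25000, 12500, 6250, 3125, 1562, 781, 390, 195, 100.&lt;/p&gt;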




&lt;h2&gt;
  
  
  Going parallel
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;ParallelStreamChunker&lt;/code&gt; distributes work across a worker pool — each worker gets its own dataset partition and its own independent PID controller:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ParallelStreamChunker&lt;/span&gt;

&lt;span class="n"&gt;parallel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ParallelStreamChunker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;my_transform_func&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;# or "process"
&lt;/span&gt;    &lt;span class="n"&gt;n_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;target_latency_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_memory_pct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parallel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parallel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Total rows: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total_rows&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;p95 latency: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;p95_latency_ms&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Throughput: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rows_per_sec&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; rows/sec&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Thread vs Process: which one?
&lt;/h3&gt;

&lt;p&gt;StreamChunk auto-detects your CPU topology:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;detect_cpu_threads&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;detect_cpu_threads&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="c1"&gt;# {'logical_threads': 16, 'physical_cores': 8,
#  'recommended_io': 16, 'recommended_cpu': 8}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;recommended_io&lt;/code&gt; vs &lt;code&gt;recommended_cpu&lt;/code&gt; distinction matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;mode="thread"&lt;/code&gt; + I/O-bound work&lt;/strong&gt; (Kafka, DB, API): Python's GIL is released during I/O waits, so 16 threads can keep 16 I/O operations in flight at once. &lt;strong&gt;12–14× speedup&lt;/strong&gt; on 8-core/16-thread.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;mode="process"&lt;/code&gt; + CPU-bound work&lt;/strong&gt; (transforms, ML inference): each worker process has its own interpreter and its own GIL, so workers no longer contend for a single lock. Uses the physical core count only, to avoid hyperthreading overhead. &lt;strong&gt;6–7× speedup&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
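&lt;p&gt;The distinction can be sketched with the standard library alone. The snippet below is not how StreamChunk is implemented; &lt;code&gt;make_executor()&lt;/code&gt;, &lt;code&gt;fake_io_call()&lt;/code&gt; and &lt;code&gt;run_io_batch()&lt;/code&gt; are hypothetical names, and &lt;code&gt;time.sleep()&lt;/code&gt; stands in for a real network or database wait.&lt;/p&gt;

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def make_executor(mode, n_workers=None):
    """Hypothetical helper: pick a stdlib executor from a mode string."""
    workers = n_workers or os.cpu_count() or 1
    if mode == "thread":
        return ThreadPoolExecutor(max_workers=workers)
    if mode == "process":
        return ProcessPoolExecutor(max_workers=workers)
    raise ValueError(f"unknown mode: {mode!r}")


def fake_io_call(record_id):
    """Stand-in for a Kafka/DB/API call; sleeping releases the GIL."""
    time.sleep(0.05)
    return record_id * 2


def run_io_batch(record_ids, n_workers=16):
    """Run the simulated I/O calls on a thread pool, preserving input order."""
    with make_executor("thread", n_workers) as pool:
        return list(pool.map(fake_io_call, record_ids))


results = run_io_batch(range(16))
print(results)
```

&lt;p&gt;Because each sleeping thread releases the GIL, the 16 calls overlap instead of running back to back. For CPU-bound functions, the same helper with &lt;code&gt;mode="process"&lt;/code&gt; would hand the work to separate interpreter processes.&lt;/p&gt;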




&lt;h2&gt;
  
  
  Data sources
&lt;/h2&gt;

&lt;p&gt;StreamChunk ships with five production sources and a clean extension API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Any iterable
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk.sources&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GeneratorSource&lt;/span&gt;
&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GeneratorSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10_000_000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# CSV / JSONL (memory-efficient, never loads full file)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk.sources&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FileSource&lt;/span&gt;
&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FileSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data.jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jsonl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Apache Kafka (unbounded stream)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk.sources&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KafkaSource&lt;/span&gt;
&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;KafkaSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;broker:9092&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;security_protocol&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SSL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ssl_cafile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/certs/ca.pem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Any SQLAlchemy DB (parameterized, safe)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk.sources&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DatabaseSource&lt;/span&gt;
&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DatabaseSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;postgresql://user:pass@host/db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM logs WHERE ts &amp;gt; :since&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;since&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cutoff&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Paginated REST API
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk.sources&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;APISource&lt;/span&gt;
&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;APISource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.example.com/records&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;page_param&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cursor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Building a custom source
&lt;/h3&gt;

&lt;p&gt;Only two methods required:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk.sources&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseSource&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RedisStreamSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseSource&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xread&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_cursor&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                                &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_exhausted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
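&lt;p&gt;For a version that runs without Redis, the same two-method contract can be exercised with an in-memory list. Everything here is an assumption for illustration: the &lt;code&gt;BaseSource&lt;/code&gt; below is a local stand-in rather than the real &lt;code&gt;streamchunk.sources.BaseSource&lt;/code&gt;, and &lt;code&gt;ListSource&lt;/code&gt; and &lt;code&gt;drain()&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
from abc import ABC, abstractmethod


class BaseSource(ABC):
    """Local stand-in for the library's base class, reduced to the two methods above."""

    @abstractmethod
    def pull(self, n):
        """Return up to n records."""

    @abstractmethod
    def is_exhausted(self):
        """Return True once no more records will arrive."""


class ListSource(BaseSource):
    """Hypothetical source that serves records from an in-memory list."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._cursor = 0

    def pull(self, n):
        batch = self._rows[self._cursor:self._cursor + n]
        self._cursor += len(batch)  # advance past what was handed out
        return batch

    def is_exhausted(self):
        return self._cursor == len(self._rows)


def drain(source, chunk_size):
    """Pull fixed-size chunks until the source reports exhaustion."""
    chunks = []
    while not source.is_exhausted():
        chunks.append(source.pull(chunk_size))
    return chunks


chunks = drain(ListSource(range(5)), chunk_size=2)
print(chunks)
```

&lt;p&gt;The key detail is that &lt;code&gt;pull()&lt;/code&gt; advances its own cursor and &lt;code&gt;is_exhausted()&lt;/code&gt; only reports state, so the caller is free to vary &lt;code&gt;n&lt;/code&gt; from one call to the next.&lt;/p&gt;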






&lt;h2&gt;
  
  
  Async support
&lt;/h2&gt;

&lt;p&gt;Drop-in async iteration for &lt;code&gt;asyncio&lt;/code&gt;-based pipelines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_async_pipeline&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;chunker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StreamChunker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_latency_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aiter&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;async_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;chunker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;report_latency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunk_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;elapsed_ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Prometheus + Grafana integration
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;streamchunk&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;start_metrics_server&lt;/span&gt;

&lt;span class="nf"&gt;start_metrics_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# starts HTTP /metrics endpoint
&lt;/span&gt;
&lt;span class="c1"&gt;# Metrics exposed:
# streamchunk_rows_total (Counter)
# streamchunk_chunks_total (Counter)
# streamchunk_chunk_size_current (Gauge)
# streamchunk_memory_pct (Gauge)
# streamchunk_latency_ms (Histogram → p50/p95/p99 in Grafana)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  YAML config for deployment
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# streamchunk.yaml&lt;/span&gt;
&lt;span class="na"&gt;target_latency_ms&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;
&lt;span class="na"&gt;max_memory_pct&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;span class="na"&gt;min_chunk_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
&lt;span class="na"&gt;max_chunk_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50000&lt;/span&gt;
&lt;span class="na"&gt;initial_chunk_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1000&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chunker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;StreamChunker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;streamchunk.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful when deploying across multiple environments (dev/staging/prod) with different hardware profiles.&lt;/p&gt;




&lt;h2&gt;
  
  
  Benchmark results
&lt;/h2&gt;

&lt;p&gt;Tested on an 8-core / 16-thread cloud instance across 10 representative pipelines:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single-threaded, no PID&lt;/td&gt;
&lt;td&gt;Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Single-threaded + PID&lt;/td&gt;
&lt;td&gt;~99% of baseline (overhead negligible)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Thread mode, 16 workers&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;12–14× baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Process mode, 8 workers&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6–7× baseline&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Pipeline-level impact:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;4 hours&lt;/td&gt;
&lt;td&gt;17–25 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU utilization&lt;/td&gt;
&lt;td&gt;12–20%&lt;/td&gt;
&lt;td&gt;85–95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OOM crash rate&lt;/td&gt;
&lt;td&gt;~24%&lt;/td&gt;
&lt;td&gt;~0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual tuning&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;td&gt;Eliminated&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;PID overhead per chunk: ~0.1–0.5 ms, dominated by the &lt;code&gt;psutil.virtual_memory()&lt;/code&gt; call rather than the PID math itself.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design patterns used
&lt;/h2&gt;

&lt;p&gt;This library leans on classical OOP patterns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Where&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Iterator Protocol&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;StreamChunker.__iter__()&lt;/code&gt; makes the chunker a standard Python iterable, usable in a plain &lt;code&gt;for&lt;/code&gt; loop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Observer&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;report_latency()&lt;/code&gt; feeds user observations back to the controller&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Strategy&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;mode="thread"/"process"&lt;/code&gt; swaps the executor at runtime&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Template Method&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;BaseSource&lt;/code&gt; ABC — subclasses implement &lt;code&gt;pull()&lt;/code&gt; and &lt;code&gt;is_exhausted()&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Factory Method&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;StreamChunker.from_config()&lt;/code&gt; — alternative YAML constructor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Facade&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;__init__.py&lt;/code&gt; exports — clean public API over complex internals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DTO&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ChunkMetadata&lt;/code&gt; dataclass — immutable per-chunk telemetry bundle&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Quick reference
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Install (run in a shell): https://pypi.org/project/streamchunk/
#   pip install streamchunk
#   pip install "streamchunk[kafka,database,prometheus,pandas,async]"
&lt;/span&gt;
&lt;span class="c1"&gt;# Single worker
&lt;/span&gt;&lt;span class="nc"&gt;StreamChunker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_latency_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_memory_pct&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Parallel
&lt;/span&gt;&lt;span class="nc"&gt;ParallelStreamChunker&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# From config
&lt;/span&gt;&lt;span class="n"&gt;StreamChunker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;config.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Async
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;aiter&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="c1"&gt;# Metrics server
&lt;/span&gt;&lt;span class="nf"&gt;start_metrics_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# CPU info
&lt;/span&gt;&lt;span class="nf"&gt;detect_cpu_threads&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# → {logical, physical, recommended_io, recommended_cpu}
&lt;/span&gt;
&lt;span class="c1"&gt;# Stats
&lt;/span&gt;&lt;span class="n"&gt;chunker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# → {total_rows, total_chunks, avg_latency_ms, p95_latency_ms, ...}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;p&gt;Manual batch size tuning is a solved problem — it just hasn't been widely recognized as a control theory problem yet. PID control gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic adaptation to any machine, any dataset, any workload&lt;/li&gt;
&lt;li&gt;Hard memory ceiling that prevents OOM crashes&lt;/li&gt;
&lt;li&gt;12–14× throughput on I/O-bound work with zero extra configuration&lt;/li&gt;
&lt;li&gt;Prometheus metrics and async support out of the box&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;StreamChunk v2.0.1 · MIT License · Python 3.8–3.13 · &lt;a href="https://pypi.org/project/streamchunk/" rel="noopener noreferrer"&gt;📦 PyPI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you've ever woken up at 2 AM because a batch size was wrong, this one's for you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Drop a comment if you're doing something interesting with ETL pipelines — always curious how others are handling this.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>dataengineering</category>
      <category>opensource</category>
      <category>etl</category>
    </item>
    <item>
      <title>Docker Commands Cheat Sheet</title>
      <dc:creator>Muhammad Sufiyan Baig</dc:creator>
      <pubDate>Tue, 04 Feb 2025 09:52:01 +0000</pubDate>
      <link>https://dev.to/muhammadsufiyanbaig/docker-commands-cheat-sheet-5cjp</link>
      <guid>https://dev.to/muhammadsufiyanbaig/docker-commands-cheat-sheet-5cjp</guid>
      <description>&lt;h1&gt;
  
  
  Mastering Docker: A Comprehensive Guide
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Docker has revolutionized the way developers build, ship, and run applications. It provides a lightweight, portable, and consistent environment to deploy applications seamlessly across different platforms. This guide covers essential Docker concepts and commands, making it easier for you to work with containers effectively.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Basic Docker Concepts&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is Docker?
&lt;/h3&gt;

&lt;p&gt;Docker is an open-source platform that automates application deployment using lightweight, portable containers. It enables developers to package applications along with their dependencies, ensuring consistency across different environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Container&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A container is a self-sufficient executable unit that includes everything needed to run an application, such as code, runtime, libraries, and dependencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Image&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;An image is a read-only template containing an application and its dependencies. Containers are instantiated from images, allowing multiple containers to run from the same image.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Dockerfile&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A Dockerfile is a text file containing the instructions used to build a Docker image. It specifies the base image, dependencies, environment variables, and the command that runs the application.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Docker Compose&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Docker Compose is a tool used to define and manage multi-container applications. It uses a &lt;code&gt;docker-compose.yml&lt;/code&gt; file to configure services, networks, and volumes in a structured way.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Volume&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A Docker volume is a persistent storage mechanism that allows containers to share and retain data beyond their lifecycle.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Network&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Docker networks let containers communicate with each other and with the outside world while staying isolated from unrelated containers. Common network types include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bridge&lt;/strong&gt; (default) - Isolated networks for inter-container communication.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Host&lt;/strong&gt; - Shares the host network.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overlay&lt;/strong&gt; - Used in Swarm mode for cross-host communication.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Container Management Commands&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start and stop containers&lt;/span&gt;
docker start &amp;lt;container_name&amp;gt;
docker stop &amp;lt;container_name&amp;gt;
docker restart &amp;lt;container_name&amp;gt;
docker &lt;span class="nb"&gt;rm&lt;/span&gt; &amp;lt;container_name&amp;gt;  &lt;span class="c"&gt;# Remove a container&lt;/span&gt;

&lt;span class="c"&gt;# List containers&lt;/span&gt;
docker ps           &lt;span class="c"&gt;# Running containers&lt;/span&gt;
docker ps &lt;span class="nt"&gt;-a&lt;/span&gt;        &lt;span class="c"&gt;# All containers (including stopped ones)&lt;/span&gt;
docker container &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="c"&gt;# Alternative command&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Image Management Commands&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# View and remove images&lt;/span&gt;
docker images         &lt;span class="c"&gt;# List all images&lt;/span&gt;
docker image &lt;span class="nb"&gt;ls&lt;/span&gt;       &lt;span class="c"&gt;# Alternative command&lt;/span&gt;
docker rmi &amp;lt;image_id&amp;gt; &lt;span class="c"&gt;# Remove an image&lt;/span&gt;

&lt;span class="c"&gt;# Build, pull, and push images&lt;/span&gt;
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; my-node-app &lt;span class="nb"&gt;.&lt;/span&gt;   &lt;span class="c"&gt;# Build an image from a Dockerfile&lt;/span&gt;
docker pull &amp;lt;image_name&amp;gt;        &lt;span class="c"&gt;# Download an image from Docker Hub&lt;/span&gt;
docker push &amp;lt;image_name&amp;gt;        &lt;span class="c"&gt;# Upload an image to Docker Hub&lt;/span&gt;
docker tag &amp;lt;image_id&amp;gt; &amp;lt;repo&amp;gt;:&amp;lt;tag&amp;gt; &lt;span class="c"&gt;# Tag an image for pushing&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Running Containers&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run containers interactively&lt;/span&gt;
docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &amp;lt;image_name&amp;gt;
docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 8000:8000 docker-app-1  &lt;span class="c"&gt;# Port mapping&lt;/span&gt;

docker run &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &amp;lt;host_port&amp;gt;:&amp;lt;container_port&amp;gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &amp;lt;&lt;span class="nv"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;value&amp;gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &amp;lt;&lt;span class="nv"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;value&amp;gt; &amp;lt;image_name&amp;gt; &lt;span class="c"&gt;# Publish a port and pass environment variables&lt;/span&gt;

docker run &lt;span class="nt"&gt;--name&lt;/span&gt; &amp;lt;container_name&amp;gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &amp;lt;image_name&amp;gt; &lt;span class="c"&gt;# Run in detached mode&lt;/span&gt;
docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &amp;lt;image_name&amp;gt; &lt;span class="c"&gt;# Automatically remove the container when it exits&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Executing Commands Inside Containers&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &amp;lt;container_name&amp;gt; &amp;lt;&lt;span class="nb"&gt;command&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="c"&gt;# Execute a command&lt;/span&gt;
docker &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &amp;lt;container_name&amp;gt; bash      &lt;span class="c"&gt;# Open a Bash shell&lt;/span&gt;
docker attach &amp;lt;container_name&amp;gt;             &lt;span class="c"&gt;# Attach to a running container&lt;/span&gt;
docker logs &amp;lt;container_name&amp;gt;                &lt;span class="c"&gt;# View logs&lt;/span&gt;
docker logs &lt;span class="nt"&gt;-f&lt;/span&gt; &amp;lt;container_name&amp;gt;             &lt;span class="c"&gt;# Follow logs in real time&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Networking in Docker&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List, create, and remove networks&lt;/span&gt;
docker network &lt;span class="nb"&gt;ls
&lt;/span&gt;docker network create &amp;lt;network_name&amp;gt;
docker network inspect &amp;lt;network_name&amp;gt;
docker network connect &amp;lt;network_name&amp;gt; &amp;lt;container_name&amp;gt;
docker network disconnect &amp;lt;network_name&amp;gt; &amp;lt;container_name&amp;gt;
docker network &lt;span class="nb"&gt;rm&lt;/span&gt; &amp;lt;network_name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Managing Volumes&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# View and create volumes&lt;/span&gt;
docker volume &lt;span class="nb"&gt;ls
&lt;/span&gt;docker volume create &amp;lt;volume_name&amp;gt;
docker volume inspect &amp;lt;volume_name&amp;gt;

docker volume &lt;span class="nb"&gt;rm&lt;/span&gt; &amp;lt;volume_name&amp;gt;  &lt;span class="c"&gt;# Remove a volume&lt;/span&gt;

docker run &lt;span class="nt"&gt;-v&lt;/span&gt; &amp;lt;volume_name&amp;gt;:&amp;lt;container_path&amp;gt; &amp;lt;image_name&amp;gt; &lt;span class="c"&gt;# Mount a volume&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Docker Compose Commands&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker-compose up       &lt;span class="c"&gt;# Start containers&lt;/span&gt;
docker-compose up &lt;span class="nt"&gt;-d&lt;/span&gt;    &lt;span class="c"&gt;# Start in detached mode&lt;/span&gt;
docker-compose down     &lt;span class="c"&gt;# Stop and remove containers&lt;/span&gt;
docker-compose up &lt;span class="nt"&gt;--build&lt;/span&gt;  &lt;span class="c"&gt;# Rebuild and restart containers&lt;/span&gt;
docker-compose logs     &lt;span class="c"&gt;# View logs&lt;/span&gt;
docker-compose ps       &lt;span class="c"&gt;# List running services&lt;/span&gt;
docker-compose &lt;span class="nb"&gt;exec&lt;/span&gt; &amp;lt;service_name&amp;gt; &amp;lt;&lt;span class="nb"&gt;command&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;  &lt;span class="c"&gt;# Run command in a service&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
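
&lt;p&gt;The commands above read a Compose file in the current directory. As a minimal sketch, assuming a single service built from a local Dockerfile (the service name &lt;code&gt;web&lt;/code&gt; and the port values are illustrative, not taken from a real project), a &lt;code&gt;docker-compose.yml&lt;/code&gt; might look like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# Hypothetical example: one service named "web"
services:
  web:
    build: .              # Build the image from the Dockerfile in this directory
    ports:
      - "8000:8000"       # host_port:container_port
    environment:
      - PORT=8000         # Passed to the container as an environment variable
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a file like this in place, &lt;code&gt;docker-compose exec web sh&lt;/code&gt; would open a shell inside the running &lt;code&gt;web&lt;/code&gt; service.&lt;/p&gt;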






&lt;h2&gt;
  
  
  &lt;strong&gt;Cleaning Up Docker Resources&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Remove unused containers, networks, and images&lt;/span&gt;
docker system prune  &lt;span class="c"&gt;# Remove stopped containers, unused networks, dangling images, and build cache&lt;/span&gt;
docker system prune &lt;span class="nt"&gt;-a&lt;/span&gt;  &lt;span class="c"&gt;# Also remove every image not used by a container, not just dangling ones&lt;/span&gt;

docker container prune  &lt;span class="c"&gt;# Remove all stopped containers&lt;/span&gt;
docker volume prune  &lt;span class="c"&gt;# Remove unused volumes (recent Docker versions prune only anonymous volumes unless -a is given)&lt;/span&gt;
docker network prune  &lt;span class="c"&gt;# Remove all unused networks&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Dockerfile Essentials&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Define base image&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:16-alpine&lt;/span&gt;

&lt;span class="c"&gt;# Set working directory&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;

&lt;span class="c"&gt;# Copy application files&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;

&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;span class="c"&gt;# Set environment variables&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; PORT=8000&lt;/span&gt;

&lt;span class="c"&gt;# Expose port&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8000&lt;/span&gt;

&lt;span class="c"&gt;# Define the command to run the application&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["node", "index.js"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Docker packages an application and its dependencies into containers, which makes deployments more consistent and easier to scale. Use this guide as a quick reference for the most common Docker commands and concepts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have any questions or suggestions? Drop them in the comments below! 🚀&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Docker for Beginners: Containerizing Your First Application</title>
      <dc:creator>Muhammad Sufiyan Baig</dc:creator>
      <pubDate>Thu, 30 Jan 2025 09:03:16 +0000</pubDate>
      <link>https://dev.to/muhammadsufiyanbaig/docker-for-beginners-containerizing-your-first-application-4ki7</link>
      <guid>https://dev.to/muhammadsufiyanbaig/docker-for-beginners-containerizing-your-first-application-4ki7</guid>
      <description>&lt;h2&gt;
  
  
  What is Docker?
&lt;/h2&gt;

&lt;p&gt;Docker is a platform for packaging applications into lightweight, isolated units called containers. Unlike traditional virtual machines (VMs), containers share the host system’s kernel, so they start faster, use fewer resources, and behave consistently across environments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before you begin, make sure you have the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Docker Desktop&lt;/strong&gt; (for macOS/Windows) or &lt;strong&gt;Docker Engine&lt;/strong&gt; (for Linux). Install it from &lt;a href="https://www.docker.com/products/docker-desktop/" rel="noopener noreferrer"&gt;Docker's official website&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;A simple application. In this guide, we will use a basic Node.js "Hello World" server.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Step 1: Create a Sample Application
&lt;/h2&gt;

&lt;p&gt;Create a basic Node.js app with the following files:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;code&gt;package.json&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"docker-demo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"main"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"server.js"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node server.js"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"express"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^4.18.2"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. &lt;code&gt;server.js&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;  
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;  
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hello from Docker!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  
&lt;span class="p"&gt;});&lt;/span&gt;  

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;port&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;  
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`App running on http://localhost:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;port&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  
&lt;span class="p"&gt;});&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 2: Write a Dockerfile
&lt;/h2&gt;

&lt;p&gt;Create a &lt;code&gt;Dockerfile&lt;/code&gt; in your project’s root directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:18-alpine  &lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app  &lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package*.json ./  &lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt;  

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .  &lt;/span&gt;

&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 3000  &lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["npm", "start"]  &lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Explanation:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;FROM&lt;/code&gt;&lt;/strong&gt;: Uses Node.js (Alpine Linux variant) as the base image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;WORKDIR&lt;/code&gt;&lt;/strong&gt;: Sets the working directory inside the container.&lt;/li&gt;
&lt;li&gt;
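&lt;strong&gt;&lt;code&gt;COPY&lt;/code&gt;&lt;/strong&gt;: Copies files into the image. The &lt;code&gt;package*.json&lt;/code&gt; files are copied first so that the dependency layer can be reused from cache when only the source code changes.&lt;/li&gt;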
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;RUN&lt;/code&gt;&lt;/strong&gt;: Installs dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;EXPOSE&lt;/code&gt;&lt;/strong&gt;: Documents the port the application listens on. It does not publish the port; that is done with &lt;code&gt;-p&lt;/code&gt; when the container is run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;CMD&lt;/code&gt;&lt;/strong&gt;: Defines the command to start the application.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 3: Build the Docker Image
&lt;/h2&gt;

&lt;p&gt;Navigate to your project directory and run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; my-first-docker-app &lt;span class="nb"&gt;.&lt;/span&gt;  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-t my-first-docker-app&lt;/code&gt;&lt;/strong&gt;: Tags the image with a name.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;.&lt;/code&gt;&lt;/strong&gt;: Specifies the current directory as the build context.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 4: Run the Container
&lt;/h2&gt;

&lt;p&gt;Start the container with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-p&lt;/span&gt; 3000:3000 my-first-docker-app  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;-p 3000:3000&lt;/code&gt;&lt;/strong&gt;: Maps port 3000 of your local machine to port 3000 inside the container.&lt;/li&gt;
&lt;li&gt;Open &lt;code&gt;http://localhost:3000&lt;/code&gt; in your browser to see the message: &lt;strong&gt;“Hello from Docker!”&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Best Practices for Beginners
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use a &lt;code&gt;.dockerignore&lt;/code&gt; file&lt;/strong&gt;: Keep unnecessary or sensitive files (e.g., &lt;code&gt;node_modules&lt;/code&gt;, &lt;code&gt;.env&lt;/code&gt;) out of the build context so they are not copied into the image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimize Image Size&lt;/strong&gt;: Prefer smaller base images like &lt;code&gt;alpine&lt;/code&gt; variants.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid Running as Root&lt;/strong&gt;: Add a non-root user for better security.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clean Up Unused Images/Containers&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  docker &lt;span class="nb"&gt;rm&lt;/span&gt; &amp;lt;container_id&amp;gt;  &lt;span class="c"&gt;# Remove stopped containers&lt;/span&gt;
  docker rmi &amp;lt;image_id&amp;gt;      &lt;span class="c"&gt;# Remove unused images&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
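
&lt;p&gt;As a sketch of the non-root recommendation, the Dockerfile from Step 2 could switch to an unprivileged user before starting the app. This assumes the official &lt;code&gt;node&lt;/code&gt; image, which ships with a built-in user named &lt;code&gt;node&lt;/code&gt;; other base images may need a user created explicitly:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;FROM node:18-alpine

WORKDIR /app

# Install dependencies first so this layer can be cached
COPY package*.json ./
RUN npm install

# Copy the rest of the source code
COPY . .

# Assumption: the base image provides an unprivileged "node" user
USER node

EXPOSE 3000

CMD ["npm", "start"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Everything after the &lt;code&gt;USER&lt;/code&gt; instruction, including the application process, runs without root privileges inside the container.&lt;/p&gt;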






&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You have successfully containerized your first application using Docker. Docker simplifies dependency management and ensures application consistency across different environments.&lt;/p&gt;

&lt;p&gt;As a next step, try Docker Compose to run multi-container applications, or explore the wider Docker ecosystem.&lt;/p&gt;

&lt;p&gt;Happy Containerizing! 🐳&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
