<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: EricKaranja17</title>
    <description>The latest articles on DEV Community by EricKaranja17 (@erickaranja17).</description>
    <link>https://dev.to/erickaranja17</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3239781%2F63da701a-3d5c-4b86-8480-96f768e64990.png</url>
      <title>DEV Community: EricKaranja17</title>
      <link>https://dev.to/erickaranja17</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/erickaranja17"/>
    <language>en</language>
    <item>
      <title>Getting Started with Docker and Docker Compose: A Beginner’s Guide</title>
      <dc:creator>EricKaranja17</dc:creator>
      <pubDate>Mon, 25 Aug 2025 14:03:25 +0000</pubDate>
      <link>https://dev.to/erickaranja17/getting-started-with-docker-and-docker-compose-a-beginners-guide-5dho</link>
      <guid>https://dev.to/erickaranja17/getting-started-with-docker-and-docker-compose-a-beginners-guide-5dho</guid>
      <description>&lt;p&gt;Getting started with Docker doesn't have to feel overwhelming. Think of Docker as a tool that helps you "package once and run anywhere," making your applications more portable and reliable.&lt;/p&gt;

&lt;p&gt;This beginner's guide introduces Docker and Docker Compose. I assume that you already have Docker installed on your Windows machine; if not, check this beautiful &lt;a href="https://dev.to/joy_akinyi_115689d7dff92f/getting-started-with-docker-and-docker-compose-a-beginners-guide-5g42"&gt;article&lt;/a&gt; by Joy Akinyi.&lt;/p&gt;

&lt;p&gt;In this article, I will provide a brief history, the pros and cons, and definitions of Docker and Docker Compose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Docker?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Docker is an open source platform that enables developers to build and manage applications within containers. Containers are lightweight, isolated environments that package an application with its dependencies, providing consistent and reproducible deployments across different computing environments that have Docker installed. Docker allows developers to build, ship and run applications efficiently, regardless of the underlying infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Docker Compose?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Docker Compose is a tool that allows you to define and run multi-container applications. With Docker Compose, you create a YAML file (which we will see later) that defines the services that make up your application, and then use a single command to start all of them:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;docker-compose up&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker and Docker Compose: A Brief History&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because of my deep passion for history, let's briefly recount how Docker and Docker Compose came about.&lt;/p&gt;

&lt;p&gt;Docker was created by Solomon Hykes and his team at a company called dotCloud, later renamed Docker, Inc. Around 2010, dotCloud was experimenting with Linux Containers (LXC) technology. They set out to solve the teeth-shattering complaint "but it works on my machine", which arises because applications behave differently across environments. In March 2013, Docker was officially released as an open source project at PyCon. Version 1.0 followed in June 2014, and Docker has kept evolving into what it is today.&lt;/p&gt;

&lt;p&gt;Docker Compose was introduced in 2014 as a separate project initially called “fig”, later becoming an official Docker tool. It was created to address the need for managing complex applications with multiple interconnected containers. Docker Compose emerged as a solution for orchestrating container-based applications, allowing developers to define and manage their dependencies easily.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker Key Terminologies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image:&lt;/strong&gt; A lightweight, standalone, executable package that contains everything needed to run an application, including the code, runtime, libraries and system tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Container:&lt;/strong&gt; A container is a runnable instance of an image on the host machine. It is isolated from other containers and from the host machine, and has its own filesystem, network, and process space.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dockerfile:&lt;/strong&gt; A Dockerfile is a text file that contains the instructions for building a Docker image.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Registry:&lt;/strong&gt; A centralized repository that stores Docker images, such as Docker Hub or private registries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker Hub:&lt;/strong&gt; This is the official registry for Docker images.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Volume:&lt;/strong&gt; A persistent storage mechanism that allows data to be shared between containers or between a container and the host system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker Compose Key Terminologies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Service:&lt;/strong&gt; A definition that describes how to run a specific container, including its image, configuration, dependencies and networking requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;YAML Ain’t Markup Language (YAML):&lt;/strong&gt; A human-readable data serialization format used by Docker Compose to define the application’s structure and configuration. I briefly talked about YAML files in my previous article, Getting Started with GitHub Actions: A Beginner’s Guide.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Volume:&lt;/strong&gt; A mechanism to persist data across container restarts or between containers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Network:&lt;/strong&gt; A virtual network that allows containers within the same Docker Compose project to communicate with each other. &lt;/p&gt;
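
&lt;p&gt;To make these terms concrete, here is a rough Python sketch that models a Compose project as a plain dictionary with services, a volume, and a network. The service names (web, db), the image tags, and the names db_data and app_net are assumptions made up for illustration; a real project would describe the same structure in a YAML file rather than in Python.&lt;/p&gt;

```python
# Hypothetical two-service project; every name below is an illustrative assumption.
compose_project = {
    "services": {
        "web": {
            "image": "nginx:latest",
            "ports": ["8080:80"],
            "depends_on": ["db"],
            "networks": ["app_net"],
        },
        "db": {
            "image": "postgres:16",
            "volumes": ["db_data:/var/lib/postgresql/data"],
            "networks": ["app_net"],
        },
    },
    "volumes": {"db_data": {}},    # named volume that keeps database files across restarts
    "networks": {"app_net": {}},   # shared network so web can reach db
}


def summarize(project):
    """Return one summary line per service: its name, image, and dependencies."""
    lines = []
    for name, service in sorted(project["services"].items()):
        deps = ", ".join(service.get("depends_on", [])) or "nothing"
        lines.append(f"{name}: image={service['image']}, depends on {deps}")
    return lines


for line in summarize(compose_project):
    print(line)
```

&lt;p&gt;Each key under services corresponds to one Service as defined above, while the top-level volumes and networks entries declare the Volume and Network that those services share.&lt;/p&gt;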

&lt;p&gt;&lt;strong&gt;Docker Hub&lt;/strong&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Fight Night:Batch vs Stream Processing.</title>
      <dc:creator>EricKaranja17</dc:creator>
      <pubDate>Mon, 11 Aug 2025 13:05:25 +0000</pubDate>
      <link>https://dev.to/erickaranja17/fight-nightbatch-vs-stream-processing-5hnm</link>
      <guid>https://dev.to/erickaranja17/fight-nightbatch-vs-stream-processing-5hnm</guid>
      <description>&lt;p&gt;Data-driven decision-making has become the heart of today's world. Organizations generate information at an unprecedented pace, from daily sales transactions to real-time sensor readings in &lt;a href="https://www.oracle.com/africa/internet-of-things/"&gt;IoT (Internet of Things)&lt;/a&gt; devices.&lt;br&gt;
But if this data is not collected, mined and processed properly, it has no value. This shows the importance of data processing.&lt;/p&gt;

&lt;p&gt;In this article, I will explain the types of data processing, their use cases, and their differences.&lt;/p&gt;

&lt;p&gt;It begins.&lt;/p&gt;

&lt;p&gt;There are two ways in which data are processed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Batch processing.&lt;/li&gt;
&lt;li&gt;Stream processing.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;What is Batch Data Processing?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Batch processing is when a computer handles high-volume, repetitive tasks by grouping data into batches and processing each batch as a unit.&lt;/p&gt;

&lt;p&gt;Batch processing is mainly automated, with minimal human interaction. Tasks are predefined, and the system executes them according to a schedule.&lt;/p&gt;

&lt;p&gt;There are a variety of ETL tools for batch processing. A common tool is Apache Airflow, which allows users to quickly build up data orchestration pipelines that can run on a set schedule and have simple monitoring. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: Data orchestration is the automated process of managing and coordinating the flow of data across various systems, applications, and platforms.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Stream Data Processing?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stream processing is also called real-time data processing. I personally believe that the alias is self-explanatory, but let me spell it out anyway.&lt;/p&gt;

&lt;p&gt;Stream processing continuously ingests and analyzes data. Instead of waiting for data to accumulate, you can process it instantly. This is crucial for time-critical analysis.&lt;/p&gt;
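
&lt;p&gt;The contrast between the two styles can be sketched in a few lines of Python. This is only a toy illustration with made-up transaction amounts, not how a production tool works: the batch function waits for the whole collection before producing one answer, while the stream function produces an updated answer after every record.&lt;/p&gt;

```python
def batch_total(records):
    """Batch style: wait until all records are collected, then process them together."""
    collected = list(records)   # the whole batch is held before any work starts
    return sum(collected)


def stream_totals(records):
    """Stream style: update and emit a running total as each record arrives."""
    running = 0
    for amount in records:
        running += amount
        yield running           # a result is available right after every record


# Hypothetical transaction amounts, used only for illustration.
transactions = [100, 250, 50]

print(batch_total(transactions))          # one answer at the end: 400
print(list(stream_totals(transactions)))  # one answer per record: [100, 350, 400]
```

&lt;p&gt;In the stream version, the first result is ready after the first transaction, which is what makes time-critical analysis possible.&lt;/p&gt;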

&lt;p&gt;One popular framework is Apache Kafka. Apache Kafka is a distributed, fault-tolerant, scalable, and high-throughput messaging system. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8h9w26jr0q1f24offv6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8h9w26jr0q1f24offv6.png" alt=" " width="800" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stream Processing Use Cases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Stream processing is particularly beneficial in several key areas. Here are 4 prime examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fraud detection: Stream processing allows financial institutions to monitor transactions in real time. This helps identify and flag suspicious activities immediately, which helps in preventing fraud effectively.&lt;/li&gt;
&lt;li&gt;Network monitoring: In network management, stream processing enables you to constantly monitor your network traffic. This real-time analysis helps in quickly detecting and addressing any anomalies or issues, ensuring smooth network operations.&lt;/li&gt;
&lt;li&gt;Predictive maintenance: Industries use stream processing to monitor equipment health in real time. As a result, potential issues can be detected and addressed before they lead to equipment failure, which saves costs and improves efficiency.&lt;/li&gt;
&lt;li&gt;Intrusion detection: In cybersecurity, stream processing helps in real-time detection of unauthorized access or activities within a network. The detection allows for swift action to mitigate potential security threats.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Batch Processing Use Cases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You should use batch processing in scenarios where data processing must be scheduled and does not require immediate results. The 3 best examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;End-of-day reporting: Financial institutions often use batch processing for end-of-day reports. Transactions and activities are accumulated throughout the day and processed in one go, generating comprehensive reports for analysis.&lt;/li&gt;
&lt;li&gt;Data warehousing: Organizations use batch processing to update data warehouses periodically. Large volumes of data are collected and processed in batches, ensuring that the data warehouse is up-to-date with the latest information for analytical purposes.&lt;/li&gt;
&lt;li&gt;Payroll processing: Companies process payroll data in batches, typically on a bi-weekly or monthly basis. This involves collecting timekeeping data, calculating salaries, and generating paychecks, all done in bulk to streamline operations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qt7weml6p6jjwob8pur.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qt7weml6p6jjwob8pur.gif" alt=" " width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Difference between Batch and Stream Processing.&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Data Size and Scope: Batch processing is best for handling large volumes of data, because it works on the whole accumulated dataset at once. Stream processing works on individual records or small windows of data as they arrive in real time, e.g. M-Pesa transactions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Performance: Data latency is the time between when data is collected and when it is made available for processing. Stream processing takes a few milliseconds, while batch processing can take hours or days, or even longer, depending on the schedule.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Required Hardware: Sequential data handling in batch processing requires more resources and a storage system to handle the large data ingestion. Continuous data handling in stream processing requires less storage, because data is processed in real time instead of being stored for later processing.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Note: Data ingestion is the process of collecting, importing, and loading data from various sources into a system for storage and analysis&lt;/em&gt;&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>dataprocessing</category>
    </item>
    <item>
      <title>Functions; what is it to python ?</title>
      <dc:creator>EricKaranja17</dc:creator>
      <pubDate>Mon, 28 Jul 2025 14:44:28 +0000</pubDate>
      <link>https://dev.to/erickaranja17/functions-what-is-it-to-python--1ilj</link>
      <guid>https://dev.to/erickaranja17/functions-what-is-it-to-python--1ilj</guid>
      <description>&lt;p&gt;Python is one of the most popular programming languages because it is relatively easy to use while still being extremely versatile and powerful. It is the go-to object-oriented programming language for data gurus. If you are burning to learn Python, then understanding how to write functions is a good starting point.&lt;/p&gt;

&lt;p&gt;In the context of programming, a function is a named sequence of statements that performs a computation. When you define a function, you specify the name of the sequence of statements. Later, you can 'call' the function by name.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Function calls&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
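&lt;pre class="highlight plaintext"&gt;&lt;code&gt;marks = 10.2, 100.10, 16.17  # comma-separated values form a tuple
type(marks)  # call the built-in function type with marks as its argument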
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The name of the function is type. The expression in parentheses is called the argument of the function.&lt;br&gt;
A function takes an argument and returns a result, which is called the return value.&lt;/p&gt;

&lt;p&gt;A Python function consists of three components.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A def statement defining a function
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;def function_name(parameter1, parameter2):&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Body: The block of code that executes when the function is called. &lt;/li&gt;
&lt;li&gt;A return statement: used to send a value back from the function to the caller.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below are various examples of the syntax above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def add(a, b):
    result = a + b
    return result
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def my_function():
    print("Hello from a function")

my_function()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Adding New Functions&lt;/strong&gt;&lt;br&gt;
A &lt;em&gt;function definition&lt;/em&gt; specifies the name of a new function and the sequence of statements that run when the function is called:&lt;br&gt;
For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def lyrics():
    print("Back Bencher ndani ya S-Class.")
lyrics()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;def&lt;/em&gt; is a keyword that indicates that this is a function definition. The name of the function is lyrics. The rules for function names are the same as for variable names. Here is a synopsis of those rules: letters, numbers, and underscores are legal, but the first character can't be a number. Keywords cannot be used as the name of a function. Avoid having a variable and a function with the same name.&lt;br&gt;
The def statement must end with a colon, and the body has to be indented. By convention (PEP 8), indentation is four spaces; Python itself only requires that the indentation is consistent within a block.&lt;/p&gt;
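
&lt;p&gt;These naming rules can be checked from Python itself. The sketch below is one possible way to do it, using the built-in str.isidentifier() method and the standard keyword module; the helper name is_valid_function_name and the candidate names are made up for illustration.&lt;/p&gt;

```python
import keyword


def is_valid_function_name(name):
    """Check a candidate name against the naming rules described above."""
    # str.isidentifier() accepts letters, digits, and underscores,
    # and rejects names that start with a digit.
    # keyword.iskeyword() rejects reserved words such as "def" or "return".
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["repeat_lyrics", "lyrics2", "2lyrics", "return"]:
    print(candidate, is_valid_function_name(candidate))
```

&lt;p&gt;The first two candidates are accepted. The third is rejected because it starts with a number, and the fourth because return is a keyword.&lt;/p&gt;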

&lt;p&gt;The syntax for calling the new function is the same as for built-in functions. Once you have defined a function, you can use it inside another function.&lt;br&gt;
For example, to repeat the previous refrain, we can write a function called repeat_lyrics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def lyrics():
    print("Back bencher ndani ya S-class")

def repeat_lyrics():
    lyrics()

repeat_lyrics()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The program contains two function definitions: lyrics and repeat_lyrics. Function definitions get executed just like other statements, but the effect is to create function objects.&lt;/p&gt;
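
&lt;p&gt;A short sketch of what "creating a function object" means in practice: once the def statement has run, the name lyrics refers to an object that can be inspected, assigned to another name, and called. The name refrain is an assumption introduced only for this illustration.&lt;/p&gt;

```python
def lyrics():
    print("Back bencher ndani ya S-class")


# Running the def statement above created a function object and bound it to the name lyrics.
print(callable(lyrics))   # True
print(lyrics.__name__)    # lyrics

# A function object can be assigned to another name and called through it.
refrain = lyrics
refrain()                 # prints the same line as lyrics()
```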

&lt;p&gt;&lt;strong&gt;Why functions?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let me present to you a few reasons why functions are crucial in your program.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creating a new function gives you an opportunity to name a group of statements, which makes your program easier to read and debug.&lt;/li&gt;
&lt;li&gt;Functions can make a program smaller by eliminating repetitive code. Later, if you make a change, you only have to make it in one place.&lt;/li&gt;
&lt;li&gt;Dividing a long program into functions allows you to debug the parts one at a time and then assemble them into a working whole.&lt;/li&gt;
&lt;li&gt;Well-designed functions are often useful for many programs. Once you write and debug one, you can reuse it. &lt;/li&gt;
&lt;/ul&gt;
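
&lt;p&gt;As a small illustration of the second point, the calculation below is written once inside a function and then reused for two different lists. The function name average and the marks are hypothetical values chosen for this sketch.&lt;/p&gt;

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical marks for two terms.
term_one = [70, 80, 90]
term_two = [60, 75]

# The same logic is reused instead of being written out twice.
print(average(term_one))  # 80.0
print(average(term_two))  # 67.5
```

&lt;p&gt;If the way the average is calculated ever needs to change, only the body of average has to be edited.&lt;/p&gt;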

</description>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>EricKaranja17</dc:creator>
      <pubDate>Sat, 26 Jul 2025 11:32:28 +0000</pubDate>
      <link>https://dev.to/erickaranja17/functions-2fa3</link>
      <guid>https://dev.to/erickaranja17/functions-2fa3</guid>
      <description></description>
      <category>discuss</category>
    </item>
    <item>
      <title>Benefits of OLAP and OLTP in Data Management.</title>
      <dc:creator>EricKaranja17</dc:creator>
      <pubDate>Fri, 25 Jul 2025 17:09:06 +0000</pubDate>
      <link>https://dev.to/erickaranja17/benefits-of-olap-and-oltp-in-data-management-26cl</link>
      <guid>https://dev.to/erickaranja17/benefits-of-olap-and-oltp-in-data-management-26cl</guid>
      <description>&lt;p&gt;Explain how the separation of OLTP and OLAP systems benefits overall organizational data management strategy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First, let's understand what OLTP and OLAP are:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;OLTP (Online Transaction Processing): a type of data processing that consists of executing a number of transactions occurring concurrently, for example online banking, shopping, and order entry.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Definition according to IBM:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;OLTP: enables the real-time execution of large numbers of database transactions by large numbers of people, typically over the internet.&lt;/p&gt;

&lt;p&gt;In layman's terms, it processes real-time data.&lt;/p&gt;

&lt;p&gt;NB: &lt;em&gt;A database transaction is a change, insertion, deletion, or query of data in a database.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;OLAP (Online Analytical Processing): a type of data processing that runs complex, multidimensional analytical queries over large volumes of data, usually historical data aggregated from multiple sources.&lt;/p&gt;

&lt;p&gt;In layman's terms, it processes historical data, mostly for analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Understanding the benefits of OLAP and OLTP systems separation on data management strategy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now that we have a basic understanding of the two systems, we can delve into their individual strengths:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data formatting: 
OLAP systems use multidimensional data models in cube format, so you can view the same data from different angles. This enables them to handle complex queries and facilitates in-depth analysis for decision making.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OLTP systems use uni-dimensional data models that organize data into tables, each focused on one aspect of the data. This ensures data accuracy and consistency, which is crucial for maintaining the reliability of organizational operations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data Architecture: 
OLAP architecture prioritizes data read operations over data write operations. This allows it to support reporting, data mining and other analytical tasks.
Availability is a low-priority concern, as the primary use case is analytics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OLTP architecture prioritizes data write operations which helps it to handle simple, frequent transactions efficiently.&lt;br&gt;
Availability is a high priority due to its significance in daily business operations. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Performance:
OLAP processing times can vary from minutes to hours or days, depending on the type and volume of data. Data updates in OLAP databases use &lt;em&gt;batch processing&lt;/em&gt;, whereby you periodically process data in large batches and then upload each batch to the system all at once. This is crucial for integrating data from multiple sources and providing a comprehensive view of the organization.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OLTP processing times are measured in milliseconds or less. Updates happen in real time and are initiated by you or your users. &lt;em&gt;Stream processing&lt;/em&gt; is the most commonly used approach here: high-volume data moves in a continuous, incremental manner with the goal of low-latency processing.&lt;/p&gt;
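
&lt;p&gt;The difference in workload shape can be sketched with Python's built-in sqlite3 module. This is only a toy contrast: in practice the two workloads run on separate systems, whereas here a single in-memory database is used for brevity. The orders table and its rows are hypothetical.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP style: many small write transactions, each committed as soon as it happens.
for region, amount in [("Nairobi", 120.0), ("Mombasa", 80.0), ("Nairobi", 40.0)]:
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )

# OLAP style: one read-heavy query that aggregates the accumulated history.
report = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(report)  # [('Mombasa', 80.0), ('Nairobi', 160.0)]
conn.close()
```

&lt;p&gt;The inserts are short, frequent writes that must stay fast and available, while the report is a single read that scans and summarizes everything written so far. Separating the two systems keeps heavy analytical reads from slowing down day-to-day transactions.&lt;/p&gt;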

</description>
      <category>datascience</category>
      <category>dataengineering</category>
    </item>
  </channel>
</rss>
