<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Austin Oketch</title>
    <description>The latest articles on DEV Community by Austin Oketch (@austin_oketch).</description>
    <link>https://dev.to/austin_oketch</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3008248%2F52abee9d-32f7-4779-9a1f-18d240a20239.jpg</url>
      <title>DEV Community: Austin Oketch</title>
      <link>https://dev.to/austin_oketch</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/austin_oketch"/>
    <language>en</language>
    <item>
      <title>A Beginner's Guide to Python, APIs and Pandas</title>
      <dc:creator>Austin Oketch</dc:creator>
      <pubDate>Tue, 12 Aug 2025 05:34:10 +0000</pubDate>
      <link>https://dev.to/austin_oketch/a-beginners-guide-to-pythonapis-and-pandas-3j8d</link>
      <guid>https://dev.to/austin_oketch/a-beginners-guide-to-pythonapis-and-pandas-3j8d</guid>
      <description>&lt;p&gt;Ingesting, processing and analyzing data from external sources has become common in software development. This is mainly achieved via APIs. In this guide we'll walk through building a basic but powerful data ingestion script in Python.&lt;/p&gt;

&lt;p&gt;We'll obtain cryptocurrency pair prices from the publicly available Binance API and filter for the responses we're interested in. This will be achieved using two popular Python libraries: &lt;strong&gt;pandas&lt;/strong&gt; for data manipulation and &lt;strong&gt;requests&lt;/strong&gt; for handling HTTP communication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Complete Python Script&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
import requests

# Configuration: Define constants for maintainability
BASE_URL = 'https://api.binance.com'
TARGET_PAIRS = ['BTCUSDT','ETHBTC','ETHUSDT','SOLUSDT']

def get_latest_prices():
    """
    Fetches, parses, and filters price data from the Binance API.
    """
    # 1. Construct the full API endpoint URL
    endpoint = f'{BASE_URL}/api/v3/ticker/price'

    # 2. Execute the HTTP GET request
    response = requests.get(endpoint)
    # For production, add error handling: response.raise_for_status()

    # 3. Deserialize the JSON response into a Python object
    data = response.json()

    # 4. Load the raw data into a pandas DataFrame
    price_df = pd.DataFrame(data)

    # 5. Filter the DataFrame using boolean masking
    filtered_df = price_df[price_df['symbol'].isin(TARGET_PAIRS)]

    print(filtered_df)
    return filtered_df

# 6. Define the script's entry point
if __name__ == "__main__":
    get_latest_prices()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's break down the script step by step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Configuration and Structure&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BASE_URL = 'https://api.binance.com'
TARGET_PAIRS = ['BTCUSDT','ETHBTC','ETHUSDT','SOLUSDT']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We define BASE_URL and TARGET_PAIRS at the top to separate configuration from logic, making the code easier to read and update without touching the core function.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Interface with API using requests&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The requests library handles HTTP communication with the Binance API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;endpoint = f'{BASE_URL}/api/v3/ticker/price'
response = requests.get(endpoint)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;requests.get() packages the server response, containing the status code, headers and data payload, into a single Response object stored in the response variable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Deserializing the JSON Payload&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The response payload from most modern APIs is formatted as JSON (JavaScript Object Notation).&lt;/p&gt;

&lt;p&gt;Even though this is text-based and readable, it is not a native Python object. It therefore needs to be parsed, or "deserialized".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data = response.json()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Structuring Data with pandas&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The data variable is now a large list of dictionaries.&lt;br&gt;
We use pandas to transform this raw data into an optimized, tabular structure called a DataFrame.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;price_df = pd.DataFrame(data)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A DataFrame is an in-memory, two-dimensional table with labeled axes (rows and columns).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Filtering with Boolean Masking&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;filtered_df = price_df[price_df['symbol'].isin(TARGET_PAIRS)]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;price_df['symbol']:&lt;/strong&gt; First you select the symbol column of the DataFrame, which returns a pandas &lt;em&gt;Series&lt;/em&gt; object, a single column of the DataFrame.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;.isin(TARGET_PAIRS):&lt;/strong&gt; You then call the .isin() method on this Series. The method performs a fast element-wise membership check, returning a Series of Boolean values: True means the symbol in that row exists in the &lt;strong&gt;TARGET_PAIRS&lt;/strong&gt; list, and False means it does not.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;price_df[...]&lt;/strong&gt;: Finally you use this boolean Series as a mask to index the original price_df.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is known as boolean masking. It evaluates the mask and returns a new DataFrame containing only the rows where the mask value is True.&lt;/p&gt;
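&lt;p&gt;As a quick illustration, here is the same masking pattern on a tiny, made-up DataFrame (the symbols and prices below are invented for the example):&lt;/p&gt;

```python
import pandas as pd

# Toy DataFrame standing in for the API response (values are made up)
toy = pd.DataFrame({
    "symbol": ["BTCUSDT", "DOGEUSDT", "ETHUSDT"],
    "price": ["117000.00", "0.12", "4200.00"],
})

mask = toy["symbol"].isin(["BTCUSDT", "ETHUSDT"])  # Boolean Series
kept = toy[mask]  # keeps only the rows where the mask is True
print(kept)
```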

&lt;p&gt;We have now implemented a complete data ingestion pipeline. You can later add robust error handling and automation, e.g., using a scheduler like cron, to create a historical price log.&lt;/p&gt;
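&lt;p&gt;As a sketch of the error handling mentioned above (same Binance endpoint as in the script; the localhost URL in the note below is a made-up stand-in for a failing server), the request can be wrapped in a try/except so a network failure or a bad status code doesn't crash the script:&lt;/p&gt;

```python
import requests

def fetch_prices(base_url="https://api.binance.com", timeout=10):
    """Fetch ticker prices, returning an empty list on any request failure."""
    endpoint = f"{base_url}/api/v3/ticker/price"
    try:
        response = requests.get(endpoint, timeout=timeout)
        response.raise_for_status()  # raise on 4xx/5xx status codes
        return response.json()
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
        return []
```

&lt;p&gt;On failure the function degrades gracefully: for example, calling fetch_prices(base_url="http://localhost:9") against an unreachable server returns an empty list instead of raising.&lt;/p&gt;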

</description>
      <category>programming</category>
      <category>api</category>
      <category>pandas</category>
      <category>python</category>
    </item>
    <item>
      <title>Decoding Data Engineering: 15 Core Concepts That Power Modern Data Platforms</title>
      <dc:creator>Austin Oketch</dc:creator>
      <pubDate>Mon, 11 Aug 2025 06:04:32 +0000</pubDate>
      <link>https://dev.to/austin_oketch/decoding-data-engineering-15-core-concepts-that-power-modern-data-platforms-fi5</link>
      <guid>https://dev.to/austin_oketch/decoding-data-engineering-15-core-concepts-that-power-modern-data-platforms-fi5</guid>
      <description>&lt;p&gt;&lt;strong&gt;Data Engineering&lt;/strong&gt; is the invisible backbone of the data-driven ecosystem. Even though Data Scientists and Data Analysts get the spotlight for insightful dashboards and predictive models, data engineers are the key architects of that ecosystem, providing robust and reliable data pipelines.&lt;/p&gt;

&lt;p&gt;This architecture consists of a collection of powerful concepts which, combined, turn raw data into purposeful and insightful information.&lt;/p&gt;

&lt;p&gt;Some of these concepts include:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chapter 1: How Data is Extracted and Transformed&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data Engineering is basically about moving data from a source to a destination.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Stream and Batch Ingestion.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Both of these are fundamental steps in data ingestion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stream Ingestion&lt;/strong&gt; is the processing of data event by event, in near real time, as it is generated. This is very important for time-sensitive systems and applications like real-time analytics dashboards, log monitoring and fraud detection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Batch Ingestion&lt;/strong&gt; is the gathering, processing and moving of data in large, scheduled chunks. This is highly efficient for large volumes of data and for applications that don't need immediate results, like generating reports.&lt;/p&gt;
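&lt;p&gt;The difference can be sketched with a toy event list (the events below are invented): stream ingestion handles each event as it arrives, while batch ingestion collects events and processes them in one scheduled pass:&lt;/p&gt;

```python
# Invented events standing in for an incoming data feed
events = [{"id": i, "value": i * 10} for i in range(6)]

# Stream ingestion: process event by event, as each one is generated
def handle(event):
    return event["value"]

streamed = [handle(e) for e in events]  # one result per event, immediately

# Batch ingestion: accumulate everything, then process the chunk at once
batch_total = sum(e["value"] for e in events)
print(streamed, batch_total)
```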

&lt;p&gt;&lt;strong&gt;2. ELT and ETL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ELT (Extract, Load, Transform)&lt;/strong&gt; is the modern, cloud-native approach. You extract data and immediately load it into a scalable cloud data warehouse. The transformation logic is then executed inside the warehouse, typically with SQL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ETL (Extract, Transform, Load)&lt;/strong&gt; is the traditional approach. You extract data from a source, transform it into a clean, structured dataset on a separate processing server (e.g., a Spark cluster), and load it into a data warehouse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Change Data Capture(CDC)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Change Data Capture allows you to identify only the changes (insert, update, and delete operations) rather than re-ingesting a table with many rows merely to obtain a handful that changed.&lt;/p&gt;

&lt;p&gt;It does this by scanning database transaction logs (write-ahead logs); CDC tools such as Debezium record these operational changes. This makes the whole pipeline efficient and lets OLTP databases and OLAP systems synchronize in near real time.&lt;/p&gt;
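&lt;p&gt;The idea can be sketched with a toy change log (the rows and operations below are invented): replaying only the captured changes reproduces the current state of the table without a full re-ingest:&lt;/p&gt;

```python
# Toy CDC event stream: each entry is one captured change
changes = [
    {"op": "insert", "id": 1, "row": {"name": "Alice"}},
    {"op": "update", "id": 1, "row": {"name": "Alicia"}},
    {"op": "insert", "id": 2, "row": {"name": "Bob"}},
    {"op": "delete", "id": 2},
]

table = {}  # downstream replica, keyed by primary key
for change in changes:
    if change["op"] == "delete":
        table.pop(change["id"], None)
    else:  # insert and update both upsert the latest row image
        table[change["id"]] = change["row"]
print(table)
```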

&lt;p&gt;&lt;strong&gt;Chapter 2: Data Storage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. OLAP and OLTP&lt;/strong&gt;&lt;br&gt;
These two database system types were designed with different goals in mind.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OLTP (Online Transaction Processing)&lt;/strong&gt; systems, such as ATMs and aircraft reservation systems, are designed to process large amounts of reads and writes in a fast and reliable manner.&lt;/p&gt;

&lt;p&gt;Systems designed for complicated queries over vast amounts of historical data, such as a dashboard displaying sales trends over time, are known as &lt;strong&gt;OLAP (Online Analytical Processing)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Columnar and Row-based Storage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Columnar Storage&lt;/strong&gt; stores all the values of a single column together. This makes analytical queries incredibly fast (used by OLAP systems like Snowflake and Redshift).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Row-based Storage&lt;/strong&gt; stores the values belonging to a single record together in a row (used by OLTP databases like PostgreSQL and MariaDB).&lt;/p&gt;
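&lt;p&gt;The layout difference can be sketched with plain Python structures (toy data): an analytical aggregate only has to scan one column in the columnar layout, instead of touching every record:&lt;/p&gt;

```python
# Row-based layout: each record's values are stored together
rows = [
    {"symbol": "BTCUSDT", "price": 1.0},
    {"symbol": "ETHUSDT", "price": 2.0},
]

# Columnar layout: all values of one column are stored together
columns = {
    "symbol": ["BTCUSDT", "ETHUSDT"],
    "price": [1.0, 2.0],
}

# OLAP-style aggregate: the columnar layout reads just the price column
total = sum(columns["price"])
print(total)
```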

&lt;p&gt;&lt;strong&gt;6. Partitioning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the splitting of large tables into smaller, more manageable physical chunks based on a key. This is done to improve the performance, scalability and manageability of a database.&lt;/p&gt;
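&lt;p&gt;As a minimal sketch (with invented records), partitioning by a year-month key splits one large collection into smaller chunks that can be scanned independently:&lt;/p&gt;

```python
# Invented records for illustration
records = [
    {"date": "2025-01-05", "price": 100},
    {"date": "2025-01-20", "price": 110},
    {"date": "2025-02-03", "price": 95},
]

partitions = {}
for rec in records:
    key = rec["date"][:7]  # partition key: year-month
    partitions.setdefault(key, []).append(rec)
print(sorted(partitions))
```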

&lt;p&gt;&lt;strong&gt;Chapter 3: Building Robust and Reliable Data Pipelines&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Idempotency&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;An operation is idempotent if running it multiple times produces the same outcome as running it once, e.g., in a system that processes payment transactions.&lt;/p&gt;

&lt;p&gt;If the system fails mid-run and restarts, some payments may be reprocessed. An idempotent design ensures a customer is not charged twice, by checking whether the transaction ID already exists before inserting a new record.&lt;/p&gt;
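&lt;p&gt;A minimal sketch of that check, assuming an in-memory dictionary stands in for the payments table:&lt;/p&gt;

```python
ledger = {}  # stand-in for a payments table, keyed by transaction ID

def record_payment(txn_id, amount):
    """Insert the payment only if this transaction ID is not already recorded."""
    if txn_id in ledger:
        return False  # duplicate delivery after a restart: no second charge
    ledger[txn_id] = amount
    return True

record_payment("txn-001", 500)
record_payment("txn-001", 500)  # reprocessed after a failure: safely ignored
print(ledger)
```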

&lt;p&gt;&lt;strong&gt;8. Retry Logic and Dead Letter Queues (DLQs)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retry Logic&lt;/strong&gt; is a fallback mechanism that automatically retries an operation after a transient error, in order to handle temporary operational failures.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;Dead Letter Queue&lt;/strong&gt; serves as a holding area for messages that cannot be delivered or processed successfully, preventing failed messages from blocking the queue and enabling their later examination.&lt;/p&gt;
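&lt;p&gt;Both ideas fit in a few lines. Here is a hedged sketch; a real system would add a backoff delay between attempts and use a message broker's DLQ rather than a plain list:&lt;/p&gt;

```python
def process_with_retries(message, handler, max_retries=3, dead_letters=None):
    """Retry the handler a few times; park the message in the DLQ if all fail."""
    dead_letters = dead_letters if dead_letters is not None else []
    for attempt in range(max_retries):
        try:
            return handler(message)
        except Exception:
            continue  # transient error: try again
    dead_letters.append(message)  # retries exhausted: send to the DLQ
    return None

dlq = []
process_with_retries("bad message", lambda m: 1 / 0, dead_letters=dlq)
print(dlq)  # the failing message ends up in the dead letter queue
```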

&lt;p&gt;&lt;strong&gt;9. Backfilling and Reprocessing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backfilling&lt;/strong&gt; refers to processing historical data through a pipeline to append or update past information.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reprocessing&lt;/strong&gt; is the process of running a pipeline or a portion of it again in order to add new data, apply changes, or fix mistakes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chapter 4: Workflow Orchestration and Stream Processing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;10. DAGs and Workflow Orchestration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The data pipeline's blueprint is called a DAG (Directed Acyclic Graph). It specifies tasks (nodes) and dependencies (directed edges), making sure tasks execute in the right order and never get trapped in cyclic (infinite) loops; orchestrators like Apache Airflow are built around this model.&lt;/p&gt;
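&lt;p&gt;Python's standard library can illustrate this directly. The sketch below (with made-up task names) declares dependencies and resolves a valid execution order, which is essentially what an orchestrator does at scheduling time:&lt;/p&gt;

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each task maps to the set of tasks it depends on (directed edges)
dag = {
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)
```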

&lt;p&gt;&lt;strong&gt;11. Windowing in Streaming&lt;/strong&gt;&lt;br&gt;
This allows computations on specified intervals by dividing a continuous data stream into manageable pieces known as windows.&lt;/p&gt;

&lt;p&gt;Common types include:&lt;br&gt;
&lt;strong&gt;Tumbling Window:&lt;/strong&gt; Fixed-size, non-overlapping windows.&lt;br&gt;
&lt;strong&gt;Sliding Window:&lt;/strong&gt; Fixed-size, overlapping windows.&lt;/p&gt;
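&lt;p&gt;A tumbling window is easy to sketch: assign each event timestamp to a fixed, non-overlapping bucket (the timestamps below are invented, in seconds):&lt;/p&gt;

```python
from collections import defaultdict

def tumbling_window_counts(timestamps, window_seconds=60):
    """Count events per fixed, non-overlapping window of the given size."""
    counts = defaultdict(int)
    for ts in timestamps:
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

print(tumbling_window_counts([0, 30, 61, 125]))
```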

&lt;p&gt;&lt;strong&gt;Chapter 5: Advanced System Design and Governance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;12. CAP Theorem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This fundamental law of distributed systems states that a distributed data store can only offer two of the following three guarantees at once:&lt;br&gt;
&lt;strong&gt;Availability:&lt;/strong&gt; Every database request is answered, even if it isn't the most recent data.&lt;br&gt;
&lt;strong&gt;Partition Tolerance:&lt;/strong&gt; The system keeps running even when there are network partitions (communication breakdowns between nodes).&lt;br&gt;
&lt;strong&gt;Consistency:&lt;/strong&gt; Every node in the system sees the same data at the same time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;13. Data Governance&lt;/strong&gt;&lt;br&gt;
This is the framework of policies, rules and standards for managing data.&lt;/p&gt;

&lt;p&gt;It is involved in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data Quality: Data completeness and accuracy.&lt;/li&gt;
&lt;li&gt;Data Lineage: Data origin.&lt;/li&gt;
&lt;li&gt;Access Control: Entities allowed to view and utilize data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;14. Time Travel and Data Versioning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These are techniques that allow users to access and restore previous states of data, facilitating historical analysis and error recovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;15. Distributed Processing Concepts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the practice of dividing a computational task among multiple computers (nodes) in a network, allowing large, complex problems to be handled by the combined power of many machines.&lt;/p&gt;
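&lt;p&gt;The divide-and-combine idea can be sketched in miniature: split the work into chunks (one per "node"), compute partial results independently, then combine them, which is the essence of the map-reduce pattern:&lt;/p&gt;

```python
data = list(range(100))  # the full computational task

# Split the work into 4 chunks, one per worker "node"
chunks = [data[i:i + 25] for i in range(0, 100, 25)]

partials = [sum(chunk) for chunk in chunks]  # each node computes locally
total = sum(partials)                        # combine the partial results
print(total)
```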

</description>
      <category>dataengineering</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>Data Platform for Analyzing Kenya’s Food Prices and Inflation Trends.</title>
      <dc:creator>Austin Oketch</dc:creator>
      <pubDate>Sun, 10 Aug 2025 10:34:10 +0000</pubDate>
      <link>https://dev.to/austin_oketch/data-platform-for-analyzing-kenyas-food-prices-and-inflation-trends-4o6n</link>
      <guid>https://dev.to/austin_oketch/data-platform-for-analyzing-kenyas-food-prices-and-inflation-trends-4o6n</guid>
      <description>&lt;p&gt;In this article we will turn a raw &lt;a href="https://data.humdata.org/dataset/wfp-food-prices-for-kenya" rel="noopener noreferrer"&gt;dataset&lt;/a&gt; of Kenyan food prices, covering different regions across the country over the years, into an analytics-ready platform. We'll achieve this by building a complete Extraction, Transformation and Loading (ETL) pipeline.&lt;/p&gt;

&lt;p&gt;We'll go from a &lt;a href="https://data.humdata.org/dataset/wfp-food-prices-for-kenya" rel="noopener noreferrer"&gt;CSV&lt;/a&gt; file to a clean, query-optimized database and a beautiful, interactive Grafana dashboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PROJECT ARCHITECTURE&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our data pipeline for analysis will look like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Source&lt;/strong&gt;: A World Food Programme CSV file containing Kenya Food Prices in different Kenyan provinces and regions since 2006.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ETL Engine&lt;/strong&gt;: A Python script using Pandas for the Extraction, Transformation and Loading of the data into the designated database for analysis.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Warehouse&lt;/strong&gt;: A PostgreSQL database modeled with a Star Schema.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visualization Layer&lt;/strong&gt;: A Grafana dashboard that queries the database to get insights.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We use a &lt;strong&gt;Star Schema&lt;/strong&gt;: a central fact table holding quantitative measures like price, surrounded by dimension tables holding descriptive attributes like location and commodity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Python Environment and Database Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Set Up the Python Virtual Environment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a virtual environment and install necessary libraries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m venv venv
source venv/bin/activate # For macOS/Linux
# .\venv\Scripts\activate # For Windows

# Create a requirements.txt file
touch requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add the following to requirements.txt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pandas
SQLAlchemy
psycopg2-binary
python-dotenv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install -r requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Create the Database and Tables&lt;/strong&gt;&lt;br&gt;
In your PostgreSQL instance, create a database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE DATABASE kenya_food_prices;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Connect to the created database and use this SQL script to set up a star schema.&lt;br&gt;
The UNIQUE constraints in the SQL script are key to preventing duplicate dimension data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE dim_date (
    date_id SERIAL PRIMARY KEY,
    date_value DATE NOT NULL UNIQUE,
    year INTEGER NOT NULL,
    month INTEGER NOT NULL,
    day INTEGER NOT NULL
);

CREATE TABLE dim_location (
    location_id SERIAL PRIMARY KEY,
    admin1 VARCHAR(255),
    admin2 VARCHAR(255),
    market VARCHAR(255),
    UNIQUE (admin1, admin2, market)
);

CREATE TABLE dim_commodity (
    commodity_id SERIAL PRIMARY KEY,
    category VARCHAR(255),
    commodity_name VARCHAR(255),
    unit VARCHAR(255),
    UNIQUE (category, commodity_name, unit)
);

CREATE TABLE dim_market_type (
    market_type_id SERIAL PRIMARY KEY,
    market_type VARCHAR(255) UNIQUE
);

CREATE TABLE fact_food_prices (
    price_id SERIAL PRIMARY KEY,
    date_id INTEGER REFERENCES dim_date(date_id),
    location_id INTEGER REFERENCES dim_location(location_id),
    commodity_id INTEGER REFERENCES dim_commodity(commodity_id),
    market_type_id INTEGER REFERENCES dim_market_type(market_type_id),
    price_kes NUMERIC(10, 2),
    price_usd NUMERIC(10, 2),
    UNIQUE (date_id, location_id, commodity_id, market_type_id)
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Storing Credentials Securely&lt;/strong&gt;&lt;br&gt;
Create a .env file in the project root to safely store credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DB_HOST=localhost
DB_NAME=kenya_food_prices
DB_USER=your_postgres_user
DB_PASSWORD=your_postgres_password
DB_PORT=5432
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will use &lt;strong&gt;python-dotenv&lt;/strong&gt; to load the credentials securely in our Python script.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Python ETL Script&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Part 1: Extraction and Transformation&lt;/strong&gt;&lt;br&gt;
First we use Pandas to read and extract the CSV.&lt;br&gt;
The CSV file has an extra header row we need to skip and messy column names.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import pandas as pd
import os
from dotenv import load_dotenv
from sqlalchemy import create_engine

# Load environment variables
load_dotenv()
db_host = os.getenv("DB_HOST")
# ... (load other db variables)
conn_string = f'postgresql://{db_user}:{db_password}@{db_host}/{db_name}'
engine = create_engine(conn_string)

def extraction_and_transformation():
    # Extraction: Read CSV, skipping the second row which is a comment
    df = pd.read_csv('wfp_food_prices_ken.csv', skiprows=[1])

    # Transformation
    df = df.dropna()
    df.columns = df.columns.str.strip() # Clean column names
    df.drop_duplicates(inplace=True)
    df.reset_index(drop=True, inplace=True)

    print("Data extracted and transformed successfully!")
    return df
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Part 2: Populating the Database&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We make our script idempotent: it can run multiple times without creating duplicate data.&lt;/p&gt;

&lt;p&gt;We'll start with the &lt;strong&gt;location&lt;/strong&gt; dimension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def load_location_dimension(df):
    print("Loading location dimension...")

    # 1. Select unique locations from the source data
    location_cols = ['admin1', 'admin2', 'market']
    locations = df[location_cols].drop_duplicates()

    # 2. Get existing locations from the database
    existing = pd.read_sql("SELECT admin1, admin2, market FROM dim_location", engine)

    # 3. Find locations that are in our source but NOT in the database
    new_locations = locations.merge(existing, on=location_cols, how='left', indicator=True)
    new_locations = new_locations[new_locations['_merge'] == 'left_only'].drop('_merge', axis=1)

    # 4. If there are new locations, append them
    if not new_locations.empty:
        new_locations.to_sql('dim_location', engine, if_exists='append', index=False)
        print(f"Inserted {len(new_locations)} new location records!")
    else:
        print("✔️ No new location records to add.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use the same logic for the other dimensions!&lt;/strong&gt;&lt;/p&gt;
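&lt;p&gt;To avoid repeating the pattern, the "find what's new" step can be factored into a helper. This is a hedged sketch: find_new_rows is a name introduced here, and the to_sql append step stays as in the function above. The toy rows below are invented for the example:&lt;/p&gt;

```python
import pandas as pd

def find_new_rows(source_df, existing_df, key_cols):
    """Return rows present in the source but not yet in the database table."""
    merged = source_df[key_cols].drop_duplicates().merge(
        existing_df, on=key_cols, how="left", indicator=True)
    return merged[merged["_merge"] == "left_only"].drop("_merge", axis=1)

# Toy example with the commodity dimension's key columns
source = pd.DataFrame({
    "category": ["cereals", "pulses"],
    "commodity_name": ["Maize (white)", "Beans"],
    "unit": ["KG", "KG"],
})
existing = pd.DataFrame({
    "category": ["cereals"],
    "commodity_name": ["Maize (white)"],
    "unit": ["KG"],
})
new_rows = find_new_rows(source, existing, ["category", "commodity_name", "unit"])
print(new_rows)
```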

&lt;p&gt;&lt;strong&gt;Loading the Fact Table&lt;/strong&gt;&lt;br&gt;
We replace the descriptive text (e.g., Beans, Nairobi) with the corresponding foreign keys (location_id, commodity_id) from our dimension tables.&lt;/p&gt;

&lt;p&gt;We do this via a series of pd.merge calls, the Pandas equivalent of a SQL JOIN.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def load_fact_table(df):
    print("Loading fact table...")

    # Get dimension tables from the DB, now with their generated IDs
    date_dim = pd.read_sql("SELECT date_id, date_value FROM dim_date", engine)
    location_dim = pd.read_sql("SELECT location_id, admin1, admin2, market FROM dim_location", engine)
    commodity_dim = pd.read_sql("SELECT commodity_id, category, commodity_name, unit FROM dim_commodity", engine)
    # ... and so on for other dimensions

    # Prep the dataframe for merging
    fact_df = df.rename(columns={'commodity': 'commodity_name', 'price': 'price_kes'})
    fact_df['date_value'] = pd.to_datetime(df['date']).dt.date

    # Merge with dimensions to get foreign keys
    fact_df = fact_df.merge(date_dim, on='date_value', how='left')
    fact_df = fact_df.merge(location_dim, on=['admin1', 'admin2', 'market'], how='left')
    fact_df = fact_df.merge(commodity_dim, on=['category', 'commodity_name', 'unit'], how='left')
    # ... merge with other dimensions

    # Select only the columns we need for the fact table
    fact_columns = ['date_id', 'location_id', 'commodity_id', 'market_type_id', 'price_kes', 'price_usd']
    fact_table = fact_df[fact_columns].dropna() # Drop rows where a join failed

    # Load into the database (with a similar idempotency check)
    fact_table.to_sql('fact_food_prices', engine, if_exists='append', index=False)

    print("Fact table loaded successfully!")

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Visualizing Data with Grafana&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After cleaning and structuring our data, it is ready for Grafana.&lt;br&gt;
1. &lt;strong&gt;Connect PostgreSQL to Grafana&lt;/strong&gt;: In Grafana, go to Connections &amp;gt; Add new connection, select PostgreSQL, and enter your database credentials.&lt;/p&gt;

&lt;p&gt;2. &lt;strong&gt;Create a New Dashboard&lt;/strong&gt;: Add a panel, switch to the code editor, and enter SQL queries.&lt;/p&gt;

&lt;p&gt;Here are some queries to get you started.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query 1: Price of a Commodity Over Time&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Panel Type: Time Series&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT
  d.date_value AS "time",      -- Alias for Grafana's time axis
  f.price_kes,                 -- The value to plot
  c.commodity_name AS "metric" -- The name of the series
FROM
  fact_food_prices AS f
JOIN
  dim_date AS d ON f.date_id = d.date_id
JOIN
  dim_commodity AS c ON f.commodity_id = c.commodity_id
JOIN
  dim_location AS l ON f.location_id = l.location_id
WHERE
  $__timeFilter(d.date_value) AND -- Use Grafana's time picker
  l.market = 'Nairobi' AND
  c.commodity_name = 'Maize (white)'
ORDER BY
  d.date_value;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Query 2: Average Price Comparison by Commodity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Panel Type: Bar Chart&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT
  c.commodity_name AS "Commodity",
  AVG(f.price_kes) AS "Average Price (KES)"
FROM
  fact_food_prices AS f
JOIN
  dim_commodity AS c ON f.commodity_id = c.commodity_id
JOIN
  dim_date AS d ON f.date_id = d.date_id
WHERE
  $__timeFilter(d.date_value)
GROUP BY
  c.commodity_name
ORDER BY
  "Average Price (KES)" DESC
LIMIT 15;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've taken raw data, applied ETL principles, modeled it for analytics, and built an interactive dashboard.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>Building a Gold (XAUUSD) Trend Tracker with Python and SQLite</title>
      <dc:creator>Austin Oketch</dc:creator>
      <pubDate>Mon, 19 May 2025 22:25:01 +0000</pubDate>
      <link>https://dev.to/austin_oketch/building-a-gold-xauusd-trend-tracker-with-python-and-sqlite-3l95</link>
      <guid>https://dev.to/austin_oketch/building-a-gold-xauusd-trend-tracker-with-python-and-sqlite-3l95</guid>
      <description>&lt;p&gt;&lt;strong&gt;XAUUSD&lt;/strong&gt; is a symbol used in Forex trading to indicate the number of US dollars needed to buy one ounce of gold.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Other symbols such as &lt;strong&gt;EURUSD&lt;/strong&gt; indicate the exchange rate of national currency pairs, while &lt;strong&gt;XAUUSD&lt;/strong&gt; compares the price of the precious metal with the US dollar.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is possible to buy gold as a physical commodity at banks or from dealers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Gold is often used by governments that have a large gold reserve to protect the value of their currency; that’s why it is traded on the Forex market.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I chose XAUUSD due to its high trading volume, together with the extensive amount of historical and real-time data available.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is also not specific to any one nation, economy, or business; rather, it is globally essential.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is also intriguing to examine how the price of gold responds to an array of events, including inflation, interest rates, wars, and financial crises.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Objectives
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;When I was implementing this project, what I had in mind was a simple Extraction, Transformation and Load (ETL) pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I am using the &lt;a href="https://twelvedata.com" rel="noopener noreferrer"&gt;TwelveData&lt;/a&gt; forex API for the &lt;strong&gt;Extraction&lt;/strong&gt; of raw data from the financial markets in JSON format.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;After &lt;strong&gt;Extraction&lt;/strong&gt; we begin the &lt;strong&gt;Transformation&lt;/strong&gt; of the raw data. This produces a clean dataset for analysis and alerts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finally, &lt;strong&gt;Loading&lt;/strong&gt; stores the transformed dataset in a database for presentation or further processing.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step by Step Build:
&lt;/h2&gt;

&lt;p&gt;Get an API_KEY after setting up an account on &lt;a href="https://twelvedata.com/" rel="noopener noreferrer"&gt;TwelveData&lt;/a&gt;; don't worry, there is a free plan option.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;.gitignore file:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;.env file:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;API_KEY='xxxxxxxxxxxxxxxxxxxxxx'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Add the API_KEY to your .env file - &lt;strong&gt;remember to add the .env file in your .gitignore file.&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A &lt;strong&gt;.gitignore&lt;/strong&gt; file specifies intentionally untracked files that Git should ignore.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In case you are using version control, this will prevent the pushing of the sensitive API_KEY to GitHub. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
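&lt;p&gt;As a minimal sketch, this is how the key can be read from the environment once the .env file has been loaded (the fallback "demo-key" value below is purely illustrative):&lt;/p&gt;

```python
import os

def require_api_key(name="API_KEY"):
    """Return the named key from the environment, or fail with a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(name + " is not set - check your .env file")
    return value

# Stand-in for load_dotenv(): in the real script the key comes from .env.
os.environ.setdefault("API_KEY", "demo-key")
key = require_api_key()
```

Failing fast here gives a readable error instead of sending an empty key to the API.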

&lt;p&gt;&lt;strong&gt;extract.py:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here, the raw data is extracted via the API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
import os
from dotenv import load_dotenv

load_dotenv()

GOLD_API_KEY = os.getenv('API_KEY')

def fetch_xauusd_data():
    url = f"https://api.twelvedata.com/time_series?symbol=XAU/USD&amp;amp;interval=5min&amp;amp;apikey={GOLD_API_KEY}"
    return requests.get(url).json()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Requests is a Python HTTP library. It is commonly used for interacting with web services and APIs, and for web scraping.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In this context we are using it to send GET requests to retrieve data.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import os
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The os module provides functions for interacting with the operating system. Here it is used to read environment variables.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from dotenv import load_dotenv
load_dotenv()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;These lines import the load_dotenv function from the dotenv library.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It loads the environment variables defined in the .env file.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
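&lt;p&gt;To see roughly what load_dotenv does under the hood, here is a simplified stdlib-only sketch (the DEMO_API_KEY value is made up for illustration):&lt;/p&gt;

```python
import os

def load_env_lines(lines):
    """Minimal stand-in for load_dotenv(): put KEY=VALUE pairs into os.environ."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # Variables already present in the real environment are not overwritten,
        # matching load_dotenv()'s default behaviour.
        os.environ.setdefault(key.strip(), value.strip().strip("'\""))

load_env_lines(["# demo .env contents", "DEMO_API_KEY='xxxx'"])
```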

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GOLD_API_KEY = os.getenv('API_KEY')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;This assigns the value of the environment variable "API_KEY" to the variable GOLD_API_KEY.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def fetch_xauusd_data():
    url = f"https://api.twelvedata.com/time_series?symbol=XAU/USD&amp;amp;interval=5min&amp;amp;apikey={GOLD_API_KEY}"
    return requests.get(url).json()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;This function connects to a service that provides financial data, the TwelveData API.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It requests XAU/USD (gold) price data at five-minute intervals.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The request is authenticated with the API key stored in the GOLD_API_KEY variable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The response comes back in JSON format, which has a structure similar to a Python dictionary.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
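&lt;p&gt;The exact response fields are defined by the TwelveData API; the payload below is an assumed, simplified shape based on the 'values' key the pipeline checks for later, just to show how the JSON can be parsed:&lt;/p&gt;

```python
import json

# Illustrative (assumed) shape of a TwelveData time_series response.
raw = json.loads("""
{
  "meta": {"symbol": "XAU/USD", "interval": "5min"},
  "values": [
    {"datetime": "2025-08-12 05:30:00", "close": "2412.50"},
    {"datetime": "2025-08-12 05:25:00", "close": "2411.80"}
  ]
}
""")

# Same guard used in main.py: only proceed if the payload has price rows.
if "values" in raw:
    closes = [float(row["close"]) for row in raw["values"]]
```

Note that the prices arrive as strings, which is why the transform step converts them to float.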

&lt;p&gt;&lt;strong&gt;transform.py:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def transform_data(df):
    df['close'] = df['close'].astype(float)
    df['SMA1O'] = df['close'].rolling(window=10).mean()
    df['candle'] = df.apply(lambda row: 'Bullish' if row['close'] &amp;lt; row['SMA1O'] else 'Bearish', axis=1)
    return df
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['close'] = df['close'].astype(float)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The API sends the price data as strings.&lt;/li&gt;
&lt;li&gt;This line converts the 'close' column of the response to float.&lt;/li&gt;
&lt;li&gt;This makes it possible to perform calculations, like averages, on the column.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['SMA1O'] = df['close'].rolling(window=10).mean()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;This line creates a new column called 'SMA10'.&lt;/li&gt;
&lt;li&gt;It computes the average of the last 10 'close' prices for every row.
&lt;/li&gt;
&lt;/ul&gt;
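&lt;p&gt;To make the rolling mean concrete, here is a pure-Python sketch of what rolling(window=10).mean() computes (shown with a window of 3 for brevity):&lt;/p&gt;

```python
def rolling_mean(values, window):
    """Pure-Python version of pandas' rolling(window).mean(): each position
    gets the mean of the last `window` values, or None (pandas emits NaN)
    until enough values have been seen."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i + 1 - window) : i + 1]
        if len(chunk) == window:
            out.append(sum(chunk) / window)
        else:
            out.append(None)  # not enough history yet
    return out

rolling_mean([1.0, 2.0, 3.0, 4.0], 3)  # [None, None, 2.0, 3.0]
```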

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df['candle'] = df.apply(lambda row: 'Bullish' if row['close'] &amp;lt; row['SMA1O'] else 'Bearish', axis=1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;This line adds a new column called 'candle' that labels the market condition based on a simple trend rule:&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the closing price is above the 10-period moving average (SMA10) → label it 'Bullish'.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the closing price is below the SMA10 → label it 'Bearish'.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;load.py:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import sqlite3

def load_data(df, db_name='xauusd_data.db'):
    conn = sqlite3.connect(db_name)
    df.to_sql('xauusd_prices', conn, if_exists='append', index=False)
    conn.close()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import sqlite3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def load_data(df, db_name='xauusd_data.db'):
    conn = sqlite3.connect(db_name)
    df.to_sql('xauusd_prices', conn, if_exists='append', index=False)
    conn.close()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The load_data function takes two inputs: the transformed DataFrame, df, and an optional name for the SQLite database file.&lt;/li&gt;
&lt;li&gt;After connecting to the SQLite file, the DataFrame is saved to the 'xauusd_prices' table.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;if_exists='append'&lt;/code&gt; - new rows are added to the table instead of overwriting it.&lt;br&gt;
&lt;code&gt;index=False&lt;/code&gt; - don't write the DataFrame's index as a column.&lt;/p&gt;
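&lt;p&gt;The append behaviour can be demonstrated with plain sqlite3 and an in-memory database (no pandas needed; the table layout and values below are illustrative):&lt;/p&gt;

```python
import sqlite3

# In-memory database stands in for xauusd_data.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xauusd_prices (datetime TEXT, close REAL)")

# Two separate "pipeline runs": append semantics means rows accumulate.
for run in range(2):
    conn.execute("INSERT INTO xauusd_prices VALUES (?, ?)",
                 ("2025-08-12 05:30:00", 2412.5))
    conn.commit()

count = conn.execute("SELECT COUNT(*) FROM xauusd_prices").fetchone()[0]
conn.close()
```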

&lt;p&gt;Now to combine these scripts into one - &lt;strong&gt;main.py&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from extract import fetch_xauusd_data
from transform import transform_data
from load import load_data
import pandas as pd

def run_etl():
    print("Starting ETL pipeline...")


    print("Starting data extraction...")
    raw_data = fetch_xauusd_data()

    if 'values' not in raw_data:
        print("No data found. Check API configuration or usage!")
        return


    print("Starting data transformation...")
    df = pd.DataFrame(raw_data['values'])
    df = transform_data(df)


    print("Starting data loading...")
    load_data(df)

    print("ETL pipeline finished successfully!")

if __name__ == "__main__":
    run_etl()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;I combined all of this into a single main.py file, which coordinates the entire ETL pipeline:&lt;/li&gt;
&lt;li&gt;Retrieving pricing information for gold (XAU/USD) via the API&lt;/li&gt;
&lt;li&gt;Transforming it with Pandas to compute moving averages and label candles&lt;/li&gt;
&lt;li&gt;Saving the output to a SQLite database&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>dataengineering</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>Introduction to Python Programming Language</title>
      <dc:creator>Austin Oketch</dc:creator>
      <pubDate>Wed, 16 Apr 2025 09:37:40 +0000</pubDate>
      <link>https://dev.to/austin_oketch/introduction-to-python-programming-language-5a6e</link>
      <guid>https://dev.to/austin_oketch/introduction-to-python-programming-language-5a6e</guid>
      <description>&lt;p&gt;Python's popularity as a programming language has steadily increased over the years since its introduction in 1991 by Guido van Rossum. Python is quite popular due to its easy-to-read nature.This can be seen with the indentation blocks that make the code to be visually appealing for easy understanding by collaborators on a project.A beginner can easily understand the code with minimal programming experience.Another reason is due to its versatility as it provides utility in most software engineering use cases ranging from software development to machine learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python use cases
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;These are some of the applications of python in the world of software engineering:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;a) &lt;strong&gt;Data Analysis and Science&lt;/strong&gt;. This includes data cleaning and manipulation, statistical analysis and data visualization.&lt;/p&gt;

&lt;p&gt;b) &lt;strong&gt;Software Development&lt;/strong&gt;. Python, together with frameworks like Flask and Django, can be used to develop web applications.&lt;/p&gt;

&lt;p&gt;c) &lt;strong&gt;Scripting and Automation&lt;/strong&gt;. Python can also be used to write scripts that automate tasks, e.g. software testing and data processing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python compared to other programming languages
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python usually uses new lines to complete a statement; other programming languages like JavaScript use semicolons.&lt;/li&gt;
&lt;li&gt;Python is flexible: it supports procedural, functional and object-oriented programming styles.&lt;/li&gt;
&lt;li&gt;Python was designed with readability in mind, so its code is often easier to understand than that of other programming languages.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Getting started with Python&lt;/strong&gt;&lt;br&gt;
Python 3 is the version in mainstream use today.&lt;/p&gt;

&lt;p&gt;Most macOS and Linux systems come with Python installed; on Windows you usually install it yourself.&lt;/p&gt;

&lt;p&gt;To check if you have Python installed on your Windows PC, open the command-line application (cmd.exe) and enter the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;C:\Users\username&amp;gt;python --version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For macOS or Linux systems, open a terminal and enter the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python --version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If Python is not installed on your PC, you can download it from &lt;br&gt;
&lt;a href="https://www.python.org/" rel="noopener noreferrer"&gt;https://www.python.org/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to access the Python interactive shell, enter the following command in your system's terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# username @ test_computer in ~ [20:27:34] 
$ python3
Python 3.11.2 (main, Nov 30 2024, 21:22:50) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
&amp;gt;&amp;gt;&amp;gt; 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here you can write and execute Python commands directly.&lt;/p&gt;

&lt;p&gt;Programmers use code editors, e.g. Visual Studio Code, and Integrated Development Environments (IDEs) like PyCharm for software development and production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Python Syntax Examples
&lt;/h2&gt;

&lt;p&gt;Python syntax is often nicknamed "executable pseudocode".&lt;br&gt;
An example shows why:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print("Hello world")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This piece of code does exactly what you would expect: it prints "Hello world" to the console.&lt;/p&gt;

&lt;p&gt;The .py extension is the standard format for writing and executing Python programs.&lt;/p&gt;

&lt;p&gt;To run a Python program in the terminal, enter the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Variables
&lt;/h2&gt;

&lt;p&gt;Variables are containers for storing data values. In Python a variable is created the moment you first assign a value to it. Variables do not need to be declared with a type, and the type can even change after being set.&lt;br&gt;
Variables can store data of different types, and different types can do different things.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;age = 5 # age is of type int
age = 'five' # age is now a str
print(age) # output will now be five
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rules for Python variables:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A variable name must start with a letter or the underscore character&lt;/li&gt;
&lt;li&gt;A variable name cannot start with a number&lt;/li&gt;
&lt;li&gt;A variable name can only contain alpha-numeric characters and underscores (A-z, 0-9, and _ )&lt;/li&gt;
&lt;li&gt;Variable names are case-sensitive (age, Age and AGE are three different variables)&lt;/li&gt;
&lt;li&gt;A variable name cannot be any of the Python keywords.&lt;/li&gt;
&lt;/ul&gt;
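&lt;p&gt;As a quick sketch, the rules above can be checked programmatically with the standard library:&lt;/p&gt;

```python
import keyword

def is_valid_variable_name(name):
    """Apply the rules above: a valid identifier that is not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

checks = {
    "age": True,      # letters only
    "_count": True,   # leading underscore is fine
    "2fast": False,   # cannot start with a number
    "my-var": False,  # hyphens are not allowed
    "class": False,   # Python keyword
}
results = {name: is_valid_variable_name(name) for name in checks}
```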

&lt;p&gt;Python has the following data types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Text type: String&lt;/strong&gt;&lt;br&gt;
Strings are used to store text data and are immutable (cannot be changed after they are created)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;name = "Kamau"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
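&lt;p&gt;A short example of what immutability means in practice:&lt;/p&gt;

```python
name = "Kamau"
shout = name.upper()   # returns a NEW string; name itself is untouched

# In-place modification fails because strings are immutable:
try:
    name[0] = "k"
    immutable = False
except TypeError:
    immutable = True
```

To "change" a string you rebind the variable to a new string instead.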



&lt;p&gt;&lt;strong&gt;Numeric Types: int, float&lt;/strong&gt;&lt;br&gt;
Ints are whole numbers without a decimal point, while floats have one&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;buying_price = 250
vat = 45.60

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Sequence Types: list, tuple, range&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Lists are mutable, allowing you to modify their content, while tuples are immutable, meaning you can’t change them after creation.&lt;br&gt;
range produces a sequence of numbers, often used in loops.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fruits = ["apple", "banana", "cherry"]  # list
coordinates = (10, 20)                  # tuple
for i in range(3):
    print(fruits[i])                   # prints each fruit

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Mapping Type: dict&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Dictionaries are used to store values in key value pairs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;person = {"name": "Alice", "age": 30}
print(person["name"])  # Outputs: Alice

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Set Types: set&lt;/strong&gt;&lt;br&gt;
Sets are used to store multiple items in a single variable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;unique_numbers = {1, 2, 3, 3, 2}
print(unique_numbers)  # Outputs: {1, 2, 3}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Boolean Type: bool&lt;/strong&gt;&lt;br&gt;
Boolean represents one of two values: True or False&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;is_sunny = True
if is_sunny:
    print("Wear sunglasses!")  # Will print

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Python Operators
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Arithmetic Operators&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Arithmetic operators are used with numeric values to perform common mathematical operations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;+    Addition    (x + y) &lt;/li&gt;
&lt;li&gt;-    Subtraction (x - y) &lt;/li&gt;
&lt;li&gt;*    Multiplication  (x * y) &lt;/li&gt;
&lt;li&gt;/    Division    (x / y) &lt;/li&gt;
&lt;li&gt;%    Modulus         (x % y) &lt;/li&gt;
&lt;li&gt;**   Exponentiation  (x ** y)
&lt;/li&gt;
&lt;li&gt;//   Floor division  (x // y)
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;print(20+2) # answer = 22
print(20-2) # answer = 18
print(8*6) # answer = 48
print(18/2) # answer = 9.0 (/ always returns a float)
print(2 ** 6)  # answer is 64 (exponent)
print(17 % 2)  # answer is 1 (remainder)
print(11 // 2)  # 5 (floor division)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Logical, Comparison and Conditional operators&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Comparison operators check relationships between values:&lt;/strong&gt;&lt;br&gt;
== (equal to)&lt;br&gt;
!= (not equal to)&lt;br&gt;
&amp;lt; (less than)&lt;br&gt;
&amp;gt; (greater than)&lt;br&gt;
&amp;lt;= (less than or equal to)&lt;br&gt;
&amp;gt;= (greater than or equal to)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logical operators to combine conditions:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;and (both conditions must be true)&lt;br&gt;
or (at least one condition must be true)&lt;br&gt;
not (inverts a boolean value)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conditional statements to make decisions:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;if (executes code if condition is true)&lt;br&gt;
elif (checks another condition if previous conditions were false)&lt;br&gt;
else (executes if all previous conditions were false)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;temperature = 25  # in Celsius

# Comparison operators
is_freezing = temperature &amp;lt;= 0
is_hot = temperature &amp;gt;= 30
is_warm = temperature &amp;gt; 20

# Logical operators
wear_jacket = is_freezing or (temperature &amp;lt; 15)
go_swimming = is_hot and not is_freezing

# Conditional statements
if is_freezing:
    print("It's freezing! Wear a heavy coat.")
elif is_hot:
    print("It's hot! Consider wearing light clothes.")
else:
    print("The temperature is moderate.")

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Loops
&lt;/h2&gt;

&lt;p&gt;Loops are used to execute a block of code multiple times until a certain condition is met.&lt;br&gt;
This reduces code duplication, making code more readable.&lt;/p&gt;

&lt;p&gt;Types of loops in Python:&lt;br&gt;
for loop&lt;br&gt;
while loop&lt;/p&gt;
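&lt;p&gt;A minimal example of each loop type:&lt;/p&gt;

```python
# for loop: iterate directly over a sequence
fruits = ["apple", "banana", "cherry"]
collected = []
for fruit in fruits:
    collected.append(fruit)

# while loop: repeat until the condition stops being true
countdown = 3
steps = []
while countdown:          # truthy until countdown reaches 0
    steps.append(countdown)
    countdown -= 1
```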

&lt;p&gt;Python is a great beginner-friendly, general-purpose programming language to start your journey in software development, thanks to its easy-to-read nature. Patience and some programming fundamentals are all you require.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>productivity</category>
      <category>learning</category>
    </item>
  </channel>
</rss>
