DEV Community

Cover image for Introduction to Python
Enock Kiprotich
Enock Kiprotich

Posted on

Introduction to Python

Python for Data Analytics: From Beginner Basics to Real-World Data Projects

A beginner-friendly guide to understanding Python, data analytics, APIs, JSON data, and working with real-world datasets using Python.


Introduction

Over the past decade, Python has become one of the most important programming languages in the world. From web development and artificial intelligence to cybersecurity and automation, Python continues to dominate many areas of technology. However, one area where Python has had a particularly massive impact is data analytics.

Today, businesses, governments, healthcare institutions, banks, researchers, and social media companies all depend on data to make decisions. Because of this, organizations need professionals who can collect, clean, analyze, and visualize data effectively. Python has become the preferred tool for performing these tasks because it is simple, powerful, and highly flexible.

For beginners entering the tech industry, Python is often recommended as the first programming language to learn. Its readable syntax, large community, and huge collection of libraries make it easier for learners to transition into data analytics and data science.

This article explains what Python is, why it is popular in data analytics, the libraries used in analytics, how Python handles data, and why beginners should consider learning it.


What is Python?

Python is a high-level programming language created by Guido van Rossum and first released in 1991. It was designed to emphasize readability and simplicity, allowing programmers to write code that is easy to understand.

Unlike low-level programming languages that can be difficult to read, Python uses simple English-like syntax. This makes it beginner-friendly and suitable for people who are new to programming.

For example, displaying text in Python is very simple:

print("Hello World")
Enter fullscreen mode Exit fullscreen mode

Because of its simplicity, Python is widely used in:

  • Data analytics
  • Artificial intelligence
  • Machine learning
  • Web development
  • Cybersecurity
  • Automation
  • Scientific computing
  • Software development

Python is also open-source, meaning anyone can use it for free.


Why Python is Popular in Data Analytics

Python has become extremely popular in the data analytics industry for several reasons.

1. Easy to Learn

Python is considered one of the easiest programming languages for beginners. Its syntax is simple and resembles normal English.

age = 25

if age >= 18:
    print("Adult")
Enter fullscreen mode Exit fullscreen mode

Because data analytics already involves statistics, business understanding, and critical thinking, using a simple programming language helps learners focus on analytics instead of struggling with complex code.


2. Large Collection of Libraries

Python has powerful libraries specifically designed for data analysis and visualization.

Library Purpose
Pandas Data cleaning and analysis
NumPy Numerical computations
Matplotlib Data visualization
Seaborn Advanced visualization
Scikit-learn Machine learning
Requests Working with APIs

3. Strong Community Support

Python has one of the largest developer communities in the world. If a learner faces a challenge, there are thousands of tutorials, forums, YouTube videos, and articles available online.


4. Integration with Other Technologies

Python works well with:

  • SQL databases
  • APIs
  • Excel
  • Power BI
  • Tableau
  • Cloud platforms

This flexibility makes Python useful in real-world analytics environments.


Python Libraries Used in Data Analytics

Pandas

Pandas is one of the most important Python libraries for data analytics.

import pandas as pd

data = pd.read_csv("sales.csv")

print(data.head())
Enter fullscreen mode Exit fullscreen mode

Pandas uses a structure called a DataFrame, which looks similar to an Excel table.


NumPy

NumPy is used for numerical computations and mathematical operations.

import numpy as np

numbers = np.array([10, 20, 30])

print(numbers.mean())
Enter fullscreen mode Exit fullscreen mode

Matplotlib

Matplotlib helps analysts create charts and graphs.

import matplotlib.pyplot as plt

x = [1,2,3]
y = [10,20,30]

plt.plot(x, y)
plt.show()
Enter fullscreen mode Exit fullscreen mode

Requests

The Requests library is used to interact with APIs and web services.

import requests

response = requests.get("https://dummyjson.com/products")

print(response.json())
Enter fullscreen mode Exit fullscreen mode

How Python is Used in Data Analytics

Python plays a major role in the data analytics workflow.

The major stages include:

  1. Data collection
  2. Data cleaning
  3. Data analysis
  4. Data visualization
  5. Reporting

Data Collection

Python helps analysts collect data from:

  • CSV files
  • APIs
  • Excel files
  • JSON files
  • Databases

Example:

import pandas as pd

data = pd.read_csv("employees.csv")
Enter fullscreen mode Exit fullscreen mode

Data Cleaning

Raw data is usually messy. It may contain:

  • Missing values
  • Duplicates
  • Errors

Python helps clean data efficiently.

data.drop_duplicates(inplace=True)

data.fillna(0, inplace=True)
Enter fullscreen mode Exit fullscreen mode

Data Analysis

Python helps identify patterns and insights.

sales = data["revenue"].sum()

print(sales)
Enter fullscreen mode Exit fullscreen mode

Grouping data:

grouped = data.groupby("department")["revenue"].sum()

print(grouped)
Enter fullscreen mode Exit fullscreen mode

Data Visualization

Visualization helps businesses understand data better.

import matplotlib.pyplot as plt

departments = ["HR", "IT", "Finance"]
revenue = [2000, 5000, 3000]

plt.bar(departments, revenue)

plt.title("Revenue by Department")

plt.show()
Enter fullscreen mode Exit fullscreen mode

Real-World Applications of Python

Healthcare

Hospitals use Python to:

  • Analyze patient records
  • Predict diseases
  • Monitor hospital performance

Banking and Finance

Banks use Python for:

  • Fraud detection
  • Credit scoring
  • Risk analysis

E-commerce

Online stores use Python to analyze:

  • Customer behavior
  • Product sales
  • Market trends

Why Beginners Should Learn Python

High Demand in the Job Market

Many companies are looking for professionals with Python skills.

Common roles include:

  • Data Analyst
  • Data Scientist
  • Business Analyst

Beginner-Friendly

Python is easier to read and understand than many other programming languages.


Strong Salary Potential

Professionals with Python and analytics skills often earn competitive salaries globally.


Part 2: Working with JSON Data from GitHub

Step 1: Import Required Libraries

import pandas as pd
import requests
Enter fullscreen mode Exit fullscreen mode

Step 2: Define the Raw GitHub URL

url = "https://raw.githubusercontent.com/yourusername/repository/main/data.json"
Enter fullscreen mode Exit fullscreen mode

Step 3: Send Request to GitHub

response = requests.get(url)
Enter fullscreen mode Exit fullscreen mode

Step 4: Extract JSON Data

data = response.json()
Enter fullscreen mode Exit fullscreen mode

Step 5: Convert JSON to DataFrame

df = pd.DataFrame(data)
Enter fullscreen mode Exit fullscreen mode

Step 6: Save as CSV File

df.to_csv("synthetic_data.csv", index=False)
Enter fullscreen mode Exit fullscreen mode

Complete Python Script

import pandas as pd
import requests

url = "https://raw.githubusercontent.com/yourusername/repository/main/data.json"

response = requests.get(url)

data = response.json()

df = pd.DataFrame(data)

print(df.head())

df.to_csv("synthetic_data.csv", index=False)

print("CSV file saved successfully")
Enter fullscreen mode Exit fullscreen mode

Part 3: Working with DummyJSON API Endpoints

Products Endpoint

API URL:

https://dummyjson.com/products
Enter fullscreen mode Exit fullscreen mode

Python Code for Products Endpoint

import pandas as pd
import requests

url = "https://dummyjson.com/products"

response = requests.get(url)

data = response.json()

products = data["products"]

df = pd.DataFrame(products)

print(df.head())

df.to_csv("products.csv", index=False)

print("Products CSV saved successfully")
Enter fullscreen mode Exit fullscreen mode

Carts Endpoint

API URL:

https://dummyjson.com/carts
Enter fullscreen mode Exit fullscreen mode

Python Code for Carts Endpoint

import pandas as pd
import requests

url = "https://dummyjson.com/carts"

response = requests.get(url)

data = response.json()

carts = data["carts"]

df = pd.DataFrame(carts)

print(df.head())

df.to_csv("carts.csv", index=False)

print("Carts CSV saved successfully")
Enter fullscreen mode Exit fullscreen mode

Importance of APIs in Data Analytics

APIs allow analysts to collect real-time data from:

  • Websites
  • Financial systems
  • Weather services
  • Social media platforms

Modern analytics heavily depends on API integrations.


Conclusion

Python has become one of the most important tools in the data analytics industry. Its simplicity, flexibility, and powerful libraries make it ideal for beginners and professionals alike.

With Python, analysts can collect, clean, analyze, visualize, and automate data-related tasks efficiently. Organizations across healthcare, finance, e-commerce, social media, and government continue to rely on Python for decision-making and business intelligence.

Additionally, understanding APIs, JSON data, GitHub raw files, and CSV processing gives learners practical experience in handling real-world data workflows.

As technology continues to evolve, Python will remain one of the leading tools for data analytics, machine learning, and artificial intelligence.


Top comments (0)