Rounak Prasad

Posted on Apr 10

Understanding Pandas DataFrames (Beginner-Friendly)

#datascience #pandas #machinelearning #python

When I started learning data science, one of the first tools I came across was Pandas. At first, it felt confusing. But once I understood the basics, it became one of the most powerful tools in Python.

Here’s a simple breakdown of what a Pandas DataFrame actually is and why it matters.

What is a DataFrame?

A DataFrame is a 2-dimensional table, similar to an Excel sheet or a SQL table.

It has:

Rows (records)
Columns (features)

Each column has its own data type, so one column can hold numbers while another holds strings or dates.
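To see the column types for yourself, here is a minimal sketch. The "Joined" column and its dates are made up for illustration and are not part of the example used in the rest of this post.

```python
import pandas as pd

# "Joined" is a hypothetical column added only to show a date type
df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Joined": pd.to_datetime(["2024-01-05", "2024-02-10", "2024-03-15"]),
})

# dtypes lists the data type pandas inferred for each column
print(df.dtypes)
```

The exact label printed for the text column can differ between pandas versions, but the integer and datetime columns should show up as distinct types.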

Creating a Simple DataFrame

import pandas as pd

data = {
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88]
}

df = pd.DataFrame(data)
print(df)
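Once the DataFrame exists, a quick way to check it is to look at its size and column names. This is a small sketch that rebuilds the same data so it runs on its own.

```python
import pandas as pd

data = {
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
}
df = pd.DataFrame(data)

# shape is a (rows, columns) tuple
print(df.shape)          # (3, 3)

# columns holds the column labels, taken from the dictionary keys
print(list(df.columns))  # ['Name', 'Age', 'Marks']
```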

Basic Operations

1. Viewing Data

df.head()
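By default, head() returns the first five rows, and you can pass a number to change that. The sketch below also uses tail(), which does the same from the end of the table.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# First two rows only
first_two = df.head(2)
print(first_two)

# Last row only
last_one = df.tail(1)
print(last_one)
```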

2. Selecting a Column

df["Name"]
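One detail that tripped me up: single brackets give back a Series (one column), while double brackets give back a smaller DataFrame. A short sketch of the difference:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# Single brackets: one column as a Series
names = df["Name"]
print(type(names).__name__)   # Series

# Double brackets (a list of column names): a DataFrame
subset = df[["Name", "Marks"]]
print(type(subset).__name__)  # DataFrame
```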

3. Filtering Data

df[df["Marks"] > 85]
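The expression inside the brackets produces a column of True/False values, and pandas keeps only the rows marked True. Conditions can be combined with & (and) or | (or), as long as each one is wrapped in parentheses. The age threshold below is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# The condition on its own is a boolean Series
mask = df["Marks"] > 85
print(mask.tolist())  # [False, True, True]

# Combine two conditions; the parentheses are required
top_adults = df[(df["Marks"] > 85) & (df["Age"] >= 20)]
print(top_adults)
```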

4. Adding a New Column

df["Passed"] = df["Marks"] > 40
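Assigning to df["Passed"] changes the DataFrame in place. If you would rather keep the original untouched, one alternative is assign(), which returns a new DataFrame with the extra column. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# assign() returns a copy with the new column; df itself is unchanged
with_flag = df.assign(Passed=df["Marks"] > 40)
print(with_flag)

print("Passed" in df.columns)  # False
```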

Why DataFrames are Important

Easy to clean and preprocess data
Handles large datasets efficiently, as long as they fit in memory
Integrates with libraries like NumPy and Matplotlib
Widely used in real-world data science workflows
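As a small illustration of the NumPy integration, a column can be converted to a NumPy array and passed to NumPy functions. This is only a sketch on the toy data from earlier.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# Convert one column to a plain NumPy array
marks = df["Marks"].to_numpy()
print(type(marks).__name__)  # ndarray

# NumPy functions work on it directly
print(np.mean(marks))        # about 87.67
print(np.max(marks))         # 90
```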

Final Thought

Instead of trying to memorize everything, I found it more useful to practice small operations daily. The more I used DataFrames, the more intuitive they became.

If you're just starting out, focus on building small examples like this and gradually increase complexity.

That’s how I’m approaching it.
