
Rounak Prasad


Understanding Pandas DataFrames (Beginner-Friendly)

When I started learning data science, one of the first tools I came across was Pandas. At first, it felt confusing. But once I understood the basics, it became one of the most powerful tools in Python.

Here’s a simple breakdown of what a Pandas DataFrame actually is and why it matters.

What is a DataFrame?

A DataFrame is a two-dimensional, labeled table, similar to an Excel sheet or a SQL table.

It has:

  • Rows (records)
  • Columns (features)

Different columns can store different types of data, such as numbers, strings, or dates. Within a single column, the values normally share one type.

Creating a Simple DataFrame

import pandas as pd

data = {
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88]
}

df = pd.DataFrame(data)
print(df)
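
Once the DataFrame exists, it helps to check its structure before doing anything else. Here is a minimal sketch using the same sample data; it relies only on the standard `shape`, `columns`, and `dtypes` attributes.

```python
import pandas as pd

# Same sample data as in the example above.
data = {
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
}
df = pd.DataFrame(data)

# shape is a (rows, columns) tuple.
print(df.shape)

# Column labels, in the order the dictionary keys were given.
print(df.columns.tolist())

# Each column reports its own data type.
print(df.dtypes)
```

The exact name printed for the string column's type can differ between pandas versions, so I don't rely on it here.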

Basic Operations

1. Viewing Data

df.head()
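
By default, `head()` returns the first 5 rows. A small sketch of how I'd peek at either end of the table, assuming the same sample data (the variable names are just for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# Pass a number to control how many rows come back.
first_two = df.head(2)

# tail() does the same from the bottom of the table.
last_one = df.tail(1)

print(first_two)
print(last_one)
```

In a plain Python script, remember to wrap these calls in `print()`, since only notebooks display the result of the last expression automatically.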

2. Selecting a Column

df["Name"]
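
One thing that confused me early on: selecting with a single label gives back a Series, while selecting with a list of labels gives back a DataFrame. A short sketch with the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# Single label: the result is a one-dimensional Series.
names = df["Name"]

# List of labels (note the double brackets): the result is a DataFrame.
subset = df[["Name", "Marks"]]

print(type(names))
print(type(subset))
```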

3. Filtering Data

df[df["Marks"] > 85]
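
The filter works because the comparison produces a True/False value for every row, and the DataFrame keeps only the rows marked True. Here is a sketch that makes that intermediate step visible and then combines two conditions; the age cutoff is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# The comparison returns a boolean Series, one value per row.
mask = df["Marks"] > 85
print(mask.tolist())  # [False, True, True]

# Indexing with the mask keeps only the True rows.
high_scorers = df[mask]

# Combine conditions with & (and) or | (or); each one needs its own parentheses.
young_high_scorers = df[(df["Marks"] > 85) & (df["Age"] < 20)]

print(high_scorers)
print(young_high_scorers)
```

Note that Python's `and` and `or` keywords do not work here; pandas needs the `&` and `|` operators for element-wise logic.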

4. Adding a New Column

df["Passed"] = df["Marks"] > 40
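
Assigning to a new column name changes the DataFrame in place. If you'd rather keep the original untouched, `assign()` returns a new DataFrame instead. A minimal sketch; the `MarksFraction` column is a made-up example and assumes marks are out of 100.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# In-place: adds a boolean column to df itself.
df["Passed"] = df["Marks"] > 40

# assign() leaves df alone and returns a copy with the extra column.
# MarksFraction is a hypothetical column, assuming marks are out of 100.
with_fraction = df.assign(MarksFraction=df["Marks"] / 100)

print(df.columns.tolist())
print(with_fraction.columns.tolist())
```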

Why DataFrames are Important

  • Easy to clean and preprocess data
  • Handles large in-memory datasets efficiently through vectorized column operations
  • Integrates with libraries like NumPy and Matplotlib
  • Widely used in real-world data science workflows
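
To give a feel for the NumPy integration mentioned above, here is a small sketch that hands a column to NumPy and back. It assumes the same sample data and uses only `Series.to_numpy()` and `np.mean()`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rounak", "Aman", "Priya"],
    "Age": [19, 20, 18],
    "Marks": [85, 90, 88],
})

# A column converts directly to a NumPy array.
marks = df["Marks"].to_numpy()

# NumPy functions work on that array as usual.
average = np.mean(marks)

# pandas offers the same calculation as a Series method.
print(average, df["Marks"].mean())
```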

Final Thought

Instead of trying to memorize everything, I found it more useful to practice small operations daily. The more I used DataFrames, the more intuitive they became.

If you're just starting out, focus on building small examples like this and gradually increase complexity.

That’s how I’m approaching it.
