DEV Community

Unlock the Power of Parsing CSV Files with Python

Around 40% of data analysts rely on Python daily to handle CSV files. Why is that? Parsing CSV data is a routine task, and automating it makes the work faster and more efficient. If you've ever needed to read, manage, or analyze CSV files in Python, this guide will walk you through the process. You'll first learn how to do it with Python's built-in csv module, with no external libraries, and then we'll explore how to use Pandas for more advanced tasks.

Exploring CSV Files

In simple terms, a CSV (Comma Separated Values) file is a plain-text file that organizes data in a tabular format, where columns are separated by commas and rows by new lines. It’s the go-to format for sharing data across different platforms because almost any program can open and process CSVs.
Its simplicity and universal accessibility make CSVs perfect for a wide range of applications, from Excel sheets to massive databases. But how do you extract value from these files programmatically? Let’s dive into that now.

Reading CSV Files in Python

Python makes it incredibly easy to handle CSV files—no external libraries required. Let's walk through opening and reading a CSV file.

import csv

# newline='' is what the csv module recommends so quoted line breaks parse correctly
with open('university_records.csv', 'r', newline='') as csv_file:
    reader = csv.reader(csv_file)
    for row in reader:
        print(row)

We simply open the CSV file, use Python’s built-in csv.reader() to read each row, and then print it out. That’s the starting point.
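
When the file has a header row, csv.DictReader can be a convenient next step, because each row comes back as a dictionary keyed by column name. The sketch below is only an illustration: it reads from an in-memory string instead of university_records.csv, and the column names (name, branch, year, cgpa) are assumptions, not taken from the original file.

```python
import csv
import io

# Hypothetical sample data; the column names are assumed for illustration.
sample = io.StringIO(
    "name,branch,year,cgpa\n"
    "David,MCE,3,7.8\n"
    "Monika,PIE,3,9.1\n"
)

# DictReader treats the first row as the header and uses it for the keys.
reader = csv.DictReader(sample)
records = list(reader)

for record in records:
    # Every value is read as a string, so convert explicitly where needed.
    print(record["name"], float(record["cgpa"]))
```

Accessing values by name instead of position tends to make the code easier to read and less fragile if the column order changes.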

Writing Data into CSV Files with Python

Need to add some data? Python’s csv.writer() function has you covered. Here’s how to append new rows to an existing CSV:

import csv

row = ['David', 'MCE', '3', '7.8']
row1 = ['Monika', 'PIE', '3', '9.1']
row2 = ['Raymond', 'ECE', '2', '8.5']

# 'a' appends to the file; newline='' avoids blank lines between rows on Windows
with open('university_records.csv', 'a', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(row)
    writer.writerow(row1)
    writer.writerow(row2)

What’s happening here? We’re appending rows to the CSV file, one at a time. Easy, right?
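
If you are creating a file from scratch and want a header row, csv.DictWriter is one option. The following is a minimal sketch, not the only way to do it: the column names and the temporary output path are hypothetical, chosen so the example does not touch university_records.csv.

```python
import csv
import os
import tempfile

# Hypothetical column names and output path, chosen only for this sketch.
fieldnames = ["name", "branch", "year", "cgpa"]
rows = [
    {"name": "David", "branch": "MCE", "year": 3, "cgpa": 7.8},
    {"name": "Monika", "branch": "PIE", "year": 3, "cgpa": 9.1},
]

out_path = os.path.join(tempfile.mkdtemp(), "records_sketch.csv")

# newline="" lets the csv module control line endings itself.
with open(out_path, "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
    writer.writeheader()    # write the header row once
    writer.writerows(rows)  # write all data rows in one call

# Read the file back to confirm what was written.
with open(out_path, newline="", encoding="utf-8") as csv_file:
    written = list(csv.reader(csv_file))
print(written)
```

Note that numbers are written as text, so they come back as strings when the file is read again.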

Mastering CSV Parsing with Pandas

When your CSV files grow in size or complexity, Python’s built-in tools can start to struggle. Enter Pandas: a powerful library designed for handling large data sets with ease. It’s fast, flexible, and comes with tools you never knew you needed.
Let’s start by building a small DataFrame and saving it as a CSV with Pandas:

import pandas as pd

data = {"Name": ["David", "Monika", "Raymond"],
        "Age": [30, 25, 40], 
        "City": ["Kyiv", "Lviv", "Odesa"]}

df = pd.DataFrame(data) 
file_path = "data.csv" 
df.to_csv(file_path, index=False, encoding="utf-8")

Pandas allows us to easily convert a dictionary into a DataFrame (the core structure for storing tabular data in Pandas), and then we save it to a CSV.

Why Pandas is a Game-Changer for CSV Files

You might be wondering: "Why should we use Pandas when Python’s built-in csv library works just fine?" Good question! Let’s look at why Pandas is considered a game-changer:

  • Effortless file handling: If you have datasets from multiple sources with inconsistent formats, read_csv() offers options such as sep, encoding, and header, so each source can be loaded into the same DataFrame structure with very little manual work.
  • Performance: Pandas parses files with an optimized reader, and read_csv() can process a file in pieces through its chunksize parameter, so it generally scales better than looping over rows with the basic CSV reader.
  • Built-in data cleaning: Missing values, duplicate data, or incorrect formats? Pandas provides methods such as dropna(), fillna(), drop_duplicates(), and astype() that handle these in a line or two, saving you hours of cleanup.
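
To make the data-cleaning point concrete, here is a small sketch. The messy sample data is made up for illustration, and filling the missing age with the median is just one possible choice, not a recommendation for every dataset.

```python
import io

import pandas as pd

# Hypothetical messy data: one duplicated row and one missing age.
raw = io.StringIO(
    "Name,Age,City\n"
    "David,30,Kyiv\n"
    "David,30,Kyiv\n"
    "Monika,,Lviv\n"
)

df_raw = pd.read_csv(raw)

# Each cleaning step is an explicit method call.
cleaned = df_raw.drop_duplicates().reset_index(drop=True)

# Fill the missing age with the median of the remaining ages (an assumed policy).
cleaned["Age"] = cleaned["Age"].fillna(cleaned["Age"].median())

print(cleaned)
```

The cleanup is short, but you still decide which steps to apply and in what order.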

Exploring CSVs with Pandas

Let’s check out how easy it is to read and explore data using Pandas:

import pandas as pd

df = pd.read_csv("data.csv")

# View the first few rows
print(df.head())

# View the last 10 rows
print(df.tail(10))

# Get information about the dataset (info() prints its summary itself)
df.info()

Here’s where Pandas shines—head(), tail(), and info() allow you to quickly get a snapshot of your dataset.
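
A few other attributes and methods are often useful at this stage. The sketch below uses an in-memory stand-in for data.csv with the same three columns as the earlier example, so it runs without any file on disk.

```python
import io

import pandas as pd

# In-memory stand-in for data.csv, using the Name/Age/City columns from earlier.
csv_text = io.StringIO(
    "Name,Age,City\n"
    "David,30,Kyiv\n"
    "Monika,25,Lviv\n"
    "Raymond,40,Odesa\n"
)
df_demo = pd.read_csv(csv_text)

print(df_demo.shape)              # (number of rows, number of columns)
print(df_demo.dtypes)             # inferred type of each column
print(df_demo["Age"].describe())  # count, mean, spread, and quartiles
```

Checking the inferred types early is a cheap way to spot columns that were read as text when you expected numbers.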

Editing CSVs with Pandas

With Pandas, modifying a CSV becomes a breeze. Need to add, update, or remove rows? Here’s how:

  • Insert a new row:
new_row = pd.DataFrame([{"Name": "Denys", "Age": 35, "City": "Kharkiv"}])
df = pd.concat([df, new_row], ignore_index=True)
df.to_csv(file_path, index=False, encoding="utf-8")
  • Edit a specific row:
df.loc[df["Name"] == "Monika", "Age"] = 26  # matches a name that exists in data.csv
df.to_csv(file_path, index=False, encoding="utf-8")
  • Remove a row:
df = df[df["Name"] != "Raymond"]  # keeps every row except the matching one
df.to_csv(file_path, index=False, encoding="utf-8")
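
The three edits can also be tried end to end on an in-memory DataFrame before writing anything to disk. This is a sketch under assumptions: the variable name df_edit and the choice of which rows to update and remove are illustrative only.

```python
import pandas as pd

# Same sample data as the earlier to_csv example.
df_edit = pd.DataFrame({
    "Name": ["David", "Monika", "Raymond"],
    "Age": [30, 25, 40],
    "City": ["Kyiv", "Lviv", "Odesa"],
})

# Insert: build a one-row DataFrame and concatenate it.
new_row = pd.DataFrame([{"Name": "Denys", "Age": 35, "City": "Kharkiv"}])
df_edit = pd.concat([df_edit, new_row], ignore_index=True)

# Update: select rows with a boolean mask and assign to one column.
df_edit.loc[df_edit["Name"] == "Monika", "Age"] = 26

# Remove: keep only the rows that do not match, then renumber the index.
df_edit = df_edit[df_edit["Name"] != "Raymond"].reset_index(drop=True)

print(df_edit)
```

Once the result looks right, a single to_csv() call at the end is enough to save it.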

Conclusion

If you're working with smaller, simpler data sets, Python’s native csv library is a great tool for parsing CSV files. But when it comes to larger datasets, data cleaning, and more complex operations, Pandas is the real MVP. It’s designed for heavy lifting, with built-in methods that save time and increase accuracy. From easy file handling to advanced data manipulation, Pandas lets you work smarter, not harder.
