DEV Community

jelizaveta

How to Convert Excel to CSV in Python using Spire.XLS for Python

Converting Excel files to CSV is a common task in data processing and analysis. Excel files are powerful for data entry and complex calculations, but they can be cumbersome to access programmatically, especially with large datasets or when integrating with systems that prefer simpler, delimited formats. CSV (Comma-Separated Values) files are lightweight, widely supported, and easy to parse, which makes them well suited to data exchange, scripting, and database imports. Python, with its extensive ecosystem of libraries, is a natural choice for automating such conversions.

This tutorial introduces Spire.XLS for Python as one way to handle this conversion. It is a step-by-step guide to turning Excel spreadsheets into CSV with the library, covering a single workbook, a specific worksheet, and a batch of files in a directory.


Why Choose Spire.XLS for Python for Excel to CSV Conversion?

While several Python libraries can handle Excel files, Spire.XLS for Python offers some distinct advantages for Excel to CSV conversion. It is a commercial library designed for comprehensive Excel manipulation, and it supports both the .xls and .xlsx formats. For direct conversion tasks, it often needs less code than parsing a workbook with openpyxl and writing the rows out with Python's built-in csv module, and it avoids loading the data into a pandas DataFrame when all you need is a file-to-file conversion. That makes it a convenient choice when the goal is simply to turn a spreadsheet into a CSV file.

Setting Up Your Environment

Before we dive into the code, you'll need to install the Spire.XLS library. Open your terminal or command prompt and run the following command:

pip install Spire.XLS

This command will download and install the necessary package, making it available for your Python projects.
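To confirm that the install took effect in the interpreter you are actually running, a quick check along these lines can help. It is a minimal sketch: the helper name is_installed is made up for illustration, and it assumes the package is importable through a top-level module called spire (as in the import path spire.xls).

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located by the import system."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError covers a missing parent package of a dotted name
        return False


# Assumption: the pip package "Spire.XLS" installs a top-level module named "spire"
if not is_installed("spire"):
    print("Spire.XLS not found; run: pip install Spire.XLS")
```

If the message appears even though pip reported success, the package was probably installed into a different Python environment than the one running the script.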

Step-by-Step Guide: Converting Excel to CSV

Let's explore how to convert Excel to CSV in Python using Spire.XLS for Python with practical code examples.

Basic Conversion of an Entire Workbook

The simplest scenario is to load a workbook and save it directly as CSV. Because a CSV file holds a single table, this call writes one worksheet (the first one, as the comment in the code notes) rather than every sheet in the workbook.

from spire.xls import *
from spire.xls.common import *

# Create a Workbook object
workbook = Workbook()

# Load an Excel file
workbook.LoadFromFile("input.xlsx")

# Save the workbook as a CSV file
# The first worksheet will be converted by default if not specified
workbook.SaveToFile("output.csv", FileFormat.CSV)

# Dispose the workbook object
workbook.Dispose()

print("Excel file converted to output.csv successfully!")

In this code:

  • We import the necessary classes from Spire.XLS.
  • A Workbook object is created and then used to load "input.xlsx". Ensure this file exists in the same directory as your script, or provide its full path.
  • workbook.SaveToFile("output.csv", FileFormat.CSV) is the core line that performs the conversion. It takes the output filename and the FileFormat.CSV enumeration as arguments.
  • workbook.Dispose() releases the resources used by the workbook.
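A missing input file is the most common way for the script above to fail, so it can be worth checking the path before handing it to the library. The following is a small sketch using only the standard library; resolve_input and its base_dir parameter are hypothetical names introduced here, not part of Spire.XLS.

```python
from pathlib import Path


def resolve_input(name: str, base_dir: str = ".") -> str:
    """Return an absolute path for `name`, or raise if the file is missing."""
    path = Path(name)
    if not path.is_absolute():
        # Interpret relative names against base_dir instead of the current directory
        path = Path(base_dir) / path
    path = path.resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Excel file not found: {path}")
    return str(path)
```

The returned string can then be passed to workbook.LoadFromFile(...). Passing the script's own folder as base_dir makes the lookup independent of the directory the script is launched from.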

Converting a Specific Worksheet to CSV

Often, an Excel file contains multiple sheets, and you might only need to convert a particular one. Spire.XLS allows you to specify which worksheet to convert.

from spire.xls import *
from spire.xls.common import *

# Create a Workbook object
workbook = Workbook()

# Load an Excel file
workbook.LoadFromFile("input_multiple_sheets.xlsx")

# Get the first worksheet (index 0)
# You can also access by name: workbook.Worksheets["SheetName"]
sheet_to_convert = workbook.Worksheets[0] 

# Save the specific worksheet as a CSV file
# The parameters are the output file path, the delimiter, and the text encoding
sheet_to_convert.SaveToFile("first_sheet_output.csv", ",", Encoding.get_UTF8())

# Dispose the workbook object
workbook.Dispose()

print("Specific Excel worksheet converted to first_sheet_output.csv successfully!")

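Here, workbook.Worksheets[0] accesses the first sheet. You could also use workbook.Worksheets["SheetName"] if you know the sheet's name. The worksheet's SaveToFile method takes the output path and the delimiter (for example, "," for CSV). Note that the third argument is not a header switch: in the library's documented overloads it is either a text encoding, such as Encoding.get_UTF8(), or a Boolean that controls whether hidden data is retained.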
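When every sheet is needed, one option is to loop over the worksheet collection and give each sheet its own file. The sketch below is written against an already-loaded workbook passed in as an argument. It assumes the collection exposes Count and integer indexing, that each sheet has a Name property, and that the worksheet-level SaveToFile accepts (path, delimiter, encoding); export_all_sheets and safe_name are hypothetical helper names.

```python
import os
import re


def safe_name(sheet_name: str) -> str:
    """Replace characters that are not allowed in file names on common systems."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", sheet_name).strip()
    return cleaned or "sheet"


def export_all_sheets(workbook, output_dir: str, encoding, delimiter: str = ","):
    """Write every worksheet of `workbook` to its own CSV file.

    `workbook` is an already-loaded Spire.XLS Workbook; `encoding` is the value
    the worksheet-level SaveToFile expects, e.g. Encoding.get_UTF8().
    Returns the list of paths that were written.
    """
    os.makedirs(output_dir, exist_ok=True)
    written = []
    sheets = workbook.Worksheets
    for index in range(sheets.Count):  # assumption: the collection exposes Count
        sheet = sheets[index]
        # Prefix with the index so sheets whose cleaned names collide stay distinct
        path = os.path.join(output_dir, f"{index:02d}_{safe_name(sheet.Name)}.csv")
        sheet.SaveToFile(path, delimiter, encoding)
        written.append(path)
    return written
```

After workbook.LoadFromFile(...), a call such as export_all_sheets(workbook, "csv_out", Encoding.get_UTF8()) would produce one CSV per sheet; remember to call workbook.Dispose() afterwards.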

Handling Multiple Excel Files

When you need to convert many files in a directory, you can call Spire.XLS inside a loop.

import os
from spire.xls import *
from spire.xls.common import *

input_directory = "excel_files" # Directory containing your Excel files
output_directory = "csv_files"   # Directory to save CSVs

# Create output directory if it doesn't exist
os.makedirs(output_directory, exist_ok=True)

for filename in os.listdir(input_directory):
    # Match the extension case-insensitively (e.g. "REPORT.XLSX")
    if filename.lower().endswith((".xlsx", ".xls")):
        input_path = os.path.join(input_directory, filename)
        output_filename = os.path.splitext(filename)[0] + ".csv"
        output_path = os.path.join(output_directory, output_filename)

        workbook = Workbook()
        try:
            workbook.LoadFromFile(input_path)
            workbook.SaveToFile(output_path, FileFormat.CSV)
        finally:
            # Release resources even if loading or saving fails
            workbook.Dispose()
        print(f"Converted '{filename}' to '{output_filename}'")

print("Batch conversion complete!")

This script iterates over the Excel files in the input directory, converts each one, and saves the resulting CSV files to the output directory. As in the basic example, each workbook-level conversion writes a single worksheet.
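Directory listings often contain more than the files you want, for example the "~$" lock files Excel creates next to a workbook while it is open. The file selection step can be factored into a small helper like the following sketch; find_excel_files is a hypothetical name, and only the standard library is used.

```python
from pathlib import Path

EXCEL_SUFFIXES = {".xlsx", ".xls"}


def find_excel_files(directory: str) -> list:
    """Return the Excel files directly inside `directory`, sorted by path.

    Extensions are matched case-insensitively, and "~$" lock files
    as well as subdirectories are skipped.
    """
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file()
        and p.suffix.lower() in EXCEL_SUFFIXES
        and not p.name.startswith("~$")
    )
```

Each returned Path can be converted with str(path) for LoadFromFile, and path.with_suffix(".csv").name gives a matching output file name.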

Key Considerations and Best Practices

When converting Excel to CSV, pay attention to data types and encoding. Spire.XLS generally preserves values during conversion, but CSV carries no type information, so dates and numbers arrive as text and should be checked against what downstream tools expect. For encoding, it is usually best to specify UTF-8 explicitly, especially if your Excel files contain non-ASCII text; the worksheet-level SaveToFile accepts an encoding argument such as Encoding.get_UTF8(). Always validate your output CSV files, especially for complex or very large workbooks, to ensure data integrity.
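A basic validation pass can be done with Python's built-in csv module. The sketch below counts rows and collects the distinct row widths, assuming the file was written as UTF-8 (the utf-8-sig codec also accepts a leading byte order mark); summarize_csv is a hypothetical helper name.

```python
import csv


def summarize_csv(path: str, delimiter: str = ",", encoding: str = "utf-8-sig"):
    """Return (row_count, widths) for a CSV file.

    `widths` is the set of distinct field counts per row; more than one
    value means some rows have a different number of fields than others.
    """
    widths = set()
    rows = 0
    # newline="" lets the csv module handle line breaks inside quoted fields
    with open(path, newline="", encoding=encoding) as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            rows += 1
            widths.add(len(row))
    return rows, widths
```

Comparing the row count with the number of used rows in the source sheet, and checking that widths contains a single value, catches most truncation and delimiter problems early.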


Conclusion

This tutorial has shown how to convert Excel to CSV in Python using the Spire.XLS for Python library. Its API keeps the conversion to a few lines of code, which makes it a practical option for developers and data professionals who need a reliable Excel to CSV tool. Whether you are converting a single file or running a batch job, Spire.XLS handles the conversion so you can focus on analyzing your data. Adding it to your data processing toolkit can simplify the automation of these workflows.
