DEV Community

Allen Yang
Allen Yang

Posted on

Efficiently Transforming CSV Data to Styled Excel Worksheets with Python

CSV to Excel Conversion and Styling with Python

Raw data, often residing in CSV files, is the lifeblood of many organizations. However, presenting this data directly can be challenging. CSVs are excellent for data storage and interchange due to their simplicity, but they lack the visual structure and formatting capabilities required for effective analysis and professional reporting. This is where Excel steps in. With its robust formatting options, charting tools, and user-friendly interface, Excel transforms raw data into understandable and actionable insights.

The manual process of converting CSVs to Excel and then meticulously applying styles can be time-consuming and prone to errors, especially with large datasets or recurring tasks. This article will guide you through automating this process using Python, enabling you to convert your CSV data into professional, styled Excel spreadsheets efficiently. We'll explore how to not only transfer data but also to enhance its readability and impact through various formatting techniques, leveraging a powerful Python library designed for comprehensive Excel manipulation.

Preparing Your Python Environment for Excel Automation

To effectively interact with and style Excel files in Python, a dedicated library is essential. While several options exist, some offer more extensive control over Excel's features, particularly when it comes to intricate styling and advanced functionalities. For this tutorial, we will be using a robust library that provides deep integration with Excel's object model, allowing for precise control over formatting and data presentation.

To get started, you'll need to install the library. Open your terminal or command prompt and execute the following command:

pip install Spire.XLS
Enter fullscreen mode Exit fullscreen mode

This command will download and install Spire.XLS for Python, an API designed to create, read, edit, and convert Excel files. Its comprehensive feature set, including support for a wide range of Excel versions and extensive styling capabilities, makes it an excellent choice for automating complex Excel tasks, from basic data transfer to advanced report generation.

Fundamental CSV to Excel Conversion

Before diving into styling, let's establish the basic process of converting a CSV file into an Excel workbook. This fundamental step involves reading the CSV content and then saving it as an .xlsx file.

Here’s a simple Python script to perform this basic conversion:

from spire.xls import *

def convert_csv_to_excel_basic(csv_file_path, excel_file_path):
    # Create a new workbook instance
    workbook = Workbook()

    # Load the CSV file into the first worksheet of the workbook
    # The delimiter is specified as ',' and it starts loading from row 1, column 1
    workbook.LoadFromFile(csv_file_path, ",", 1, 1)

    # Get the first worksheet
    sheet = workbook.Worksheets[0]

    # Optionally, ignore errors for numbers stored as text in a specific range
    # This can prevent green triangles in Excel for certain data types
    # For a general conversion, this line might be omitted or adjusted
    # sheet.Range["D2:E19"].IgnoreErrorOptions = IgnoreErrorType.NumberAsText

    # Auto-fit columns to ensure all content is visible
    sheet.AllocatedRange.AutoFitColumns()

    # Save the workbook to the specified Excel file path in Excel 2013 format
    workbook.SaveToFile(excel_file_path, ExcelVersion.Version2013)
    workbook.Dispose()
    print(f"Successfully converted '{csv_file_path}' to '{excel_file_path}'")

# Example usage:
with open("sample_data.csv", "w") as f:
    f.write("Name,Age,City,Date,Amount\n")
    f.write("Alice,30,New York,2023-01-15,150.75\n")
    f.write("Bob,24,London,2023-02-20,200.00\n")
    f.write("Charlie,35,Paris,2023-03-10,99.50\n")
    f.write("David,28,Berlin,2023-04-05,320.25\n")

csv_input_path = "sample_data.csv"
excel_output_path = "output_basic.xlsx"
convert_csv_to_excel_basic(csv_input_path, excel_output_path)
Enter fullscreen mode Exit fullscreen mode

Below is a preview of the output Excel file:

Basic CSV to Excel Conversion with Python

In this code, Workbook() creates an empty Excel file. workbook.LoadFromFile() then reads your CSV data directly into the first sheet. Finally, sheet.AllocatedRange.AutoFitColumns() adjusts the column widths to fit their content, and workbook.SaveToFile() saves the result. This script provides a functional Excel file, but it lacks any visual enhancements.


Implementing Essential Excel Styling for Clarity

While the basic conversion is functional, raw data in Excel can still be hard to read. Implementing essential styling, such as formatting headers, adjusting column widths, and applying proper number and date formats, dramatically improves clarity and professionalism.

Formatting Headers and Auto-fitting Columns

Headers are crucial for understanding data. Making them stand out with bold formatting and ensuring columns are wide enough to display all content are fundamental steps in creating a readable spreadsheet.

from spire.xls import *

def style_excel_headers_and_columns(csv_file_path, excel_file_path):
    workbook = Workbook()
    workbook.LoadFromFile(csv_file_path, ",", 1, 1)
    sheet = workbook.Worksheets[0]

    # Get the range for the header row (assuming headers are in the first row)
    header_range = sheet.Range["A1:E1"] # Adjust 'E1' based on your actual column count

    # Create a style for the header
    header_style = workbook.Styles.Add("HeaderStyle")
    header_style.Font.IsBold = True
    header_style.Font.Size = 11
    header_style.KnownColor = ExcelColors.Gray25Percent # Set background color
    header_style.HorizontalAlignment = HorizontalAlignType.Center

    # Apply the header style
    header_range.CellStyleName = header_style.Name

    # Auto-fit columns for the entire allocated range (which now includes headers)
    sheet.AllocatedRange.AutoFitColumns()

    workbook.SaveToFile(excel_file_path, ExcelVersion.Version2013)
    workbook.Dispose()
    print(f"Successfully styled headers and auto-fitted columns in '{excel_file_path}'")

csv_input_path = "sample_data.csv"
excel_output_path = "output_styled_headers.xlsx"
style_excel_headers_and_columns(csv_input_path, excel_output_path)
Enter fullscreen mode Exit fullscreen mode

Below is a preview of the output Excel file with styled headers:

Style Headers of CSV to Excel Conversion with Python

In this example, we create a new style (HeaderStyle), set its font to bold, assign a light gray background, and center the text. This style is then applied to the first row, which typically contains the headers. sheet.AllocatedRange.AutoFitColumns() is called again to ensure all content, including potential longer header texts, is fully visible.

Applying Number and Date Formatting

Raw data often treats numbers and dates as plain text, leading to formatting inconsistencies or incorrect calculations in Excel. Explicitly setting number and date formats ensures data integrity and improves readability.

from decimal import Decimal
from spire.xls import *

def format_numbers_and_dates(csv_file_path, excel_file_path):
    workbook = Workbook()
    workbook.LoadFromFile(csv_file_path, ",", 1, 1)
    sheet = workbook.Worksheets[0]

    # Apply header style (re-using previous logic for a complete example)
    header_range = sheet.Range["A1:E1"]
    header_style = workbook.Styles.Add("HeaderStyle_FND")
    header_style.Font.IsBold = True
    header_style.Font.Size = 11
    header_style.KnownColor = ExcelColors.Gray25Percent
    header_style.HorizontalAlignment = HorizontalAlignType.Center
    header_range.CellStyleName = header_style.Name

    # Format the 'Amount' column as currency (assuming it's column E, starting from row 2)
    amount_column_range = sheet.Range["E2:E" + str(sheet.LastRow)]
    amount_style = workbook.Styles.Add("CurrencyStyle")
    for row in range(2, sheet.LastRow + 1):
        cell = sheet.Range[row, 5]
        cell.NumberValue = Decimal(cell.Text)
    amount_style.NumberFormat = "$#,##0.00" # Example currency format
    amount_column_range.CellStyleName = amount_style.Name

    # Format the 'Date' column as a short date (assuming it's column D, starting from row 2)
    date_column_range = sheet.Range["D2:D" + str(sheet.LastRow)]
    date_style = workbook.Styles.Add("DateStyle")
    date_style.NumberFormat = "yyyy-mm-dd" # Example date format
    date_column_range.CellStyleName = date_style.Name

    sheet.AllocatedRange.AutoFitColumns()

    workbook.SaveToFile(excel_file_path, ExcelVersion.Version2013)
    workbook.Dispose()
    print(f"Successfully formatted numbers and dates in '{excel_file_path}'")

csv_input_path = "sample_data.csv"
excel_output_path = "output_formatted_data.xlsx"
format_numbers_and_dates(csv_input_path, excel_output_path)
Enter fullscreen mode Exit fullscreen mode

Below is a preview of the output Excel file with formatted numbers and dates:

Format Numbers and Dates of CSV to Excel Conversion with Python

Here, we select specific column ranges (E2:E for Amount, D2:D for Date) and apply custom number formats using style.NumberFormat. This ensures that monetary values are displayed with currency symbols and two decimal places, and dates are presented in a consistent YYYY-MM-DD format. These small details significantly enhance the professional appearance and utility of your data.


Advanced Styling Techniques for Professional Data Presentation

Beyond basic formatting, advanced styling techniques like borders, background colors, and conditional formatting can further elevate your Excel reports, making them more visually appealing and easier to interpret.

Adding Borders and Background Colors

Visual separation and grouping of data can be achieved effectively using borders and alternating row colors. This helps guide the reader's eye and distinguishes different data elements.

from decimal import Decimal
from spire.xls import *

def apply_advanced_styling(csv_file_path, excel_file_path):
    workbook = Workbook()
    workbook.LoadFromFile(csv_file_path, ",", 1, 1)
    sheet = workbook.Worksheets[0]

    # Apply previous header and data formatting for a complete example
    # Header style
    header_range = sheet.Range["A1:E1"]
    header_style = workbook.Styles.Add("HeaderStyle_Adv")
    header_style.Font.IsBold = True
    header_style.Font.Size = 11
    header_style.KnownColor = ExcelColors.Gray25Percent
    header_style.HorizontalAlignment = HorizontalAlignType.Center
    header_range.CellStyleName = header_style.Name

    # Currency format
    amount_column_range = sheet.Range["E2:E" + str(sheet.LastRow)]
    for row_index in range(2, sheet.LastRow + 1):
        cell = sheet.Range[row_index, 5]
        cell.NumberValue = Decimal(cell.Value)
    amount_style = workbook.Styles.Add("CurrencyStyle_Adv")
    amount_style.NumberFormat = "$#,##0.00"
    amount_column_range.CellStyleName = amount_style.Name

    # Date format
    date_column_range = sheet.Range["D2:D" + str(sheet.LastRow)]
    date_style = workbook.Styles.Add("DateStyle_Adv")
    date_style.NumberFormat = "yyyy-mm-dd"
    date_column_range.CellStyleName = date_style.Name

    # Apply borders to the entire data range (including headers)
    data_range = sheet.Range["A1:E" + str(sheet.LastRow)]
    data_range.BorderAround(LineStyleType.Thin, Color.get_Black())
    data_range.BorderInside(LineStyleType.Thin, Color.get_LightGray())

    # Apply alternating row background color for better readability
    # Start from row 2 (after headers)
    for row_index in range(2, sheet.LastRow + 1):
        if row_index % 2 == 0: # Even rows
            row_range = sheet.Range[row_index, 1, row_index, sheet.LastColumn]
            row_range.Style.KnownColor = ExcelColors.LightYellow # Subtle background

    sheet.AllocatedRange.AutoFitColumns()

    workbook.SaveToFile(excel_file_path, ExcelVersion.Version2013)
    workbook.Dispose()
    print(f"Successfully applied advanced styling in '{excel_file_path}'")

csv_input_path = "sample_data.csv"
excel_output_path = "output_advanced_styled.xlsx"
apply_advanced_styling(csv_input_path, excel_output_path)
Enter fullscreen mode Exit fullscreen mode

Below is a preview of the output Excel file with advanced styling applied:

Advanced Styling of CSV to Excel Conversion with Python

In this script, data_range.BorderAround() adds a thin black border around the entire dataset, while data_range.BorderInside() adds lighter gray borders between cells. For alternating rows, a loop iterates through the data rows, applying a LightYellow background color to even-numbered rows, creating a visually distinct pattern that improves navigability.

Implementing Simple Conditional Formatting

Conditional formatting highlights data based on specific rules, drawing immediate attention to critical information.

from decimal import Decimal
from spire.xls import *

def apply_conditional_formatting(csv_file_path, excel_file_path):
    workbook = Workbook()
    workbook.LoadFromFile(csv_file_path, ",", 1, 1)
    sheet = workbook.Worksheets[0]

    # ... (include previous header, number, date, border, and alternating row styling for context) ...
    header_range = sheet.Range["A1:E1"]
    header_style = workbook.Styles.Add("HeaderStyle_CF")
    header_style.Font.IsBold = True
    header_style.Font.Size = 11
    header_style.KnownColor = ExcelColors.Gray25Percent
    header_style.HorizontalAlignment = HorizontalAlignType.Center
    header_range.CellStyleName = header_style.Name

    amount_column_range = sheet.Range["E2:E" + str(sheet.LastRow)]
    for row_index in range(2, sheet.LastRow + 1):
        cell = sheet.Range[row_index, 5]
        cell.NumberValue = Decimal(cell.Value)
    amount_style = workbook.Styles.Add("CurrencyStyle_CF")
    amount_style.NumberFormat = "$#,##0.00"
    amount_column_range.CellStyleName = amount_style.Name

    date_column_range = sheet.Range["D2:D" + str(sheet.LastRow)]
    date_style = workbook.Styles.Add("DateStyle_CF")
    date_style.NumberFormat = "yyyy-mm-dd"
    date_column_range.CellStyleName = date_style.Name

    data_range = sheet.Range["A1:E" + str(sheet.LastRow)]
    data_range.BorderAround(LineStyleType.Thin, Color.get_Black())
    data_range.BorderInside(LineStyleType.Thin, Color.get_LightGray())

    for row_index in range(2, sheet.LastRow + 1):
        if row_index % 2 == 0:
            row_range = sheet.Range[row_index, 1, row_index, sheet.LastColumn]
            row_range.Style.KnownColor = ExcelColors.LightYellow


    # Apply conditional formatting to the 'Amount' column (E)
    # Highlight amounts greater than 200 with a green background
    conditional_format_range = sheet.Range["E2:E" + str(sheet.LastRow)]
    cf = conditional_format_range.ConditionalFormats.AddCondition()
    cf.FormatType = ConditionValueType.Number
    cf.Operator = ComparisonOperatorType.Greater
    cf.FirstFormula = "190" # Value to compare against
    cf.BackColor = Color.get_LightGreen() # Background color for highlighted cells

    sheet.AllocatedRange.AutoFitColumns()

    workbook.SaveToFile(excel_file_path, ExcelVersion.Version2013)
    workbook.Dispose()
    print(f"Successfully applied conditional formatting in '{excel_file_path}'")

csv_input_path = "sample_data.csv"
excel_output_path = "output_conditional_formatted.xlsx"
apply_conditional_formatting(csv_input_path, excel_output_path)
Enter fullscreen mode Exit fullscreen mode

Below is a preview of the output Excel file with conditional formatting applied:

Conditional Formatting of CSV to Excel Conversion with Python

In this final enhancement, we target the Amount column and add a conditional formatting rule. cf.FormatType = ConditionValueType.Number specifies that the rule is based on the cell's value. cf.Operator = ComparisonOperatorType.Greater sets the condition, and cf.FirstFormula = "190" defines the threshold. Cells meeting this condition will have their background color changed to LightGreen, instantly drawing attention to higher amounts.


Streamlining Data Presentation Workflows

This tutorial has guided you through a practical journey of transforming raw CSV data into professionally styled Excel spreadsheets using Python. We started with a fundamental CSV-to-Excel conversion, then progressively enhanced the output with crucial styling elements: bold headers, auto-fitted columns, precise number and date formatting, and advanced visual cues like borders, alternating row colors, and conditional formatting.

The power of Python, especially when combined with a comprehensive library like Spire.XLS for Python, lies in its ability to automate these intricate tasks. By scripting these processes, you eliminate manual effort, reduce the risk of human error, and ensure consistent, high-quality data presentation across all your reports. This automation not only saves valuable time but also elevates the professionalism and clarity of your data analysis and reporting workflows.

Top comments (0)