In Python backend development, data analytics, and office automation workflows, reading Excel files and converting them into efficient, program-friendly data structures is a common requirement.
Many developers start by accessing spreadsheet data using numeric indexes such as row[2] or col[5]. While this approach is quick to implement, it introduces a serious row/column coupling problem . The code becomes difficult to read and maintain, and even a minor change to the spreadsheet layout—such as reordering, inserting, or removing columns—can break large portions of the application.
In this article, we'll use Spire.XLS for Python to demonstrate a three-level evolution of Excel data modeling:
- Raw two-dimensional lists
- Lists of dictionaries
- Lists of custom business objects
Each approach serves different use cases and complexity levels. By moving beyond hardcoded indexes, you can make your Excel-processing code more readable, maintainable, and scalable.
Prerequisites
All examples in this article use the spire.xls package, which provides Excel reading, writing, formatting, and batch-processing capabilities without requiring Microsoft Excel to be installed.
Install the library with:
pip install spire.xls
Approach 1: Store Excel Data as a Two-Dimensional List
How It Works
This is the most straightforward way to read Excel data. We iterate through the worksheet's used range row by row and cell by cell, storing everything in a two-dimensional list that preserves the original spreadsheet structure.
Complete Example
from spire.xls import Workbook
# Load the workbook and worksheet
workbook = Workbook()
workbook.LoadFromFile("SalesReport.xlsx")
sheet = workbook.Worksheets[0]
# Get the used range
cell_range = sheet.AllocatedRange
# Store all data in a 2D list
excel_data = []
for row_idx in range(cell_range.RowCount):
single_row = []
for col_idx in range(cell_range.ColumnCount):
# Spire.XLS uses 1-based indexes
single_row.append(
cell_range[row_idx + 1, col_idx + 1].Value
)
excel_data.append(single_row)
# Release resources
workbook.Dispose()
Pros and Cons
Advantages
- Extremely simple implementation
- Preserves the original row-and-column structure
- No additional transformation overhead
Drawbacks
All data access relies on numeric indexes:
excel_data[row][col]
The code contains no semantic meaning. If the spreadsheet structure changes, index references throughout the application must be updated manually, making maintenance difficult and error-prone.
Best For
- Quick prototypes
- One-off scripts
- Matrix calculations
- Temporary data inspection
For production applications, this structure is rarely ideal.
Approach 2: Convert Rows into Dictionaries
How It Works
To eliminate hardcoded column indexes, we can use the first row as column headers and map each subsequent row into a dictionary.
Instead of retrieving data by position, we access it by field name. This removes the dependency on column order and greatly improves readability.
Complete Example
from spire.xls import Workbook
workbook = Workbook()
workbook.LoadFromFile("SalesReport.xlsx")
sheet = workbook.Worksheets[0]
cell_range = sheet.AllocatedRange
# Extract headers from the first row
rows = list(cell_range.Rows)
headers = [
cell_range[1, col_idx + 1].Value
for col_idx in range(cell_range.ColumnCount)
]
# Build a list of dictionaries
data_list = []
for row in rows[1:]: # Skip header row
row_dict = {}
for idx, cell in enumerate(row.Cells):
row_dict[headers[idx]] = cell.Value
data_list.append(row_dict)
workbook.Dispose()
Pros and Cons
Advantages
Data can now be accessed using meaningful field names:
data_list[0]["Sales"]
Benefits include:
- Improved readability
- Independence from column order
- Easy JSON serialization
- Convenient integration with APIs and data-processing pipelines
- Better compatibility with Pandas workflows
Drawbacks
Dictionary-based structures are still weakly typed. Values may require additional validation and type conversion before use.
Best For
- Data import/export workflows
- Data cleaning tasks
- API payload generation
- Business reporting
- General-purpose Excel processing
For most applications, this is often the best balance between simplicity and maintainability.
Approach 3: Map Rows to Custom Business Objects
How It Works
When working with fixed schemas and more complex business logic, dictionaries can become limiting. They offer no type safety, no IntelliSense support, and no natural place to encapsulate business rules.
A more robust approach is to define a business entity class and map each Excel row to an object instance.
This creates a strongly typed model that supports validation, business methods, and better developer tooling.
Complete Example
# Business entity definition
class Employee:
def __init__(self, name: str, age: int | None, department: str):
self.name = name
self.age = age
self.department = department
def is_adult(self) -> bool:
"""Return True if the employee is an adult."""
return self.age >= 18 if self.age else False
from spire.xls import Workbook
workbook = Workbook()
workbook.LoadFromFile("EmployeeData.xlsx")
sheet = workbook.Worksheets[0]
cell_range = sheet.AllocatedRange
employee_list = []
# Skip the header row
for row in list(cell_range.Rows)[1:]:
name = row.Cells[0].Value
age = (
int(row.Cells[1].Value)
if row.Cells[1].Value
else None
)
department = row.Cells[2].Value
employee = Employee(
name,
age,
department
)
employee_list.append(employee)
workbook.Dispose()
Advantages and Use Cases
Benefits
- Strong typing through explicit type conversion
- Better data validation
- IDE auto-completion and type hints
- Encapsulation of business logic
- Improved maintainability
- Cleaner object-oriented design
For example:
employee.is_adult()
Business rules can be implemented directly within the entity instead of being scattered throughout the codebase.
Best For
- Enterprise applications
- Stable, well-defined data schemas
- Systems with complex business rules
- Long-term maintainable projects
Choosing the Right Structure
The best choice depends on the complexity of your application and how the data will be used.
| Structure | Advantages | Drawbacks | Recommended Use Cases |
|---|---|---|---|
| Two-Dimensional List | Simple, fast, minimal transformation | Hardcoded indexes, poor readability, difficult maintenance | Quick scripts, data previews, matrix operations |
| List of Dictionaries | Readable, flexible, serialization-friendly | Weak typing, limited validation | Data analysis, data synchronization, API integration |
| List of Custom Objects | Strong typing, extensible, business-logic friendly | Requires additional class definitions | Enterprise projects, stable schemas, complex business workflows |
Important: Always Release Workbook Resources
Every example in this article calls:
workbook.Dispose()
This is an important best practice when working with Spire.XLS .
The library maintains file handles while a workbook is open. If resources are not explicitly released, long-running applications or batch-processing jobs may encounter issues such as:
- Locked Excel files
- Increased memory usage
- Inability to modify files later
- Resource leaks in server environments
For this reason, you should always dispose of the workbook once processing is complete.
Conclusion
The progression from:
- Numeric indexes in two-dimensional lists
- Semantic field access with dictionaries
- Strongly typed business objects
reflects a broader evolution from data-oriented programming toward business-oriented modeling .
Simple scripts do not require elaborate abstractions. For most real-world Excel-processing tasks, a list of dictionaries offers an excellent balance between flexibility and maintainability. When your application involves complex business rules and stable schemas, custom entity objects become the most robust long-term solution.
Choosing the right data structure can significantly reduce code complexity, improve readability, minimize bugs, and make your Excel-processing workflows easier to maintain as your projects grow.
Top comments (0)