Introduction
Whether you are new to Python or have been coding in it for a while, if you like Object Oriented Programming but aren't familiar with the dataclasses module, you've come to the right place!
In this article, we will learn:
- What data classes are and what benefits they offer.
- How exactly they differ from regular Python classes.
- When you should use them.
Data Classes Background
Data classes are used mainly to model data in Python. The @dataclass decorator is applied to a regular Python class and imposes no restrictions, which means the decorated class can still behave like a typical class.
A small example of a Data Class:
from dataclasses import dataclass

@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

The dataclasses module was introduced in Python 3.7 as part of PEP 557.
Let's dive into some code examples.
The Benefits of Data Classes
Built-in implementation of special methods
When using the @dataclass decorator, we don't have to implement special methods ourselves, which helps us avoid boilerplate code. By default the decorator generates the init method (__init__), the string representation method (__repr__), and the equality method (__eq__). When order=True is passed, it also generates the methods used for ordering objects (__lt__, __le__, __gt__, and __ge__), which compare instances as if they were tuples of their fields, in order.
Read about a few other extra built-in methods in the official documentation.
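To make the list above concrete, here is a minimal sketch of what the decorator gives us out of the box. It reuses the Car class from this article; the variable names (first, second, ordering_error) are only illustrative.

```python
from dataclasses import dataclass


@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int


first = Car("red", "Ferrari", 320)
second = Car("red", "Ferrari", 320)

# __eq__ is generated by default and compares the instances field by field.
print(first == second)  # True
# They are still two distinct objects in memory.
print(first is second)  # False
# __repr__ is generated by default as well.
print(repr(first))  # Car(color='red', manufacturer='Ferrari', top_speed_km=320)

# Ordering methods are NOT generated unless order=True is passed,
# so comparing with "<" raises a TypeError here.
ordering_error = None
try:
    first < second
except TypeError as error:
    ordering_error = error
    print(f"TypeError: {error}")
```

Note that equality comes for free, while ordering has to be requested explicitly.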
Here is how it looks with a regular class:
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

    def __init__(self, color: str, manufacturer: str, top_speed_km: int):
        self.color = color
        self.manufacturer = manufacturer
        self.top_speed_km = top_speed_km

    # Order cars by their top speed only
    def __lt__(self, other_car):
        return self.top_speed_km < other_car.top_speed_km

red_ferrari = Car(color='red', manufacturer='Ferrari', top_speed_km=320)
print(red_ferrari) # <__main__.Car object at 0x7f218789ca00>
black_ferrari = Car(color='black', manufacturer='Ferrari', top_speed_km=347)
print(red_ferrari < black_ferrari) # True
Note these two points:
- Because we didn't implement the __repr__ special method, when we print the Car instance, we get the name of the class and the object address.
- To compare two Car instances, we had to implement the "less than" (__lt__) method ourselves.
Example with the dataclass decorator:
from dataclasses import dataclass

@dataclass(order=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

slow_tesla = Car(top_speed_km=261, color='white', manufacturer='Tesla')
print(slow_tesla) # Car(color='white', manufacturer='Tesla', top_speed_km=261)
fast_tesla = Car(top_speed_km=280, color='white', manufacturer='Tesla')
print(slow_tesla < fast_tesla) # True
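It's necessary to set order=True if we want the special ordering methods (e.g. __lt__) to be included in the dataclass. The generated comparison goes through the fields in declaration order, so color and manufacturer are compared before top_speed_km; the result here depends on the top speed only because both cars have the same color and manufacturer.
- When we print the slow_tesla object, we see the actual values of the object, not the object's address, unlike the previous example.
- We can compare two objects without any need for us to implement special methods.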
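Because order=True compares instances as tuples of their fields, the first field that differs decides the result. The sketch below illustrates that, and shows one possible way to order by top speed only: excluding the other fields with field(compare=False) from the dataclasses module. SpeedOrderedCar is a hypothetical class name used only for this example.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int


fast_black = Car(color="black", manufacturer="Tesla", top_speed_km=280)
slow_white = Car(color="white", manufacturer="Tesla", top_speed_km=261)

# color is the first field, and 'black' < 'white', so the faster car
# is considered "smaller" here.
print(fast_black < slow_white)  # True


@dataclass(order=True)
class SpeedOrderedCar:
    # compare=False removes a field from the generated comparison methods.
    # Caveat: this affects __eq__ as well, not only the ordering methods.
    color: str = field(compare=False)
    manufacturer: str = field(compare=False)
    top_speed_km: int


cars = [
    SpeedOrderedCar(color="black", manufacturer="Tesla", top_speed_km=280),
    SpeedOrderedCar(color="white", manufacturer="Tesla", top_speed_km=261),
]
slowest_first = sorted(cars)
print([car.top_speed_km for car in slowest_first])  # [261, 280]
```

The trade-off is that two SpeedOrderedCar instances with the same top speed now compare as equal even if their colors differ, so this approach only fits cases where that is acceptable.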
Inheritance
As with regular Python classes, inheritance works to our advantage here too, and there is no need to deal with the parent class construction:
from dataclasses import dataclass

@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

@dataclass
class ElectricCar(Car):
    battery_capacity_kwh: int
    maximum_range_km: int

white_tesla_model_3 = ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)
print(white_tesla_model_3)
# ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)
Just for reference, here is how it looks using a regular class:
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

    def __init__(self, color: str, manufacturer: str, top_speed_km: int):
        self.color = color
        self.manufacturer = manufacturer
        self.top_speed_km = top_speed_km

class ElectricCar(Car):
    battery_capacity_kwh: int
    maximum_range_km: int

    def __init__(self, color: str, manufacturer: str, top_speed_km: int, battery_capacity_kwh: int, maximum_range_km: int):
        # Let the parent class initialize its own attributes
        super().__init__(color, manufacturer, top_speed_km)
        self.battery_capacity_kwh = battery_capacity_kwh
        self.maximum_range_km = maximum_range_km

white_tesla_model_3 = ElectricCar(color='white', manufacturer='Tesla', top_speed_km=261, battery_capacity_kwh=50, maximum_range_km=455)
print(white_tesla_model_3)
I hope you can see that even in this small code snippet we saved a lot of boilerplate code and didn't have to repeat the initialization of every parameter.
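One detail worth knowing when inheriting: the generated __init__ takes the parent's fields first, then the child's. As a result, once a parent field has a default value, every child field needs a default too. The sketch below assumes a default of 200 for top_speed_km and 50 for battery_capacity_kwh purely for illustration; neither value comes from the examples above.

```python
from dataclasses import dataclass, fields


@dataclass
class Car:
    color: str
    manufacturer: str
    top_speed_km: int = 200  # hypothetical default, for illustration only


definition_failed = False
try:
    # A field without a default cannot follow an inherited field that has one,
    # so the decorator raises a TypeError while the class is being created.
    @dataclass
    class ElectricCar(Car):
        battery_capacity_kwh: int
except TypeError as error:
    definition_failed = True
    print(f"TypeError: {error}")


# Giving the child field a default as well makes the definition valid.
@dataclass
class ElectricCar(Car):
    battery_capacity_kwh: int = 50  # hypothetical default


# fields() lists parent fields first, then the child's own fields.
print([f.name for f in fields(ElectricCar)])
# ['color', 'manufacturer', 'top_speed_km', 'battery_capacity_kwh']
print(ElectricCar(color="white", manufacturer="Tesla"))
# ElectricCar(color='white', manufacturer='Tesla', top_speed_km=200, battery_capacity_kwh=50)
```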
Frozen Instances
Passing frozen=True to the data class decorator lets us create immutable Python objects.
from dataclasses import dataclass

@dataclass(frozen=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int

white_tesla = Car(color='white', manufacturer='Tesla', top_speed_km=261)
white_tesla.color = 'Red'
Trying to change white_tesla into a red Tesla raises a FrozenInstanceError:
dataclasses.FrozenInstanceError: cannot assign to field 'color'
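Note: Frozen instances come with a small performance penalty, because the generated __init__ has to set each field through object.__setattr__ instead of a plain assignment, so use them deliberately.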
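When a frozen instance does need to "change", a common approach is to build a modified copy instead. Here is a minimal sketch using replace() from the dataclasses module. It also shows that frozen data classes (with the default eq=True) are hashable, so they can be stored in sets or used as dictionary keys. The red_tesla and garage names are only illustrative.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Car:
    color: str
    manufacturer: str
    top_speed_km: int


white_tesla = Car(color="white", manufacturer="Tesla", top_speed_km=261)

# Direct assignment is rejected on a frozen instance.
try:
    white_tesla.color = "red"
except FrozenInstanceError as error:
    print(error)  # cannot assign to field 'color'

# replace() returns a new instance with the given fields changed;
# the original object is left untouched.
red_tesla = replace(white_tesla, color="red")
print(red_tesla)  # Car(color='red', manufacturer='Tesla', top_speed_km=261)
print(white_tesla)  # Car(color='white', manufacturer='Tesla', top_speed_km=261)

# Frozen instances get a generated __hash__, so equal cars collapse in a set.
garage = {white_tesla, red_tesla, Car("white", "Tesla", 261)}
print(len(garage))  # 2
```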
Now, you:
- Are familiar with what a data class is and how to use it.
- Learned about the benefits and use cases of data classes.
- Have some code examples which can help you get started.
- Can start using data classes in your projects.
Conclusion
dataclasses is a powerful module that helps us, Python developers, model our data, avoid writing boilerplate code, and write much cleaner and more elegant code.
I encourage you to explore and learn more about the special features of data classes. I use them in all of my projects, and I recommend you do the same.
Extra Resources
https://docs.python.org/3/library/dataclasses.html
https://realpython.com/python-data-classes/