Python data model
Why len and not list.length?
This is a recurrent question when it comes to writing Python code, and if you are familiar with OOP, you probably asked the same question before.
To talk about this, we should take a look into special methods some other languages, like Ruby, call them magic methods.
These methods are not intended to be invoked directly by the user; instead, the interpreter invokes them.
As a result, we can deduce that the truth of the matter is len(sequence) built-in function calls sequence.__len__() behind the scenes.
That being said, our question remains unanswered, but actually, Python has a pretty good explanation for this behavior.
A user defined __len__ method to calculate the length of a sequence might be very custom, and vary from user to user.
But __len__ method of Python built-in types, like list, tuple, str or any other sequence is more complicated, and faster, than that.
Since Python is written in C, invoking len function with a built-in sequence type, Python, rather than iterating over the sequence and count how many items are inside it, looks for a C data structure field called ob_size, which is way faster.
This is because, __len__ method of the built-in sequences is written to do so.
We can conclude, Python is quite flexible; therefore, we are free to create any custom type, customizing each special method of our classes as much as we need.
In this article, I will show you how defining a couple of special methods, you will be able to create classes which behaves as any built-in Python sequence.
Writing a Playlist class
To show you, how to create a Playlist class which will behave as a Python builtin sequence we will create a new Song namedtuple , after all, what is a Playlist without songs?
Why a namedtuple and not a class?
Since my Song is only a “vessel” that don’t require any special logic and its data will remain the same during all the program execution, a namedtuple is a better option to use, they also have a nice human-readable string representation, which is easier to read when we use print.
from collections import namedtuple
song_attrs = ("name", "album", "artist")
Song = namedtuple("Song", song_attrs)
Ok, now our Song type is done, then, let’s start with the Playlist class.
from typing import List
class Playlist:
def __init__(self):
self.__songs: List[Song] = []
So far, we created a Playlist class which has a songs private attribute, this attribute is specifically a list of Song instances.
The next step is to implement a __len__ and a __getitem__ methods as well, Python does not use interfaces, but it actually requires implementing at least those two methods in order to considere your class as a sequence.
from typing import List
class Playlist:
def __init__(self):
self.__songs: List[Song] = []
@property
def songs(self) -> List[Song]:
return self.__songs
def __len__(self) -> int:
return len(self.songs)
def __getitem__(self, index: int) -> Song:
return self.songs[index]
def add_song(self, song: Song):
self.songs.append(song)
In the above’s example, we defined a property which is actually optional, this is an equivalent of a getter, the advantage of using properties is, since we didn’t define a setter trying to set the songs property will raise an exception rather than allowing the user to override the full attribute, making it “read-only”.
On the other hand, we defined a __len__ method, which actually returns the length of the list of songs, then we can now use the next snippet.
my_playlist = Playlist()
album = "Soft Sounds"
artist = "Delta Sleep"
names = "Strongthany,Dotwork,Camp adventure,Sans Soleil"
for name in names.split(","):
my_playlist.add_song(Song(name, album, artist))
len(my_playlist)
# Out: 4
But we defined a __getitem__ as well, this method, “unlocks” the sequence[index] syntax; therefore, we can access any of our songs by index.
my_playlist[0].name
# Out: Strongthany
my_playlist[-1]
# Out: Song(name='Sans Soleil', album='Soft Sounds', artist='Delta Sleep')
But that’s not all! It also “unlocks” slicing making us able to use the following syntax:
my_playlist[1:3]
# Out: [Song(name='Dotwork', album='Soft Sounds', artist='Delta Sleep'), Song(name='Camp adventure', album='Soft Sounds', artist='Delta Sleep')]
my_playlist[::-1]
# Out: [Song(name='Sans Soleil', album='Soft Sounds', artist='Delta Sleep'), Song(name='Camp adventure', album='Soft Sounds', artist='Delta Sleep'), Song(name='Dotwork', album='Soft Sounds', artist='Delta Sleep'), Song(name='Strongthany', album='Soft Sounds', artist='Delta Sleep')]
This ability, allows us to iterate over our sequence using a for sentence, as if the Playlist instances were built-in list instances, this is because, we actually delegated all the sequence implementation to our __songs attribute.
for song in my_playlist:
print(song.name, "by ", song.artist)
# Out:
# Strongthany by Delta Sleep
# Dotwork by Delta Sleep
# Camp adventure by Delta Sleep
# Sans Soleil by Delta Sleep
Although, no, it is not all yet! Many of the Python built-in modules, require a sequence to work, and, since our Playlist class is technically a sequence, we can use them with our instances.
for song in reversed(my_playlist):
print(song.name, "by ", song.artist)
# Out:
# Sans Soleil by Delta Sleep
# Camp adventure by Delta Sleep
# Dotwork by Delta Sleep
# Strongthany by Delta Sleep
from random import choice
choice(my_playlist)
# Out: Song(name='Dotwork', album='Soft Sounds', artist='Delta Sleep')
Conclusion
Python data model provides much flexibility, I haven’t worked (yet) with other language that actually allows you to emulate its built-in types, Go slices are useful, yes, but you cannot create a custom structure that fulfills your program necessities and behave as a built-in slice.
As you might have guessed, yes, you can emulate any other built-in type in Python.
For this article, I based myself on “Fluent Python, 2nd Edition” a great book written by Luciano Ramalho one of my biggest influences to become a Pythonista, please take a look into the book! You’ll learn many new things, just as I did.
Top comments (0)