Bas codes

Originally published at bas.codes

A Comprehensive Guide to Slicing in Python

In Python, some objects such as strs or lists can be sliced. The simplest case is indexing a single element: for example, you can get the first element of a list or a string with

my_list = [1,2,3]
print(my_list[0]) # 1

my_string = "Python"
print(my_string[0]) # P

Python uses square brackets ([ and ]) to access single elements of objects that can be decomposed into parts.

However, there is more to the inside of these square brackets than just accessing individual elements:

Negative Indexing

Perhaps you already know that you can use negative indices in Python like so:

my_list = list("Python")
print(my_list[-1]) # n

Something like my_list[-1] represents the last element of a list, my_list[-2] represents the second-to-last element, and so on.
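
One way to read a negative index, sketched below: for a list of length n, my_list[-k] refers to the same element as my_list[n - k]. The loop is only an illustration, and the variable names n and k are chosen here.

```python
my_list = list("Python")
n = len(my_list)

# For k = 1..n, the negative index -k and the positive index n - k
# point at the same element.
for k in range(1, n + 1):
    assert my_list[-k] == my_list[n - k]
    print(-k, n - k, my_list[-k])

# Going past either end raises an IndexError.
try:
    my_list[-7]
except IndexError as error:
    print("IndexError:", error)
```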

The Colon

What if you want to retrieve more than one element from a list? Say you want everything from start to end except for the very last one. In Python, no problemo:

my_list = list("Python")
print(my_list[0:-1]) # ['P', 'y', 't', 'h', 'o']

Or, what if you want every element at an even index, i.e. elements 0, 2, etc.?
For this, we need to go from the first element to the last element but take only every second item. We could write that as:

my_list = list("Python")
print(my_list[0:len(my_list):2]) # ['P', 't', 'o']

The slice Object

Behind the scenes, the colon notation we use inside the square brackets is turned into an object that consists of three values: (start, stop, step). These objects are called slice objects and can be created manually with the built-in slice function.

We can check if the two are indeed the same:

my_list = list("Python")
start = 0
stop = len(my_list)
step = 2
slice_object = slice(start, stop, step)
print(my_list[start:stop:step] == my_list[slice_object]) # True
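
Because a slice object is an ordinary value, it can be stored under a name and reused on different sequences. A small sketch of that idea (the names every_second and last_three are made up for this illustration):

```python
my_list = list("Python")

every_second = slice(0, None, 2)  # same as [0::2]
last_three = slice(-3, None)      # same as [-3:]

# The three values are available as read-only attributes.
print(every_second.start, every_second.stop, every_second.step)  # 0 None 2

# The same slice object works on any built-in sequence type.
print(my_list[every_second])        # ['P', 't', 'o']
print("Python"[last_three])         # hon
print((1, 2, 3, 4, 5)[last_three])  # (3, 4, 5)
```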

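[Figure: Python slices. The letters of "Python" with their indices (green boxes), negative indices (blue boxes), and slice boundaries (orange boxes).]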

Have a look at the graphic above. The letter P is the first element in our list, thus it can be indexed by 0 (see the numbers in the green boxes). The list has a length of 6, and therefore, the first element can alternatively be indexed by -6 (negative indexing is shown in the blue boxes).

The numbers in the green and blue boxes identify single elements of the list. Now, look at the numbers in the orange boxes. These determine the slice indices of the list. If we use the slice's start and stop, every element between these numbers is covered by the slice. Some examples:

"Python"[0:1] # P
"Python"[0:5] # Pytho

That's just an easy way to remember that the start value is inclusive and the end value is exclusive.
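
The inclusive start and exclusive stop have a few handy consequences, sketched here with the same string: for bounds within range, the length of s[a:b] is b - a; two slices that share a boundary join back into the original; and slice bounds outside the sequence are clamped instead of raising an error.

```python
s = "Python"

# Start is inclusive, stop is exclusive, so the length is stop - start.
assert len(s[1:4]) == 4 - 1
print(s[1:4])  # yth

# Slices that share a boundary never overlap and never leave a gap.
for k in range(len(s) + 1):
    assert s[:k] + s[k:] == s

# Unlike single-element indexing, out-of-range slice bounds are clamped.
print(s[2:100])        # thon
print(repr(s[10:20]))  # ''
```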

Sane defaults

Most of the time, you want to slice your list by

  • starting at 0
  • stopping at the end
  • stepping with a width of 1

Therefore, these are the default values and can be omitted in our : syntax:

my_list = list("Python")
print(my_list[0:-1] == my_list[:-1]) # True
print(my_list[0:len(my_list):2] == my_list[::2]) # True

Technically, whenever we omit a number between colons, the omitted ones will have the value of None.

And in turn, the slice object will replace None with

  • 0 for the start value
  • len(list) for the stop value
  • 1 for the step value

However, if the step value is negative, the Nones are replaced with

  • -1 for the start value
  • -len(list) - 1 for the stop value

For example, "Python"[::-1] is technically the same as "Python"[-1:-7:-1].
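
To see what the colon syntax actually passes along, a tiny helper class can simply return whatever key it receives. The class name ShowKey is made up for this sketch; it is not part of Python.

```python
class ShowKey:
    """Hypothetical helper: returns the key passed inside the square brackets."""

    def __getitem__(self, key):
        return key


show = ShowKey()
print(show[:])     # slice(None, None, None)
print(show[1:])    # slice(1, None, None)
print(show[::-1])  # slice(None, None, -1)
print(show[3])     # 3 (a plain integer, not a slice)

# With a negative step, the omitted start and stop behave like -1 and -len - 1.
word = "Python"
assert word[::-1] == word[-1:-len(word) - 1:-1] == "nohtyP"
```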

Special Case: Copy

Slicing offers a handy shortcut for one special case: if you use only the default values, i.e. my_list[:], you get a list with exactly the same items:

my_list = list("Python")
my_list_2 = my_list[:]
print(my_list == my_list_2) # True

The elements in the list are indeed the same. However, the list object is not. We can check that by using the id builtin:

print(id(my_list))
print(id(my_list_2))

Note that every slice operation on a list returns a new list object. Using just [:] therefore creates a (shallow) copy of our sequence.

Here are two code snippets to illustrate the difference:

a = list("Python")
b = a
a[-1] = "N"
print(a)
# ['P', 'y', 't', 'h', 'o', 'N']
print(b)
# ['P', 'y', 't', 'h', 'o', 'N']
# The same steps again, but this time b is a copy created with [:]
a = list("Python")
b = a[:]
a[-1] = "N"
print(a)
# ['P', 'y', 't', 'h', 'o', 'N']
print(b)
# ['P', 'y', 't', 'h', 'o', 'n']
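
One caveat worth a sketch: the copy made by [:] is shallow. The new list is a separate object, but it holds references to the same elements, which matters when the elements are themselves mutable. The nested list below is made up for this illustration.

```python
nested = [["P", "y"], ["t", "h"], ["o", "n"]]
shallow = nested[:]

# The outer lists are different objects ...
assert shallow is not nested

# ... so replacing an element in one does not affect the other.
nested[0] = ["X"]
print(shallow[0])  # ['P', 'y']

# But the inner lists are shared: mutating one is visible through both.
nested[1].append("!")
print(shallow[1])  # ['t', 'h', '!']
```

If fully independent nested data is needed, copy.deepcopy from the standard library is the usual tool.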

Examples

Some frequently used examples:

Use case | Python code
Every element | no slice, or [:] for a copy
Every second element | [::2] (even indices) or [1::2] (odd indices)
Every element but the first one | [1:]
Every element but the last one | [:-1]
Every element but the first and the last one | [1:-1]
Every element in reverse order | [::-1]
Every element but the first and the last one in reverse order | [-2:0:-1]
Every second element but the first and the last one in reverse order | [-2:0:-2]
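
The table entries can be checked against the running example. This sketch applies each slice to the string "Python"; the dictionary is only a convenient way to print them side by side.

```python
s = "Python"

examples = {
    "[:]": s[:],
    "[::2]": s[::2],
    "[1::2]": s[1::2],
    "[1:]": s[1:],
    "[:-1]": s[:-1],
    "[1:-1]": s[1:-1],
    "[::-1]": s[::-1],
    "[-2:0:-1]": s[-2:0:-1],
    "[-2:0:-2]": s[-2:0:-2],
}

for notation, result in examples.items():
    print(notation, "->", result)

# [:] -> Python
# [::2] -> Pto
# [1::2] -> yhn
# [1:] -> ython
# [:-1] -> Pytho
# [1:-1] -> ytho
# [::-1] -> nohtyP
# [-2:0:-1] -> ohty
# [-2:0:-2] -> ot
```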

Assignments

p = list("Python")
# ['P', 'y', 't', 'h', 'o', 'n']
p[1:-1]
# ['y', 't', 'h', 'o']
p[1:-1] = 'x' # the right-hand side can be any iterable; its length may differ from the slice
print(p)
# ['P', 'x', 'n']

p = list("Python")
p[1:-1] = ['x'] * 4
p
# ['P', 'x', 'x', 'x', 'x', 'n']
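
A few related details, sketched here: when the slice has a step other than 1, the replacement must have exactly as many items as the slice selects; del with a slice removes the selected items; and assigning to an empty slice inserts items.

```python
p = list("Python")

# With a step, the number of new items must match the number of slots.
p[::2] = ["a", "b", "c"]
print(p)  # ['a', 'y', 'b', 'h', 'c', 'n']

try:
    p[::2] = ["x"]  # three slots, but only one item
except ValueError as error:
    print("ValueError:", error)

# del removes the items selected by the slice.
del p[1:-1]
print(p)  # ['a', 'n']

# Assigning to an empty slice inserts without replacing anything.
p[1:1] = ["b", "c"]
print(p)  # ['a', 'b', 'c', 'n']
```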

Understanding the loop

Every slice object in Python has an indices method. This method returns a tuple (start, stop, step) with which you can rebuild a loop equivalent to the slicing operation.
Sounds complicated? Let's have a closer look:

Let's start with a sequence:

sequence = list("Python")

Then, we create a slice object. Let's take every second element, i.e. [::2].

my_slice = slice(None, None, 2) # equivalent to `[::2]`.

Since we're using Nones, the slice object needs to calculate the actual index values based on the length of our sequence. Therefore, to get our index triple, we need to pass the length to the indices method, like so:

indices = my_slice.indices(len(sequence))

This will give us the triple (0, 6, 2). We now can recreate the loop like so:

sequence = list("Python")
start = 0
stop = 6
step = 2
i = start
while i < stop: # "<" works for any positive step; "!=" could step past stop
    print(sequence[i])
    i = i + step

This accesses the same elements of our list as the slice itself would do.
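
A more compact way to express the same loop, sketched under the assumption that the three values from indices are passed straight to the built-in range, which also copes with negative steps. The helper name items_via_indices is made up for this illustration.

```python
sequence = list("Python")


def items_via_indices(seq, a_slice):
    """Rebuild seq[a_slice] with an explicit loop (illustrative helper)."""
    start, stop, step = a_slice.indices(len(seq))
    return [seq[i] for i in range(start, stop, step)]


test_slices = (
    slice(None, None, 2),   # [::2]
    slice(None, None, -1),  # [::-1]
    slice(1, -1),           # [1:-1]
    slice(-2, 0, -2),       # [-2:0:-2]
)
for a_slice in test_slices:
    print(a_slice.indices(len(sequence)), items_via_indices(sequence, a_slice))
    assert items_via_indices(sequence, a_slice) == sequence[a_slice]
```

For [::-1] on a sequence of length 6, indices returns (5, -1, -1). The -1 stop is meant as a bound for range ("one step before index 0"), not as a negative index.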

Making Your Own Classes Sliceable

Python wouldn't be Python if you could not use the slice object in your own classes.
Even better, the values in a slice do not need to be numerical. We could build an address book which is sliceable by alphabetical indices.

import string
class AddressBook:
    def __init__(self):
        self.addresses = []

    def add_address(self, name, address):
        self.addresses.append((name, address))

    def get_addresses_by_first_letters(self, letters):
        letters = letters.upper()
        return [(name, address) for name, address in self.addresses if any(name.upper().startswith(letter) for letter in letters)]

    def __getitem__(self, key):
        if isinstance(key, str):
            return self.get_addresses_by_first_letters(key)
        if isinstance(key, slice):
            start, stop, step = key.start, key.stop, key.step
            letters = (string.ascii_uppercase[string.ascii_uppercase.index(start):string.ascii_uppercase.index(stop)+1:step])
            return self.get_addresses_by_first_letters(letters)

address_book = AddressBook()
address_book.add_address("Sherlock Holmes",       "221B Baker St., London")
address_book.add_address("Wallace and Gromit",    "62 West Wallaby Street, Wigan, Lancashire")
address_book.add_address("Peter Wimsey",          "110a Piccadilly, London")
address_book.add_address("Al Bundy",              "9764 Jeopardy Lane, Chicago, Illinois")
address_book.add_address("John Dolittle",         "Oxenthorpe Road, Puddleby-on-the-Marsh, Slopshire, England")
address_book.add_address("Spongebob Squarepants", "124 Conch Street, Bikini Bottom, Pacific Ocean")
address_book.add_address("Hercule Poirot",        "Apt. 56B, Whitehaven Mansions, Sandhurst Square, London W1")
address_book.add_address("Bart Simpson",          "742 Evergreen Terrace, Springfield, USA")


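print(string.ascii_uppercase)            # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.ascii_uppercase.index("A")) # 0
print(string.ascii_uppercase.index("Z")) # 25


print(address_book["A"])     # the entry for Al Bundy
print(address_book["B"])     # the entry for Bart Simpson
print(address_book["S"])     # the entries for Sherlock Holmes and Spongebob Squarepants
print(address_book["A":"H"]) # the entries for Al Bundy, Hercule Poirot and Bart Simpson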

Explanation

The get_addresses_by_first_letters method

    def get_addresses_by_first_letters(self, letters):
        letters = letters.upper()
        return [(name, address) for name, address in self.addresses if any(name.upper().startswith(letter) for letter in letters)]

This method filters all addresses belonging to a name starting with any letter in the letters argument. First, we make the function case insensitive by converting our letters to uppercase. Then, we use a list comprehension over our internal addresses list. The condition inside the list comprehension tests if any of the provided letters matches the first letter of the corresponding name value.
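
As an aside, str.startswith also accepts a tuple of prefixes, so the same filter can be written without any(). The standalone function below is a sketch for comparison, not the author's code; the name filter_by_first_letters and the sample data are made up here.

```python
def filter_by_first_letters(addresses, letters):
    """Return the (name, address) pairs whose name starts with one of the letters."""
    prefixes = tuple(letters.upper())  # "ab" -> ("A", "B")
    return [(name, address) for name, address in addresses
            if name.upper().startswith(prefixes)]


sample = [
    ("Sherlock Holmes", "221B Baker St., London"),
    ("Al Bundy", "9764 Jeopardy Lane, Chicago, Illinois"),
    ("Bart Simpson", "742 Evergreen Terrace, Springfield, USA"),
]
for name, address in filter_by_first_letters(sample, "ab"):
    print(name, "|", address)
# Al Bundy | 9764 Jeopardy Lane, Chicago, Illinois
# Bart Simpson | 742 Evergreen Terrace, Springfield, USA
```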

The __getitem__ method

To make our AddressBook objects sliceable, we need to implement Python's magic double underscore method __getitem__.

    def __getitem__(self, key):
        if isinstance(key, str):
            return self.get_addresses_by_first_letters(key)
        if isinstance(key, slice):
            start, stop, step = key.start, key.stop, key.step
            letters = (string.ascii_uppercase[string.ascii_uppercase.index(start):string.ascii_uppercase.index(stop)+1:step])
            return self.get_addresses_by_first_letters(letters)

At first, we check if our key is a str. This will be the case if we access our object with a single letter in square brackets like so: address_book["A"]. We can just return any addresses whose name starts with the given letter for this trivial case.

The interesting part is when the key is a slice object. For example, an access like address_book["A":"H"] would match that condition.
First, we identify all letters alphabetically between A and H. The string module in Python lists all uppercase (Latin) letters in string.ascii_uppercase. We use a slice to extract the letters between the given letters. Note the +1 in the second slice parameter. This way, we ensure that the last letter is inclusive, not exclusive.

After we have determined all letters in our sequence, we call the get_addresses_by_first_letters method, which we already discussed. This gives us the result we want.
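
The __getitem__ method shown above assumes that both slice bounds are given as uppercase letters. A possible, more defensive variant is sketched below. The class name LenientAddressBook, the helper _letters_for_slice, and the choice to treat a missing start as "A" and a missing stop as "Z" are assumptions made for this illustration, not part of the original class.

```python
import string


class LenientAddressBook:
    """Sketch of an address book that tolerates open-ended and lowercase slices."""

    def __init__(self):
        self.addresses = []

    def add_address(self, name, address):
        self.addresses.append((name, address))

    def get_addresses_by_first_letters(self, letters):
        prefixes = tuple(letters.upper())
        return [(name, address) for name, address in self.addresses
                if name.upper().startswith(prefixes)]

    @staticmethod
    def _letters_for_slice(key):
        alphabet = string.ascii_uppercase
        # Missing bounds default to the ends of the alphabet (an assumption).
        start = alphabet.index((key.start or "A").upper())
        stop = alphabet.index((key.stop or "Z").upper()) + 1  # inclusive stop
        return alphabet[start:stop:key.step]

    def __getitem__(self, key):
        if isinstance(key, str):
            return self.get_addresses_by_first_letters(key)
        if isinstance(key, slice):
            return self.get_addresses_by_first_letters(self._letters_for_slice(key))
        # Fail loudly instead of silently returning None.
        raise TypeError(f"unsupported key type: {type(key).__name__}")


book = LenientAddressBook()
book.add_address("Al Bundy", "9764 Jeopardy Lane, Chicago, Illinois")
book.add_address("Hercule Poirot", "Apt. 56B, Whitehaven Mansions, Sandhurst Square, London W1")
book.add_address("Sherlock Holmes", "221B Baker St., London")

print([name for name, _ in book["a":"h"]])  # ['Al Bundy', 'Hercule Poirot']
print([name for name, _ in book["R":]])     # ['Sherlock Holmes']
```

Raising a TypeError for unsupported keys mirrors what built-in sequences do when indexed with something they do not understand.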
