Like the articles? Buy the book! Dead Simple Python by Jason C. McDonald is available from No Starch Press.
The previous section more or less ended on a cliffhanger.
When we last left our hero, we'd discovered loops and iterators in Python, and tasted a bit of the potential they offer.
I got a few choice words from my editors: "you left out zip() and enumerate(), and those are surely very important to any discussion on iterators!" Yes, they are, but the article was getting a bit long. Never fear, though - we're about to tackle them, and many more!
By the way, if you haven't read the previous section, "Loops and Iterators", you'll want to go back and do that now! Don't worry, I'll wait.
It may seem strange to dedicate an entire article to a handful of built-in functions, but these contribute a lot of magic to Python.
Revisiting range
Remember the range() function from the previous article? We briefly covered how it could be used to generate a sequence of numbers, but it has more power than it seems to at first glance.
Start and Stop
The first hurdle to using range() is understanding the arguments: range(start, stop). start is inclusive; we start on that actual number. stop, however, is exclusive, meaning we stop before we get there!
So, if we have range(1, 10), we get [1, 2, 3, 4, 5, 6, 7, 8, 9]. We start on 1, but we never actually get to 10; we stop one short.
If we wanted to include 10 in our sequence, we'd need range(1, 11): [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].
By the way, if we only specify one argument, like range(10), it will assume the start of the range is 0. In this case, we'd get [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. You'll see range() used in this manner quite often when it is used to control a traditional for loop.
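If you'd like to check those three cases for yourself, here's a minimal sketch. The variable names are just for illustration, and the list() wrapper (explained a little further down) is only there so we can see the numbers.

```python
# Wrapping range() in list() lets us see the numbers it produces.
one_to_nine = list(range(1, 10))    # start is inclusive, stop is exclusive
one_to_ten = list(range(1, 11))     # bump stop by one to include 10
zero_to_nine = list(range(10))      # a single argument is the stop; start defaults to 0

print(one_to_nine)
print(one_to_ten)
print(zero_to_nine)

# The single-argument form is common for "do this N times" loops.
for attempt in range(3):
    print(f'Attempt {attempt}')
```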
Skipping Along
My favorite trick with range() is its optional third argument: when you specify range(start, stop, step), the step argument allows you to increment by values greater than 1.
One use might be to print out all the multiples of 7, from 7 itself to 700, inclusively: range(7, 701, 7) would do just that. (Take note, I specified 701 for the end, to ensure 700 would be included.)
Another use might be to print all odd numbers less than 100: range(1, 100, 2).
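Here's a scaled-down sketch of the same idea, using smaller numbers so the output stays short. The last part goes slightly beyond what's described above: step may also be negative, as long as start is greater than stop.

```python
# Multiples of 7 up to (and including) 35; stop is 36 so that 35 makes the cut.
small_sevens = list(range(7, 36, 7))
for n in small_sevens:
    print(n)

# Odd numbers below 10.
odds = list(range(1, 10, 2))
print(odds)

# Not covered above: step may also be negative, provided start > stop.
countdown = list(range(10, 0, -2))
print(countdown)
```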
Storing Ranges
If you're trying ranges out, you'll probably notice that this doesn't do what you expect:
sevens = range(7, 701, 7)
print(sevens)
The print command prints the literal phrase range(7, 701, 7). That's not what we wanted!
Remember, range() returns an object that is like an iterator (but isn't exactly). To store that as a list outright, we'd need to explicitly turn it into a list, by wrapping it in the list() function:
sevens = list(range(7, 701, 7))
print(sevens)
Now that output is what we wanted - a list of the first hundred multiples of 7!
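To make the "like an iterator, but isn't exactly" remark a bit more concrete, here's a rough sketch of a few things a range object can do without being turned into a list first. The variable names are mine, purely for illustration.

```python
sevens = range(7, 701, 7)

how_many = len(sevens)          # how many numbers it would produce
first = sevens[0]               # indexing works, just like on a list
has_343 = 343 in sevens         # so does membership testing
has_344 = 344 in sevens
print(how_many, first, has_343, has_344)

# Unlike a one-shot iterator, a range can be looped over more than once.
first_pass = list(sevens)
second_pass = list(sevens)
print(first_pass == second_pass)
```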
Slicing
Before we jump into all this new iteration goodness, I want to introduce you to extended indexing notation, which allows us to more powerfully select elements from an ordered container, such as a list.
Speaking of lists, let's put one together:
dossiers = ['Sir Vile', 'Dr. Belljar', 'Baron Grinnit', 'Medeva', 'General Mayhem', 'Buggs Zapper', 'Jacqueline Hyde', 'Jane Reaction', 'Dee Cryption']
Whether you realize it or not, you already know normal index notation.
print(dossiers[1])
>>> Dr. Belljar
That returned the second element (index 1) of the dossiers container. Simple enough, right? Virtually all languages offer this behavior.
So, what if we want the second and third elements?
print(dossiers[1:3])
>>> ['Dr. Belljar', 'Baron Grinnit']
What just happened? In extended indexing notation, we have three arguments, separated by colons: start, stop, and step. Hey, sound familiar? It should - those are the same arguments that range() uses! They work exactly the same way, too. (Of course, we left off the third argument [step] in the example above.)
Take note, that example printed out Dr. Belljar (index 1) and Baron Grinnit (index 2), but not Medeva, because the stop argument is exclusive; we stop just short of it.
Do take note, start must be less than stop for you to get any results! There is an exception, though, which we'll talk about shortly.
Now, what if you wanted every other dossier, starting with the second one?
print(dossiers[1::2])
>>> ['Dr. Belljar', 'Medeva', 'Buggs Zapper', 'Jane Reaction']
You'll notice that we didn't specify a stop. We actually didn't need to! Extended indexing notation allows you to leave out any argument, so long as you have the colons to separate everything. Since the second argument was omitted, we just put the extra : after where it would have been.
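To see how leaving out different arguments plays out, here's a small sketch using the same dossiers list. The variable names (first_three and so on) are only for illustration.

```python
dossiers = ['Sir Vile', 'Dr. Belljar', 'Baron Grinnit', 'Medeva',
            'General Mayhem', 'Buggs Zapper', 'Jacqueline Hyde',
            'Jane Reaction', 'Dee Cryption']

first_three = dossiers[:3]     # start omitted: begin at index 0
from_sixth = dossiers[5:]      # stop omitted: run to the end of the list
every_third = dossiers[::3]    # start and stop omitted: only step given
whole_copy = dossiers[:]       # everything omitted: a copy of the whole list

print(first_three)
print(from_sixth)
print(every_third)
print(whole_copy == dossiers, whole_copy is dossiers)
```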
Going Backwards
Extended indexing notation takes the (start, stop, step) logic one step further, by allowing you to work BACKWARDS! This is a bit of a brain twister at first, though, so hang on tight...
print(dossiers[-1])
That prints out the last item in the list. Negative numbers start counting from the end of the list! This can feel a little weird, since we're used to counting from 0 in indexes, but negative zero isn't really a thing, so we start with -1.
Given that, how do we print the last three items? We might try this, but it won't actually work....
print(dossiers[-1:-4])
>>> []
That returns an empty list. Why? Remember, start must be less than stop, even when working with negative indices. So, we have to put -4 as our start, since -4 < -1.
print(dossiers[-4:-1])
>>> ['Buggs Zapper', 'Jacqueline Hyde', 'Jane Reaction']
That's closer, but there's still a problem. Dee Cryption is our last item, so where is she? Remember, stop is exclusive; we stop just shy of it. But we can't just say dossiers[-4], since that'll only give us Buggs Zapper. And dossiers[-4:-0] doesn't help either: -0 is just 0, so that slice comes back empty.
The way to solve this is to tell Python we are explicitly omitting the second argument: put a colon after our first argument!
print(dossiers[-4:])
>>> ['Buggs Zapper', 'Jacqueline Hyde', 'Jane Reaction', 'Dee Cryption']
Great, now we see to the end, except now we have too much information. We want the last three, so let's change -4 to -3...
print(dossiers[-3:])
>>> ['Jacqueline Hyde', 'Jane Reaction', 'Dee Cryption']
Thar she blows!
Speaking of magic, what do you suppose would happen if we put a negative number in the third argument, step? Let's try -1, with two colons preceding it, to indicate we want the whole list.
print(dossiers[::-1])
>>> ['Dee Cryption', 'Jane Reaction', 'Jacqueline Hyde', 'Buggs Zapper', 'General Mayhem', 'Medeva', 'Baron Grinnit', 'Dr. Belljar', 'Sir Vile']
Hey, that prints everything backwards! Indeed, a step of -1 reverses the list.
Now let's try -2...
print(dossiers[::-2])
>>> ['Dee Cryption', 'Jacqueline Hyde', 'General Mayhem', 'Baron Grinnit', 'Sir Vile']
Not only did that reverse the list, but it skipped every other element. A negative step behaves exactly like a positive step, except it works backwards!
So, what if we wanted to put everything together? Perhaps we want to list a few elements from the middle of the list in reverse order...
print(dossiers[2:5:-1])
>>> []
Gotcha Alert: start and stop must be in the order of traversal. If step is positive, start must be less than stop; however, if step is negative, start must be greater than stop!
You can think of it like walking directions for a photo tour. step tells you which way to walk, and how big your stride should be. You start taking photos once you reach start, and as soon as you encounter stop, you put your camera away.
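So, to fix that, we need to swap our start and stop.
print(dossiers[5:2:-1])
>>> ['Buggs Zapper', 'General Mayhem', 'Medeva']
Keep in mind that start is still inclusive and stop is still exclusive when walking backwards, so this selects indices 5, 4, and 3. That isn't quite the mirror image of dossiers[2:5], which covers indices 2, 3, and 4; to reverse exactly those, we'd write dossiers[4:1:-1].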
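Side Note: Python also provides the slice() and itertools.islice() functions, which behave in much the same way. slice() creates a slice object you can store and reuse inside the square brackets, while itertools.islice() works on any iterable but doesn't accept negative values. For everyday work with lists, you're almost always best off using the extended indexing notation directly.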
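For the curious, here's a rough sketch of both functions side by side, using a shortened dossiers list. The names every_other, picked, and middle are just for illustration.

```python
from itertools import islice

dossiers = ['Sir Vile', 'Dr. Belljar', 'Baron Grinnit', 'Medeva', 'General Mayhem']

# A slice object stores start, stop, and step so they can be reused.
every_other = slice(1, None, 2)    # same as [1::2]; None means "omitted"
picked = dossiers[every_other]
print(picked)

# islice() takes similar arguments, but works on any iterable,
# including a plain iterator that can't be indexed with [].
names = iter(dossiers)
middle = list(islice(names, 1, 4))
print(middle)
```

Since islice() walks the iterable from the front, it has no way to honor negative values the way square brackets do.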
Playing With Iterables
The rest of the functions we'll be exploring in this section work with iterables. While I'll use lists for most examples, remember that you can use any iterable with these, including the range() function.
all and any
Imagine you've got a whole bunch of data, such as hundreds of names, stored in an iterable container like a list. Before you feed that list into your super brilliant algorithm, you want to save some processing time by checking that you actually have some string value in every single element, no exceptions.
This is what the all() function is for.
dossiers = ['Sir Vile', 'Dr. Belljar', 'Baron Grinnit', 'Medeva', 'General Mayhem', 'Buggs Zapper', '', 'Jane Reaction', 'Dee Cryption']
print(all(dossiers))
>>> False
You may recall, an empty string ('') evaluates to False in Python. The all() function evaluates each element, and ensures it evaluates to True. If even one evaluates to False, the all() function will also return False.
any() works in almost the same way, except it only requires a single element to evaluate to True.
These may not seem terribly useful at first blush, but when combined with some of the other tools, or even with list comprehensions (later section), they can save a lot of time!
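Here's a quick sketch comparing the two on a couple of tiny lists. The lists themselves are made up for the example; the empty-list behavior at the end isn't mentioned above, but it's worth knowing.

```python
names = ['Sir Vile', '', 'Medeva']
blanks = ['', '', '']

print(all(names))     # False: one of the names is an empty string
print(any(names))     # True: at least one name is not empty
print(any(blanks))    # False: every element is empty

# Edge case worth knowing: with nothing to check, all() is True and any() is False.
print(all([]), any([]))
```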
enumerate
Within a loop, if you need to access both the values of a list and their indices, you can do that with the enumerate() function.
foo = ['A', 'B', 'C', 'D', 'E']
for index, value in enumerate(foo):
    print(f'Element {index} has the value {value}.')
enumerate() isn't limited to lists, however. Like all these other functions, it works on any iterable, numbering (or enumerating) each of the values returned. For example, we can use it on range(). Let's use it to print out every multiple of 10 from 10 to 100 (range(10,101,10)). We'll enumerate that...
for index, value in enumerate(range(10,101,10)):
    print(f'Element {index} has the value {value}.')
That gives us...
Element 0 has the value 10.
Element 1 has the value 20.
Element 2 has the value 30.
Element 3 has the value 40.
Element 4 has the value 50.
Element 5 has the value 60.
Element 6 has the value 70.
Element 7 has the value 80.
Element 8 has the value 90.
Element 9 has the value 100.
Hmm, rather interesting. We could make a neat pattern out of this, but we'd have to start the enumeration at 1, instead of 0. Sure enough, we can do that by passing the starting count as the second argument. We'll also tweak our message a bit, just to take advantage of the pattern to do something kinda neat.
for index, value in enumerate(range(10,101,10), 1):
    print(f'{index} times 10 equals {value}')
When we run that, we get...
1 times 10 equals 10
2 times 10 equals 20
3 times 10 equals 30
4 times 10 equals 40
5 times 10 equals 50
6 times 10 equals 60
7 times 10 equals 70
8 times 10 equals 80
9 times 10 equals 90
10 times 10 equals 100
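If you're wondering what enumerate() actually hands to the loop, here's a minimal sketch. It also shows that the starting count can be passed by keyword (start=1), which some people find easier to read than a bare second argument.

```python
letters = ['A', 'B', 'C']

# Each item enumerate() produces is an (index, value) tuple.
pairs = list(enumerate(letters))
print(pairs)

# The starting count can also be passed by keyword.
numbered = list(enumerate(letters, start=1))
print(numbered)
```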
filter
Let's imagine we're tracking the number of clues we find at a bunch of locations, perhaps storing them in a dictionary. I'll borrow and tweak a dictionary from the last section for this example...
locations = {
    'Parade Ground': 0,
    'Ste.-Catherine Street': 0,
    'Pont Victoria': 0,
    'Underground City': 3,
    'Mont Royal Park': 0,
    'Fine Arts Museum': 0,
    'Humor Hall of Fame': 2,
    'Lachine Canal': 4,
    'Montreal Jazz Festival': 1,
    'Olympic Stadium': 0,
    'St. Lawrence River': 2,
    'Old Montréal': 0,
    'McGill University': 0,
    'Chalet Lookout': 0,
    'Île Notre-Dame': 0
}
Perhaps we need to find all the locations that have clues, and ignore the rest. We'll start by writing a function to test a particular key-value tuple pair. This may seem like a ridiculous overcomplication, but it will make sense in a moment:
def has_clues(pair):
    return bool(pair[1])
We'll be submitting each pair from the dictionary to the function as a tuple, so pair[1] will be the value (e.g. ('Underground City', 3)). The built-in function bool() will return False if the number is 0, and True for everything else, which is exactly what we want.
We use the filter() function to narrow down our dictionary, using that function we just wrote. Recall from the last section, we need to use locations.items() to get both the keys and values as pairs.
for place, clues in filter(has_clues, locations.items()):
    print(place)
Take note, we don't include the parentheses after has_clues. We are passing the actual function as an object! filter() will do the actual calling.
Sure enough, running that code prints out the five places where we had clues (values > 0)...
Underground City
Humor Hall of Fame
Lachine Canal
Montreal Jazz Festival
St. Lawrence River
Later in this series, we'll learn about lambdas, anonymous functions that will allow us to do away with the extra function altogether. As a preview, here's what that would actually look like...
for place, clues in filter(lambda x: bool(x[1]), locations.items()):
    print(place)
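Two more things about filter() that may come in handy, sketched here with a cut-down dictionary (clue_counts is a made-up name for this example): its result can be stored by wrapping it in dict() or list(), and passing None in place of a function keeps only the elements that evaluate to True.

```python
clue_counts = {
    'Underground City': 3,
    'Mont Royal Park': 0,
    'Lachine Canal': 4,
    'Olympic Stadium': 0,
}

def has_clues(pair):
    return bool(pair[1])

# filter() is lazy; wrap it in dict() (or list()) to store the results.
with_clues = dict(filter(has_clues, clue_counts.items()))
print(with_clues)

# Passing None instead of a function keeps only the "truthy" elements.
nonzero = list(filter(None, clue_counts.values()))
print(nonzero)
```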
map
map() functions in a similar way to filter(), except instead of using the function to omit elements from the iterable, it is used to change them.
Let's imagine we have a list of temperatures in Fahrenheit:
temps = [67.0, 72.5, 71.3, 78.4, 62.1, 80.6]
We want to convert those all to Celsius, so we write a function for that.
def f_to_c(temp):
    return round((temp - 32) / 1.8, 1)
We can use the map() function to apply that to each value in temps, producing an iterator we can use in a loop (or anywhere).
for c in map(f_to_c, temps):
    print(f'{c}°C')
Remember, we're passing the function object f_to_c as the first argument of map(), so we leave the parentheses off!
Running that loop gives us:
19.4°C
22.5°C
21.8°C
25.8°C
16.7°C
27.0°C
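Here's a short sketch of two related tricks: storing the results of map() in a list, and giving map() more than one iterable. The highs, lows, and spread() names are invented for this example.

```python
temps = [67.0, 72.5, 71.3]

def f_to_c(temp):
    return round((temp - 32) / 1.8, 1)

# map() is lazy, so wrap it in list() to store the converted values.
celsius = list(map(f_to_c, temps))
print(celsius)

# Given several iterables, map() passes one element from each to the function.
highs = [81, 78, 73]
lows = [62, 67, 60]

def spread(high, low):
    return high - low

spreads = list(map(spread, highs, lows))
print(spreads)
```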
min and max
Let's keep working with those temperatures for a moment. If we wanted to find the lowest or the highest in the list, we could use the min() or max() functions, respectively. Not much to this, really.
temps = [67.0, 72.5, 71.3, 78.4, 62.1, 80.6]
print(min(temps))
>>> 62.1
print(max(temps))
>>> 80.6
Side Note: Unrelated to iterables, you can also use those functions to find the smallest or largest of a list of arguments you give them, such as min(4, 5, 6, 7, 8), which would return 4.
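There is actually a little more to these two than it seems: both accept an optional key argument, a function that decides what gets compared. Here's a rough sketch using a trimmed-down clue dictionary; clue_counts and clue_count() are names I made up for the example.

```python
clue_counts = {
    'Underground City': 3,
    'Lachine Canal': 4,
    'Olympic Stadium': 0,
}

def clue_count(pair):
    return pair[1]

# key tells max() and min() what to compare; here, the value in each (place, count) pair.
busiest = max(clue_counts.items(), key=clue_count)
quietest = min(clue_counts.items(), key=clue_count)
print(busiest)
print(quietest)

# Built-in functions work as keys too: the longest place name, by length.
longest_name = max(clue_counts, key=len)
print(longest_name)
```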
sorted
Often, you'll want to sort an iterable. Python does this very efficiently through the sorted() built-in function.
temps = [67.0, 72.5, 71.3, 78.4, 62.1, 80.6]
for t in sorted(temps):
    print(t)
That produces...
62.1
67.0
71.3
72.5
78.4
80.6
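A couple of details worth seeing in action, sketched below: sorted() builds a brand new list and leaves the original alone, and an optional key argument chooses what to sort by. The names list is just sample data.

```python
temps = [67.0, 72.5, 71.3, 78.4, 62.1, 80.6]

coolest_first = sorted(temps)
print(coolest_first)
print(temps)    # sorted() returns a new list; the original order is untouched

# The optional key argument chooses what to compare: here, the length of each name.
names = ['Dr. Belljar', 'Medeva', 'Sir Vile']
by_length = sorted(names, key=len)
print(by_length)
```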
reversed
Most of the time, the extended indexing notation [::-1] will allow you to reverse a list or other ordered iterable. But if that's not an option, you can also use the reversed() function.
For example, I'll combine it with the sorted() function from a moment ago...
temps = [67.0, 72.5, 71.3, 78.4, 62.1, 80.6]
for t in reversed(sorted(temps)):
    print(t)
That gives us...
80.6
78.4
72.5
71.3
67.0
62.1
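A few extra observations, as a hedged sketch: reversed() gives back an iterator rather than a list, it works on a range as well, and for the specific sort-then-reverse case, sorted() has a reverse=True shortcut that produces the same result here.

```python
temps = [67.0, 72.5, 71.3, 78.4, 62.1, 80.6]

# reversed() hands back an iterator, not a list, so wrap it to see the values.
backwards = list(reversed(temps))
print(backwards)

# It also works on a range, walking it from the far end.
five_to_one = list(reversed(range(1, 6)))
print(five_to_one)

# For the sort-then-reverse case, sorted() has a shortcut.
hottest_first = sorted(temps, reverse=True)
print(hottest_first == list(reversed(sorted(temps))))
```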
sum
Another quick built-in function is sum(), which adds all of the elements in the iterable together. Naturally, this only works if all the elements can be added together.
One use of this would be in finding an average of those temperatures earlier. You may recall that the len() function tells us how many elements are in a container.
temps = [67.0, 72.5, 71.3, 78.4, 62.1, 80.6]
average = sum(temps) / len(temps)
print(round(average, 2))
>>> 71.98
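Here's a small sketch of sum() on other inputs, including what happens when the elements can't be added together. The optional second argument (the starting value) isn't discussed above, so treat that part as a bonus.

```python
# sum() accepts any iterable of numbers, including a range.
total = sum(range(1, 101))
print(total)

# An optional second argument gives the starting value (0 by default).
with_bonus = sum([1, 2, 3], 10)
print(with_bonus)

# Elements that can't be added to a number raise a TypeError.
failed = False
try:
    sum(['Medeva', 'Sir Vile'])
except TypeError as error:
    failed = True
    print(f'Cannot sum these: {error}')
```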
zip
Remember that earlier example about the locations and clues? Imagine we got that information, not in a dictionary, but in two lists:
locations = ['Parade Ground', 'Ste.-Catherine Street', 'Pont Victoria', 'Underground City', 'Mont Royal Park', 'Fine Arts Museum', 'Humor Hall of Fame', 'Lachine Canal', 'Montreal Jazz Festival', 'Olympic Stadium', 'St. Lawrence River', 'Old Montréal', 'McGill University', 'Chalet Lookout', 'Île Notre-Dame']
clues = [0, 0, 0, 3, 0, 0, 2, 4, 1, 0, 2, 0, 0, 0, 0]
Yuck! That's not fun to work with, although there are certainly real world scenarios where we would get data in this fashion.
Thankfully, the zip() function can help us make sense of this data by aggregating it into tuples using an iterator, giving us (locations[0], clues[0]), (locations[1], clues[1]), (locations[2], clues[2]) and so on.
The zip() function isn't even limited to two iterables; it can zip together as many as we give it! If the iterables don't all have the same length, zip() stops as soon as the shortest one runs out, so any "extras" in the longer ones are left out.
Of course, in this case, the two lists are the same length, so the results are rather obvious. Let's create a new list using the data from zip, and print it out.
data = list(zip(locations, clues))
print(data)
That gives us a structure not unlike what we got from the dictionary's .items() method earlier!
[('Parade Ground', 0), ('Ste.-Catherine Street', 0), ('Pont Victoria', 0), ('Underground City', 3), ('Mont Royal Park', 0), ('Fine Arts Museum', 0), ('Humor Hall of Fame', 2), ('Lachine Canal', 4), ('Montreal Jazz Festival', 1), ('Olympic Stadium', 0), ('St. Lawrence River', 2), ('Old Montréal', 0), ('McGill University', 0), ('Chalet Lookout', 0), ('Île Notre-Dame', 0)]
In fact, if I recall my filter() function with the lambda, I can tweak it to use zip(), letting us work purely from the two lists:
for place, count in filter(lambda x: bool(x[1]), zip(locations, clues)):
    print(place)
As before, that outputs...
Underground City
Humor Hall of Fame
Lachine Canal
Montreal Jazz Festival
St. Lawrence River
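Coming back to iterables of different lengths for a moment, here's a minimal sketch of how zip() behaves, alongside itertools.zip_longest() for when you'd rather keep the leftovers. The places and counts lists are cut-down sample data, with one count deliberately missing.

```python
from itertools import zip_longest

places = ['Underground City', 'Lachine Canal', 'Olympic Stadium']
counts = [3, 4]    # one value short, on purpose

# zip() stops when the shortest iterable is exhausted.
zipped = list(zip(places, counts))
print(zipped)

# zip_longest() keeps going, filling the gaps with fillvalue (None by default).
padded = list(zip_longest(places, counts, fillvalue=0))
print(padded)

# The pairs from zip() also feed straight into dict().
lookup = dict(zip(places, counts))
print(lookup)
```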
itertools
I've covered virtually all of Python's built-in functions for working with iterables, but there are still many more to be had in the itertools module. I strongly recommend reading the documentation to learn more.
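As a small taste, here's a sketch of three itertools functions I picked more or less arbitrarily: chain(), accumulate(), and count(). The lists are sample data for the example.

```python
from itertools import accumulate, chain, count

villains = ['Sir Vile', 'Dr. Belljar']
henchmen = ['Buggs Zapper', 'Dee Cryption']

# chain() iterates over several iterables as if they were one.
everyone = list(chain(villains, henchmen))
print(everyone)

# accumulate() yields running totals.
clues = [3, 2, 4, 1]
running = list(accumulate(clues))
print(running)

# count() is an endless counter, so pair it with something that stops, like zip().
numbered = list(zip(count(1), villains))
print(numbered)
```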
Review
This section has been a bit more encyclopedic in nature than the rest of the series, but I hope it's given you an appreciation for some of the incredible things you can do with your new iterator skills.
If you're still waiting on those long-promised generators and list comprehensions, never fear! They're coming up in the very next sections.
As always, I recommend that you read the documentation:
Top comments (13)
I find it fairly easy to intuit how range works, because mostly I've used it for range(len(something)) (alternatively range(1,len(something))) in which case it grabs every index of something. Like, if len(something) == 10, then the last item in something has an index of 9, so it makes sense that range wouldn't want to try to grab an index 10.
Sometimes this throws me off trying to use actual numbers in range(), though XD
For some reason I often have trouble confusing the syntax of slices and ranges, though. Less as I practice more, but it's not uncommon for my first entry of a statement to have : where it should have , or vice versa. Oops.
(Also, stepping by -1 is fun ;) )
Such a great series, just what I need to get started with Python again! I am slightly confused around the enumeration examples though, the first snippet and second snippet read the same other than the output text, should the first snippet read as below with a 0 rather than a 1 on the enumeration?
Cheers
Marc
You could do that. I deliberately used a 1 because I wanted to use 1-based indexing in my output, instead of 0-based. I personally liked it better, that's all.
If you want to start from 0, as in your code, you can actually omit the second argument altogether, and just use enumerate(range(10,101,10)).
That makes sense, however in order to get the result provided in the post for your first snippet, you would have needed to have put a 0 or omitted the second argument, which is what I found confusing as the second snippet then shows the correct results for using a 1.
Cheers
Marc
Oh! Derp, I see it now. Sorry, typo! Fixed.
No worries! Just glad it wasn’t me not getting it.
Keep up the good work, this is a great series!
Cheers
Marc
Great article! Just a heads up, I found a typo when you were talking about reversed()
Awesome! What was the typo? Do you remember?
He may be referring to the end of the first paragraph in the section on reversed(): you wrote "reserved" instead. I was going to point out the same thing.
Good catch! Thanks.
There are typos in some print's - you forgot to close ) brackets.
Thanks for your huge work, it helps me a lot!
Oh, thanks for catching that!