Florian Rohrer

Posted on Sep 12, 2017

What is the most confusing thing to you in Python?

#discuss #python #beginners

Just some traps and pitfalls that took me a while to get used to in Python.

List methods vs. built-in methods

# sort(), sorted(), reverse(), reversed()
# sort() modifies the existing list and is a method of the list object
# sorted() returns a new list and is a built-in function

a = [5,2,8,1,9,3]
sorted(a) # ok -> [1, 2, 3, 5, 8, 9]
a.sort()  # ok -> [1, 2, 3, 5, 8, 9]
a.sorted() # ERROR

# however, min() and max() are only built-in functions
a.min() # ERROR
min(a) # ok -> 1

# unless you work with numpy, where min() and max() do exist :S
import numpy as np
n = np.array([5,2,4,9,1,7])
n.min() # ok -> 1
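
To make the difference a bit more concrete, here is a small sketch using only built-ins (the variable names are just for illustration). The in-place list methods return None, while the built-in functions leave the list alone and hand back something new. Note that reversed() gives you an iterator, not a list.

```python
a = [5, 2, 8, 1, 9, 3]

new_list = sorted(a)        # built-in: returns a new sorted list, a is unchanged
result = a.sort()           # method: sorts a in place and returns None

rev_iter = reversed(a)      # built-in: returns an iterator, not a list
rev_list = list(rev_iter)   # materialize the iterator to get a list
a.reverse()                 # method: reverses a in place and returns None
```

A common slip is writing a = a.sort(), which silently replaces the list with None.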

string.join()

# I have to call .join() on a string to concatenate a list = confusing.

# concatenating a list of strings
", ".join(["apple", "banana", "cherry"]) # 'apple, banana, cherry'

# even uglier when the list items are not strings
", ".join([str(s) for s in [1,2,3,4]])   # '1, 2, 3, 4'

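A slightly shorter way to write the non-string case, as a sketch: join() accepts any iterable of strings, so map() or a generator expression works without building an intermediate list. The names numbers, joined and squares are made up for this example.

```python
numbers = [1, 2, 3, 4]

# map(str, ...) converts each item lazily; join() consumes the result.
joined = ", ".join(map(str, numbers))

# A generator expression works the same way.
squares = " ".join(str(n * n) for n in numbers)
```
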
List of lists


# The wrong way
a = [[]] * 5
# [[], [], [], [], []]
a[0].append(1)
# [[1], [1], [1], [1], [1]]

# The right way
a = [[] for _ in range(5)]
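
Why the first version misbehaves, as a quick sketch: the * operator repeats the reference to one inner list, so all five slots point at the same object. Checking identity with 'is' shows it (shared and independent are names I picked for this example).

```python
shared = [[]] * 5                      # five references to ONE inner list
independent = [[] for _ in range(5)]   # five separate inner lists

shared_is_same = shared[0] is shared[1]                  # same object
independent_is_same = independent[0] is independent[1]   # different objects

independent[0].append(1)               # only the first inner list changes
```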

'is' and '==' on ints

# Internally, CPython caches some small int objects. When you compare two ints, you should use ==.
# However, using 'is' can also appear to work in some cases.

x = 2000
y = 2000
x is y # False -> good, because this should not be allowed anyways.

x = 5
y = 5
x is y # True -> But this works....
# See also
# https://stackoverflow.com/questions/306313/is-operator-behaves-unexpectedly-with-integers
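
A hedged sketch of the safe habit: compare values with == and treat whatever 'is' reports for ints as an implementation detail. CPython currently keeps a cache of small integers (documented as -5 to 256), and results can also differ between the interactive prompt and a script, because identical literals in one compiled block may be merged. The int("...") calls below are only there to create the objects at runtime.

```python
# int() at runtime avoids the compiler merging identical literals.
big_a = int("2000")
big_b = int("2000")
small_a = int("5")
small_b = int("5")

# Value comparison is always reliable.
values_equal = (big_a == big_b) and (small_a == small_b)

# Identity depends on the implementation; look at it, but don't rely on it.
print(big_a is big_b, small_a is small_b)
```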

Did you find any interesting or funny things in Python, that confused you the first time you saw them? Share them in the comments below.

Top comments (17)

Ethan Turkeltaub • Sep 14 '17 • Edited

Oh my god, the import/package system.

My senior thesis project was a Python project, and I was coming from the Ruby world. Dear lord, I never got used to the import system. The fact that you have to throw in something like this to import anything local astounded me:

import os
import sys

sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))

import foo.bar

Circular dependencies tripped me up quite a bit too, but I can probably put that down to lack of experience in Python. But man, coming from a world where you could just do this, it was a bit of a shock:

require 'foo/bar'

Andrew • Sep 15 '17 • Edited

How about having an __init__.py file in foo/?

Graham Lyons • Sep 19 '17

The path insert is fairly horrible but I don't believe it's recommended.

Setting the PYTHONPATH environment variable can be useful in a local project (e.g. export PYTHONPATH=. in the root of your project), then using packages and absolute imports if you've got a lot of modules.

You can also import local modules just by using their name e.g. import my_module to import my_module.py from the same directory.
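
Here is a rough, self-contained sketch of what the import machinery needs: a directory on sys.path that contains the package. It builds a throwaway package in a temp directory, so foo_demo, bar.py and GREETING are all made-up names for this example and not part of any real project. In a real project you would normally get the same effect from PYTHONPATH or by installing the package, not by editing sys.path.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Lay out root/foo_demo/__init__.py and root/foo_demo/bar.py
    pkg = os.path.join(root, "foo_demo")
    os.mkdir(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()
    with open(os.path.join(pkg, "bar.py"), "w") as f:
        f.write("GREETING = 'hello from bar'\n")

    # The directory that CONTAINS the package must be on sys.path.
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        bar = importlib.import_module("foo_demo.bar")  # same as: import foo_demo.bar
        greeting = bar.GREETING
    finally:
        sys.path.remove(root)
```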

I find the Ruby require system a bit too close to PHP for my liking. I prefer that Python and Java abstract away the file system. However, using Bundler to manage your dependencies and put them onto the Ruby path at runtime is better than anything Python has.

Tamás Szelei • Sep 15 '17

In Python3 it's a lot saner (still not perfect though)

Mathieu PATUREL • Sep 13 '17

Let's not do that with PHP or JavaScript, otherwise it'll never end :smile:

For string.join, it makes sense, although both choices (string.join or list.join) have drawbacks.

If you look at every single str method (.title(), .lower(), .format(), .strip()), it returns a new string. It never mutates the original one.

For list objects, it's the opposite: methods like .sort(), .reverse() and .append() mutate the original list and return None.

So, if they implemented the .join() method on the list objects, they would have had to break this "convention". If they didn't, this would have happened:

>>> lines = ['first', 'second', 'third']
>>> lines.join(' ')
>>> lines
'first second third'

(I'm not even sure it's possible to change the type of an existing object)

So, the choice they made doesn't break any conventions and is logical, although it might not be the prettiest. IMHO, it's the right choice. And in the Zen's opinion too, I guess:

Special cases aren't special enough to break the rules.

(guess they didn't respect this one for your last example though).
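
A tiny sketch of the convention being described, with one caveat: it is a tendency and not a hard rule, since a few list methods such as .pop() both mutate the list and return a value. The variable names are just for illustration.

```python
text = "hello"
upper = text.upper()      # returns a new string; text is untouched

items = [3, 1, 2]
returned = items.sort()   # sorts in place and returns None

popped = items.pop()      # exception to the pattern: mutates AND returns a value
```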

Good post anyway, learned something about python's int, thanks! :+1:

Nicolas B. • Sep 14 '17 • Edited

Python does not forbid many things so try this:

class Hello:
  def __init__(self, z):
    self.z = z

  def do(self):
    print('Hello {}'.format(self.z))


class SeeYou:
  def __init__(self, z):
    self.z = z

  def do(self):
    print('See you {}'.format(self.z))

y = Hello('world')
y.do()
y.__class__ = SeeYou
y.do()

Mathieu PATUREL • Sep 15 '17

:smiley: Never thought of this before... That's cool! Thanks!

Florian Rohrer • Sep 13 '17

Wow, thank you for your elaborate response! :)

Florian Rohrer • Sep 15 '17

To add to that last one: you can actually manipulate the integer cache using the ctypes module. Do this if you want to get fired from your job :D

import ctypes

def mutate_int(an_int, new_value):
    ctypes.memmove(id(an_int) + 24, id(new_value) + 24, 8)

a_number = 7
another_number = 7
mutate_int(a_number, 13)
print(a_number) # 13
print(another_number) # also 13 now

Now we have replaced 7 by 13.

print(7)   # 13
print(7+1) # 14
print(7*2) # 26

Source: kate.io/blog/2017/08/22/weird-pyth...

Zack Z. • Sep 15 '17

Here's a gotcha I once saw someone post involving dictionary keys:

This program:

#!/usr/bin/env python

print({0: "o hai!", False: "kthxbai", 1: "foo", True: "bar"})
print({False: "o hai!", 0: "kthxbai", True: "foo", 1: "bar"})

Yields this output:

{0: 'kthxbai', 1: 'bar'}
{False: 'kthxbai', True: 'bar'}

It never bit me in any production code, but still, I thought it was interesting enough to share.
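
For anyone wondering why this happens, a short sketch: bool is a subclass of int, so True == 1 and False == 0, and their hashes match too. A dict therefore treats them as the same key. It keeps the key object that was inserted first and overwrites the value with the one that came last.

```python
keys_equal = (True == 1) and (False == 0)
hashes_equal = hash(True) == hash(1) and hash(False) == hash(0)

# Four entries in the literal, but only two distinct keys survive.
d = {0: "o hai!", False: "kthxbai", 1: "foo", True: "bar"}
key_types = [type(k) for k in d]   # the first-inserted keys (ints) are kept
```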

Graham Lyons • Sep 19 '17

When I first started writing Python I couldn't get over the lack of braces. Having written a lot of Ruby and read a lot of unreadable Perl I now think it's a great feature. Readability as a language feature!

Kajigga • Oct 20 '17

I also had a challenge with it until I realized that I basically indented this way anyways with other languages for readability. I am now to the point that all the extra stuff just feels like noise. I have to do a fair amount of JS and I feel like I have to make an effort to ignore all the brackets and semicolons.

Tamir Bahar • Sep 15 '17

I find Python's name-bindings to be especially confusing at times.
In the following code:

list_of_funcs = []
for i in range(10):
    def f():
        print(i)
    list_of_funcs.append(f)

for f in list_of_funcs:
    f()

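The output will be the number 9, printed ten times.

That's because the name i is bound in the enclosing scope, not in each function: every f looks up i when it is called, and by then the loop has finished and i is 9. To get a sequence of numbers, we'd have to do as follows: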

list_of_funcs = []
for i in range(10):
    def make_f(n):
        def f():
            print(n)
        return f
    list_of_funcs.append(make_f(i))

for f in list_of_funcs:
    f()

The function make_f(n) creates a new binding. A new n is bound at every call to make_f(n), so the values are kept.
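
Two other common ways to get the same effect, sketched here with made-up helper names (make_late, make_with_default, identity): a default argument, which is evaluated once when the function is defined, and functools.partial, which stores the argument at creation time. The functions return values instead of printing so the results are easy to inspect.

```python
import functools

def identity(n):
    return n

def make_late():
    funcs = []
    for i in range(3):
        funcs.append(lambda: i)        # looks up i only when called
    return funcs

def make_with_default():
    funcs = []
    for i in range(3):
        funcs.append(lambda n=i: n)    # default is evaluated now, per iteration
    return funcs

# partial() captures the current value of i as a stored argument.
with_partial = [functools.partial(identity, i) for i in range(3)]

late_values = [f() for f in make_late()]             # every call sees the final i
default_values = [f() for f in make_with_default()]  # each call sees its own value
partial_values = [f() for f in with_partial]
```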

Aswath KNM • Sep 13 '17 • Edited

Another thing that I learned recently

a = [1,2,3,4,5]
b = a
b.append(10)
print(a)  # => [1, 2, 3, 4, 5, 10]

Because a and b reference the same object in memory.

Mathieu PATUREL • Sep 13 '17

It's the case for every single thing in Python except for int, float and str. These last three are passed by value (meaning we copy their value); everything else is passed by reference (meaning, as you said, we pass the "reference to the memory"; the value isn't duplicated).

Have a look at this:

>>> mylist = [1, 2]
>>> ref = mylist
>>> ref.append(3)
>>> mylist
[1, 2, 3]
>>> mydict = {'key': 'value'}
>>> ref = mydict
>>> ref['key2'] = 'value2'
>>> mydict
{'key2': 'value2', 'key': 'value'}
>>> class MyCustomObject:
...     def __init__(self, name):
...             self.name = name
...
>>> myobject = MyCustomObject('Mathieu')
>>> ref = myobject
>>> ref.name = 'math2001'
>>> myobject.name
'math2001'
>>>

Quick tip:

If you don't want a list to be passed by reference, you can do that by calling .copy().

>>> mylist = [1, 2]
>>> copy = mylist.copy()
>>> copy.append(3)
>>> copy
[1, 2, 3]
>>> mylist
[1, 2]

The same works with dict. And obviously, for your custom object, you have to implement it yourself.
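
Custom classes don't get a .copy() method for free, but as far as I know the standard library's copy.copy() handles plain objects out of the box. A minimal sketch, where Point and its attributes are hypothetical names for this example:

```python
import copy

class Point:
    def __init__(self, x, tags):
        self.x = x
        self.tags = tags

original = Point(1, ["a"])
shallow = copy.copy(original)   # new Point object, attributes are shared

shallow.x = 2                   # rebinding an attribute affects only the copy
shallow.tags.append("b")        # mutating a shared list affects both objects
```

If a class needs special handling, it can define __copy__ (and __deepcopy__) to customize what the copy module does.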

Gotcha

These copy methods on list and dict only make shallow copies. That means you don't duplicate the values too; they are passed by reference:

>>> mylist = [[1], 3]
>>> copy = mylist.copy()
>>> copy[0]
[1]
>>> copy[0].append(2)
>>> copy
[[1, 2], 3]
>>> mylist
[[1, 2], 3]
>>>

But if you want to do a deep copy:

>>> import copy
>>> mylist = [[1], 3]
>>> deepcopy = copy.deepcopy(mylist)
>>> deepcopy[0].append(2)
>>> deepcopy
[[1, 2], 3]
>>> mylist
[[1], 3]
>>>

Note: when you pass a variable by reference, it's a two-way thing of course. If you edit the original one, it'll change the reference, and if you edit the reference, you change the original too.

Tamir Bahar • Sep 15 '17

Basically, all types in Python are passed by reference. The distinction is between mutable and immutable types. I recommend reading Python's Data Model. The first couple of paragraphs are really helpful.
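
A small sketch of that distinction, with helper names (append_item, rebind) invented for the example: a function that mutates its argument is visible to the caller, while one that rebinds its parameter is not. Plain assignment never copies, not even for ints.

```python
def append_item(seq):
    seq.append(99)      # mutates the object the caller also holds

def rebind(n):
    n = n + 1           # rebinds the local name only; the caller's int is untouched
    return n

numbers = [1, 2]
append_item(numbers)    # caller sees the change

count = 10
new_count = rebind(count)

alias = count
same_object = alias is count   # assignment binds a second name to the same object
```

Ints only look like they are copied because they are immutable: there is no operation that changes one in place, so every "change" creates or picks a different object.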

Me, myself, and Irenne • Nov 23 '17

When to use self in classes and methods
