Disclaimer: If you write Python on a daily basis, you will find nothing new in this post. It’s for people who use Python only occasionally, like Ops guys, and forget or misuse its import system. Nonetheless, the code is written with Python 3.6 type annotations to entertain an experienced Python reader. As usual, if you find any mistakes, please let me know!
Let’s start with a common Python stanza of
    if __name__ == '__main__':
        invoke_the_real_code()
A lot of people, and I’m no exception, write it as a ritual without trying to understand it. We vaguely know that this snippet makes a difference between invoking your code from the CLI and importing it. But let’s try to understand why we really need it.
For illustration, assume that we’re writing some pizza shop software. It’s on GitHub. Here is the pizza.py file:

    # pizza.py file
    import math


    class Pizza:
        name: str = ''
        size: int = 0
        price: float = 0

        def __init__(self, name: str, size: int, price: float) -> None:
            self.name = name
            self.size = size
            self.price = price

        def area(self) -> float:
            return math.pi * math.pow(self.size / 2, 2)

        def awesomeness(self) -> int:
            if self.name == 'Carbonara':
                return 9000

            return self.size // int(self.price) * 100


    print('pizza.py module name is %s' % __name__)

    if __name__ == '__main__':
        print('Carbonara is the most awesome pizza.')
I’ve added printing of the magical __name__ variable to see how it may change.
OK, first, let’s run it as a script:
    $ python3 pizza.py
    pizza.py module name is __main__
    Carbonara is the most awesome pizza.
The __name__ global variable is set to __main__ when we invoke the file from the CLI.
But what if we import it from another file? Here is the menu.py source code:
    # menu.py file
    from typing import List

    from pizza import Pizza

    MENU: List[Pizza] = [
        Pizza('Margherita', 30, 10.0),
        Pizza('Carbonara', 45, 14.99),
        Pizza('Marinara', 35, 16.99),
    ]

    if __name__ == '__main__':
        print(MENU)

    $ python3 menu.py
    pizza.py module name is pizza
    [<pizza.Pizza object at 0x7fbbc1045470>, <pizza.Pizza object at 0x7fbbc10454e0>, <pizza.Pizza object at 0x7fbbc1045b38>]
And now we see 2 things:
- The top-level print statement from pizza.py was executed on import.
- __name__ in pizza.py is now set to the filename without the .py extension.
So, the thing is, __name__ is the global variable that holds the name of the current Python module.
- The module name is set by the interpreter in the __name__ variable
- When a module is invoked from the CLI, its name is set to __main__
So what is a module, after all? It’s really simple: a module is a file containing Python code that you can execute with the interpreter (the python program) or import from other modules.
- A Python module is just a file with Python code
Just like when a module is executed, when it is imported its top-level statements are executed. But be aware that they are executed only once, even if you import the module several times, even from different files.
- When you import a module, it’s executed
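To see the “only once” rule in action, here is a minimal sketch. It assumes a throwaway module called once_demo (a made-up name, not part of the pizza code) written to a temporary directory, imports it twice, and counts how many times its top-level print fired. The interpreter keeps loaded modules in sys.modules, which is why the second import does not run the file again.

```python
import contextlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical throwaway module, written to a temporary directory for this demo.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, 'once_demo.py').write_text(
    "print('once_demo top-level code is running')\n"
)
sys.path.insert(0, tmp_dir)  # make the temporary directory importable

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import once_demo             # first import: the file is executed
    import once_demo as again    # second import: served from sys.modules

executions = captured.getvalue().count('is running')
print('top-level code ran', executions, 'time(s)')
print('same module object:', again is once_demo)
```

Both names end up pointing at the very same module object, so any state the module built at import time is shared by everyone who imports it.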
Because modules are just plain files, there is a simple way to import them. Just take the filename, remove the .py extension and put it in the import statement.
- To import a module, use the filename without the .py extension
What is interesting is that __name__ is set to the filename regardless of how you import it: with import pizza as broccoli, __name__ will still be pizza.
- When imported, the module name is set to the filename without the .py extension, even if it’s renamed with import module as othername
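A quick way to convince yourself, sketched here with the standard library json module standing in for pizza so that the snippet runs anywhere:

```python
import json as broccoli   # json is just a stand-in for any module

# The alias is only a local name; the module itself keeps its real name.
module_name = broccoli.__name__
print(module_name)
```

The alias lives only in the importing file’s namespace, while __name__ belongs to the module object itself.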
But what if the module that we import is not located in the same directory? How can we import it? The answer is in the module search path, which we’ll eventually discover while discussing packages.
- Package is a namespace for a collection of modules
The namespace part is important because by itself a package doesn’t provide any functionality; it only gives you a way to group a bunch of your modules.
There are 2 cases where you really want to put modules into a package. The first is to isolate the definitions of one module from another. In our pizza module, we have a Pizza class that might conflict with other people’s pizza packages (and we do have some pizza packages on PyPI).
The second case is if you want to distribute your code because
- Package is the minimal unit of code distribution in Python
Everything that you see on PyPI and install via pip is a package, so in order to share your awesome stuff, you have to make a package out of it.
Alright, assume we’re convinced and want to convert our 2 modules into a nice package. To do this we need to create a directory with an empty __init__.py file and move our files into it:

    pizzapy/
    ├── __init__.py
    ├── menu.py
    └── pizza.py

And that’s it, now you have a pizzapy package!
- To make a package, create a directory with an __init__.py file
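The same recipe can be sketched in code. The package name demo_pkg and its toppings module are made up for this example; the script builds the directory in a temporary location, drops an empty __init__.py into it, and imports a module through the package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / 'demo_pkg'                 # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / '__init__.py').write_text('')    # the empty marker file
(pkg_dir / 'toppings.py').write_text("TOPPINGS = ['basil', 'mozzarella']\n")

sys.path.insert(0, str(root))               # the package's parent directory must be importable
toppings = importlib.import_module('demo_pkg.toppings')

print(toppings.__name__)                    # the fully qualified module name
print(toppings.TOPPINGS)
print(hasattr(sys.modules['demo_pkg'], '__path__'))  # packages carry a __path__ attribute
```

Note that the module’s __name__ now includes the package as a prefix, which is exactly the namespace part.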
Remember that a package is a namespace for modules, so you don’t import the package itself, you import a module from the package.

    >>> import pizzapy.menu
    pizza.py module name is pizza
    >>> pizzapy.menu.MENU
    [<pizza.Pizza object at 0x7fa065291160>, <pizza.Pizza object at 0x7fa065291198>, <pizza.Pizza object at 0x7fa065291a20>]
If you do the import that way, it may seem too verbose because you need to use the fully qualified name. I guess that’s intentional behavior because one of the Zen of Python items is “explicit is better than implicit”.
Anyway, you can always use the from package import module form to shorten names:

    >>> from pizzapy import menu
    pizza.py module name is pizza
    >>> menu.MENU
    [<pizza.Pizza object at 0x7fa065291160>, <pizza.Pizza object at 0x7fa065291198>, <pizza.Pizza object at 0x7fa065291a20>]
Remember how we put an __init__.py file in a directory and it magically became a package? That’s a great example of convention over configuration: we don’t need to describe any configuration or register anything. Any directory with an __init__.py file is, by convention, a Python package.
Besides making a package, __init__.py serves one more purpose: package initialization. That’s why it’s called init, after all! Initialization is triggered on package import; in other words, importing a package invokes __init__.py.
- When you import a package, the __init__.py module of the package is executed
In the __init__ module you can do anything you want, but most commonly it’s used for some package initialization or for setting the special __all__ variable. The latter controls star import: from package import *.
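Here is a small sketch of how __all__ filters a star import. The package star_demo and its two variables are hypothetical names invented for the demo. Since a star import is only allowed at module level, the sketch runs it with exec inside a scratch namespace and then checks which names arrived.

```python
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / 'star_demo'     # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / '__init__.py').write_text(
    "__all__ = ['visible']\n"
    "visible = 'exported by star import'\n"
    "hidden = 'skipped by star import'\n"
)
sys.path.insert(0, str(root))

# Run the star import in a scratch namespace so we can inspect the result.
namespace = {}
exec('from star_demo import *', namespace)

print('visible' in namespace)    # listed in __all__
print('hidden' in namespace)     # not listed in __all__

import star_demo
print(star_demo.hidden)          # still reachable through the package itself
```

So __all__ only limits what the star form copies over; it does not make anything private.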
And because Python is awesome, we can do pretty much anything in the __init__ module, even really strange things. Suppose we don’t like the explicitness of import and want to drag all of the modules’ symbols up to the package level, so we don’t have to remember the actual module names.
To do that we can import everything from the menu and pizza modules in __init__.py like this:
    # pizzapy/__init__.py
    from pizzapy.pizza import *
    from pizzapy.menu import *

    >>> import pizzapy
    pizza.py module name is pizzapy.pizza
    pizza.py module name is pizza
    >>> pizzapy.MENU
    [<pizza.Pizza object at 0x7f1bf03b8828>, <pizza.Pizza object at 0x7f1bf03b8860>, <pizza.Pizza object at 0x7f1bf03b8908>]
Now we can write pizzapy.MENU without spelling out pizzapy.menu.MENU :-) That way it kinda works like packages in Go, but note that this is discouraged because you are abusing Python, and if you’re gonna check in such code, you’re gonna have a bad time at code review. I’m showing you this just for illustration, don’t blame me!
You could rewrite the import more succinctly like this
    # pizzapy/__init__.py
    from .pizza import *
    from .menu import *
This is just another syntax for doing the same thing, and it is called a relative import. Let’s look at it more closely.
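Strictly speaking, only the second of the 2 code pieces above, the one with the leading dot, is a relative import; the first one spells out the full package name, which makes it an absolute import. The distinction matters because since Python 3 all imports are absolute by default (as in PEP 328), meaning that a plain import module statement is resolved against the module search path and never implicitly against the current package. This is needed to avoid shadowing of standard modules: in Python 2, if you created your own sys.py module inside a package, doing import sys in that package could pick it up instead of the standard library sys module.
- Since Python 3 all imports are absolute by default: a plain import never looks into the current package implicitly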
But if your package has a module called sys and you want to import it into another module of the same package, you have to say so explicitly. Either write an absolute import with the full package name, from package.module import somesymbol, or a relative one, from .module import somesymbol. That funny single dot before the module name is read as “current package”.
- To import a module from your own package, prefix it with the package name (absolute import) or with a dot (relative import)
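To make the difference tangible, here is a minimal sketch with a hypothetical package rel_demo that really does contain its own sys.py. Inside the package, the user module (also a made-up name) performs a plain import, a relative import and an absolute import with the full package name, and records what each one got.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / 'rel_demo'      # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / '__init__.py').write_text('')
(pkg_dir / 'sys.py').write_text("WHO = 'the sys module of rel_demo'\n")
(pkg_dir / 'user.py').write_text(
    "import sys                          # plain import: the standard library module\n"
    "from . import sys as local_sys      # relative import: the sibling rel_demo/sys.py\n"
    "from rel_demo.sys import WHO        # absolute import with the full package name\n"
    "NAMES = (sys.__name__, local_sys.__name__, WHO)\n"
)

sys.path.insert(0, str(root))    # out here, sys is the real standard library module
user = importlib.import_module('rel_demo.user')
print(user.NAMES)
```

The plain import sys inside the package still resolves to the standard library, and the sibling module is reached only through the dot or the full package name.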
In Python you can invoke a module with the python3 -m <module> construction.

    $ python3 -m pizza
    pizza.py module name is __main__
    Carbonara is the most awesome pizza.
But packages can also be invoked this way:
    $ python3 -m pizzapy
    /usr/bin/python3: No module named pizzapy.__main__; 'pizzapy' is a package and cannot be directly executed
As you can see, it needs a __main__ module, so let’s implement it:

    # pizzapy/__main__.py
    from pizzapy.menu import MENU

    print('Awesomeness of pizzas:')
    for pizza in MENU:
        print(pizza.name, pizza.awesomeness())
And now it works:
    $ python3 -m pizzapy
    pizza.py module name is pizza
    Awesomeness of pizzas:
    Margherita 300
    Carbonara 9000
    Marinara 200

- __main__.py makes a package executable (invoke it with python3 -m package)
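The same mechanism is reachable from code through the standard library runpy module, which is what the -m switch is built on. The sketch below assumes a hypothetical package main_demo with a tiny __main__.py and runs it in-process; passing run_name='__main__' is what mimics the command line invocation.

```python
import runpy
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / 'main_demo'     # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / '__init__.py').write_text('')
(pkg_dir / '__main__.py').write_text(
    "GREETING = 'running as ' + __name__\n"
    "print(GREETING)\n"
)

sys.path.insert(0, str(root))
# Roughly what `python3 -m main_demo` does: locate main_demo/__main__.py
# and execute it with __name__ set to '__main__'.
result = runpy.run_module('main_demo', run_name='__main__')
print(result['GREETING'])
```

run_module returns the globals of the executed module, which is handy for poking at the result in a test.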
And the last thing I want to cover is the import of sibling packages. Suppose we have a sibling package pizzashop:

    .
    ├── pizzapy
    │   ├── __init__.py
    │   ├── __main__.py
    │   ├── menu.py
    │   └── pizza.py
    └── pizzashop
        ├── __init__.py
        └── shop.py

    # pizzashop/shop.py
    import pizzapy.menu
    print(pizzapy.menu.MENU)
Now, sitting in the top-level directory, if we try to invoke shop.py like this

    $ python3 pizzashop/shop.py
    Traceback (most recent call last):
      File "pizzashop/shop.py", line 1, in <module>
        import pizzapy.menu
    ModuleNotFoundError: No module named 'pizzapy'

we get an error saying that our pizzapy module is not found. But if we invoke it as a part of the package

    $ python3 -m pizzashop.shop
    pizza.py module name is pizza
    [<pizza.Pizza object at 0x7f372b59ccc0>, <pizza.Pizza object at 0x7f372b59ccf8>, <pizza.Pizza object at 0x7f372b59cda0>]

it suddenly works. What the hell is going on here?
The explanation for this lies in the Python module search path, and it’s described very well in the documentation on modules.
The module search path is a list of directories (available at runtime as sys.path) that the interpreter uses to locate modules. It is initialized with the path to the Python standard modules, the site-packages directories where pip puts everything you install globally, and also a directory that depends on how you run a module. If you run a module as a file, like python3 pizzashop/shop.py, the path to the containing directory (pizzashop) is added to sys.path. Otherwise, including when running with the -m option, the current directory (as in pwd) is added to the module search path. We can check this by printing sys.path in pizzashop/shop.py:
    $ pwd
    /home/avd/dev/python-imports
    $ tree
    .
    ├── pizzapy
    │   ├── __init__.py
    │   ├── __main__.py
    │   ├── menu.py
    │   └── pizza.py
    └── pizzashop
        ├── __init__.py
        └── shop.py
    $ python3 pizzashop/shop.py
    ['/home/avd/dev/python-imports/pizzashop', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/usr/local/lib64/python3.6/site-packages', '/usr/local/lib/python3.6/site-packages', '/usr/lib64/python3.6/site-packages', '/usr/lib/python3.6/site-packages']
    Traceback (most recent call last):
      File "pizzashop/shop.py", line 5, in <module>
        import pizzapy.menu
    ModuleNotFoundError: No module named 'pizzapy'
    $ python3 -m pizzashop.shop
    ['', '/usr/lib64/python36.zip', '/usr/lib64/python3.6', '/usr/lib64/python3.6/lib-dynload', '/usr/local/lib64/python3.6/site-packages', '/usr/local/lib/python3.6/site-packages', '/usr/lib64/python3.6/site-packages', '/usr/lib/python3.6/site-packages']
    pizza.py module name is pizza
    [<pizza.Pizza object at 0x7f2f75747f28>, <pizza.Pizza object at 0x7f2f75747f60>, <pizza.Pizza object at 0x7f2f75747fd0>]
As you can see, in the first case we have the pizzashop dir in our path, so we cannot find the sibling pizzapy package, while in the second case the current dir (denoted as '') is in sys.path and it contains both packages.
- Python has a module search path available at runtime as sys.path
- If you run a module as a script file, the containing directory is added to sys.path, otherwise, the current directory is added to it
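If you want to reproduce this without touching your own project, the following sketch does it in a temporary directory. The package shop_demo and its where module are hypothetical; the script simply prints sys.path[0], and we launch it both ways with subprocess. Keep in mind that the exact value in the -m case depends on the Python version: it may be an empty string, as in the output above, or the absolute path of the working directory.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / 'shop_demo'     # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / '__init__.py').write_text('')
(pkg_dir / 'where.py').write_text('import sys\nprint(sys.path[0])\n')


def first_path_entry(*args):
    """Run the interpreter from `root` and return the sys.path[0] it reports."""
    completed = subprocess.run(
        [sys.executable, *args],
        cwd=str(root), stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    return completed.stdout.strip()


as_script = first_path_entry(os.path.join('shop_demo', 'where.py'))
as_module = first_path_entry('-m', 'shop_demo.where')

print('run as a file:', repr(as_script))   # the directory that contains where.py
print('run with -m:  ', repr(as_module))   # '' or the working directory
```

In the first case only the shop_demo directory itself is searchable, so nothing that sits next to it can be imported.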
This problem of importing a sibling package often arises when people put a bunch of test or example scripts in a directory or package next to the main package, and there are StackOverflow questions about exactly this.
The good solution is to avoid the problem: put tests or examples in the package itself and use relative imports. The dirty solution is to modify sys.path at runtime (yay, dynamic!) by adding the parent directory of the needed package. People actually do this even though it’s an awful hack.
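For completeness, this is roughly what the dirty solution tends to look like. Everything here is a hypothetical layout made up for the sketch: a lib_demo package and a scripts_demo directory next to it, with a run.py script that pushes the common parent directory onto sys.path before importing.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / 'lib_demo').mkdir()              # hypothetical main package
(root / 'lib_demo' / '__init__.py').write_text('')
(root / 'lib_demo' / 'menu.py').write_text("MENU = ['Margherita']\n")
(root / 'scripts_demo').mkdir()          # hypothetical sibling directory with scripts
(root / 'scripts_demo' / 'run.py').write_text(
    "import sys\n"
    "from pathlib import Path\n"
    "# The hack: put the directory that holds lib_demo on the search path.\n"
    "sys.path.insert(0, str(Path(__file__).resolve().parent.parent))\n"
    "import lib_demo.menu\n"
    "print(lib_demo.menu.MENU)\n"
)

# Run the script as a plain file, the way that normally fails for sibling packages.
completed = subprocess.run(
    [sys.executable, os.path.join('scripts_demo', 'run.py')],
    cwd=str(root), stdout=subprocess.PIPE, universal_newlines=True, check=True,
)
output = completed.stdout.strip()
print(output)
```

It works, but every script now carries knowledge about the directory layout, which is exactly why it counts as a hack.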
I hope that after reading this post you’ll have a better understanding of Python imports and can finally decompose that giant script you have in your toolbox without fear. In the end, everything in Python is really simple, and even when it is not sufficient for your case, you can always monkey patch anything at runtime.
And on that note, I would like to stop and thank you for your attention. Until next time!