Recently I reached a breaking point with importing local packages in Python. I consistently found myself dealing with ModuleNotFoundErrors when starting a new project. A lot of it is my fault for never sitting down and properly understanding Python's import mechanism until now, but some of it comes from the plethora of answers and advice online that tell you how to fix a problem, but not why the fix works. I hope to bridge that gap today.
There are a few key questions I want to answer about this topic that I think will clear up most of the confusion.
- Is an __init__.py file needed in every folder?
- Is a package you import the same thing as the package you install from PyPI?
- Do pytest and Python's built-in unittest module import packages differently?
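The simple answers to these questions are:
- Yes if using Python 3.2 or earlier, and no if using Python 3.3 or newer.
- No, they are different even though they represent the same code.
- Yes. In the examples below, python -m unittest can import a package straight from the working directory, while pytest needs the package to be installed first.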
Although Python 2 reached end of life this year, its influence on Python devs has been profound. Even when I started writing Python several years ago, every project I worked with only supported Python 2, and in Python 2 an __init__.py file was required in every directory, otherwise the directory wasn't importable. You can test this for yourself. Create a directory structure as shown below, open up a Python REPL and try to import greeter.lang. Under Python 2, it doesn't work until you add an __init__.py to the lang folder.
    experiment/
    |_greeter
    | |_ __init__.py
    | |_lang
    | | |_en.py
    | | |_es.py
    | |
    | |_greeting.py
    |
    |_tests
    | |_test_greeting.py
This behaviour was present even in the early days of Python 3. PEP-420 finally removed that requirement in Python 3.3, but the habit of adding __init__.py files everywhere has persisted. Longtime Python developers never knew they could change it, and they have passed it down to newer devs, either directly or through historical answers on Q&A sites like Stack Overflow.
If you don't believe me that __init__.py files are no longer needed in every directory, you can read it for yourself in the Python 3 docs.
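The REPL experiment described earlier can also be scripted. The sketch below is only an illustration: it rebuilds part of the experiment layout in a temporary directory (the file contents and the GREETING name are made up for the demo) and imports greeter.lang.en on Python 3.3 or newer, even though lang has no __init__.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the layout; only greeter/ gets an __init__.py.
root = Path(tempfile.mkdtemp())
lang_dir = root / "greeter" / "lang"
lang_dir.mkdir(parents=True)
(root / "greeter" / "__init__.py").write_text("")
# Hypothetical module contents, invented for this demo.
(lang_dir / "en.py").write_text("GREETING = 'Hello'\n")

# Make the temporary directory importable, as the REPL's working directory would be.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

en = importlib.import_module("greeter.lang.en")
print(en.GREETING)  # Hello

# greeter.lang was imported as a namespace package, so no file backs it.
lang = sys.modules["greeter.lang"]
print(getattr(lang, "__file__", None))  # None
```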
I have some examples of common scenarios you'll find yourself in that I want to share with you, but before that, I figure we should define what a package actually is.
Paraphrasing from the setuptools documentation (setuptools is the Python lib for building and distributing PyPI packages), "A package in the context of PyPI is a distribution of bundled software. A package in the context of Python is a container of modules".
Simply put, a module is just a Python source code file.
Additionally, there are now two types of Python packages: "Regular Packages", which adhere to the old Python 3.2 and earlier requirements of a package (i.e. __init__.py in every directory), and "Namespace Packages", which follow the requirements from PEP-420, which bring a whole new meaning to being 420 friendly.
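If you want to check which kind of package you are dealing with, one option is to look at the import spec. This is a minimal sketch, not an official API for the purpose: package_kind is a helper name I made up, and demo_regular and demo_namespace are hypothetical packages created only for the demonstration.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def package_kind(name):
    """Classify an importable name as a regular package, namespace package or module."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(name)
    if spec.submodule_search_locations is None:
        return "module"
    # A namespace package has no __init__.py, so its spec has no real origin
    # (None on Python 3.7+, the string "namespace" on older versions).
    if spec.origin in (None, "namespace"):
        return "namespace package"
    return "regular package"


# Hypothetical packages, created only for this demo.
root = Path(tempfile.mkdtemp())
(root / "demo_regular").mkdir()
(root / "demo_regular" / "__init__.py").write_text("")
(root / "demo_namespace").mkdir()
(root / "demo_namespace" / "mod.py").write_text("VALUE = 1\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

print(package_kind("demo_regular"))        # regular package
print(package_kind("demo_namespace"))      # namespace package
print(package_kind("demo_namespace.mod"))  # module
```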
With that in mind, let's get down to the examples. Since most of my confusion came while I was following Test Driven Development, all of these examples use pytest and Python's built-in unittest module to run a test module that imports some other Python module/package of varying complexity.
The simplest case is where the test is in the same module as the code it is testing. The directory structure would look like this.
    example-1/
    |_ greeter.py

Both pytest greeter.py and python -m unittest greeter.py pass. Since there is nothing to import, this should be expected. If you want to see the source code for this example, it's available on GitHub.
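For readers who don't want to open the repository, here is a rough idea of what a single-file greeter.py with its test in the same module could look like. The function and test names are my own guesses for illustration, not the contents of the actual example.

```python
# greeter.py (hypothetical contents, for illustration only)
import unittest


def greet(name):
    """Return an English greeting for `name`."""
    return "Hello, {}!".format(name)


class TestGreet(unittest.TestCase):
    """Lives next to the code it tests, so there is nothing to import."""

    def test_greets_by_name(self):
        self.assertEqual(greet("World"), "Hello, World!")
```

Both runners collect TestGreet because it subclasses unittest.TestCase.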
A slightly more complex, but more common case is when there is a test module separate from the module being tested. The directory structure would look like this.

    example-2/
    |_greeter.py
    |_test_greeter.py

In this case pytest test_greeter.py and python -m unittest test_greeter.py once again pass, even without an __init__.py file. The source code for this example is also available on GitHub.
More complex projects need better organization than a single source code module and a single test module. This is where Python packages come into play. Examples 3, 4 and 5 all deal with the various ways a package can be structured (i.e. nesting the test module inside the source code package or keeping it separate from the source package).
This example covers the case where the test module is nested within the source code package. The directory structure looks like this.
    example-3/
    |_translator/
    | |_greeter.py
    | |_tests/
    | | |_test_greeter.py
This time only python -m unittest translator/tests/test_greeter.py passes the test. pytest translator/tests/ fails with the error ModuleNotFoundError: No module named 'translator'. That's because pytest, unlike python -m unittest, doesn't put the current working directory on sys.path. By default it only adds the directory containing the test file (when that directory has no __init__.py), so translator can't be found unless it has been installed as a package. You can get pytest to pass by adding an __init__.py file under the translator directory, adding a setup.py file under the example-3 directory, and running pip install . from there. You can see the source code here and the PR that fixes it here.
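One way to see why the two runners disagree is to ask Python's path-based finder whether translator can be located from each starting directory. The sketch below assumes, as described in pytest's documentation for its default import mode, that pytest adds the test file's own directory to sys.path, while python -m unittest searches the working directory. The findable helper and the file contents are made up for the demo.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path

# Recreate the example-3 layout in a temporary directory (no __init__.py anywhere).
root = Path(tempfile.mkdtemp())
tests_dir = root / "translator" / "tests"
tests_dir.mkdir(parents=True)
# Hypothetical file contents, invented for this demo.
(root / "translator" / "greeter.py").write_text("def greet():\n    return 'Hello'\n")
(tests_dir / "test_greeter.py").write_text("from translator import greeter\n")


def findable(name, search_dir):
    """Report whether `name` can be located when only `search_dir` is searched."""
    return PathFinder.find_spec(name, [str(search_dir)]) is not None


# python -m unittest: the project root (the working directory) is searched.
print(findable("translator", root))       # True
# pytest's default: only the test file's own directory is searched.
print(findable("translator", tests_dir))  # False
```

Installing the package sidesteps the question because site-packages is always on sys.path. Running python -m pytest instead of pytest is also documented to add the current directory to sys.path, though I haven't relied on that in these examples.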
This is the same as example 3 except the tests package is outside the source code package.
    example-4/
    |_ translator/
    | |_ greeter.py
    |
    |_ tests
    | |_ test_greeter.py

python -m unittest tests/test_greeter.py passes, but pytest tests/ fails with a ModuleNotFoundError. The fix is exactly the same as in example 3: add an __init__.py file under the translator directory and a setup.py file under the example-4 directory, then run pip install . from there. The source code can be seen here and the PR to fix it can be seen here.
Example 5 is a little more complex because there is a subpackage under the translator package. The directory structure looks like this.

    example-5/
    |_translator/
    | |_ greeter.py
    | |_ lang/
    | | |_ en.py
    | | |_ es.py
    |
    |_tests
    | |_test_greeter.py
As you've probably already guessed, python -m unittest tests/test_greeter.py passes and pytest tests/ fails with ModuleNotFoundError. If you apply the same fix from examples 3 and 4, pytest still fails, but this time with the error ModuleNotFoundError: No module named 'lang'. This error is caused by the setup.py file I've been using. In setup.py I use the function find_packages() from setuptools, which traverses the directory structure and tries to find packages, but it only looks for "Regular Packages". That's right, the ones that require an __init__.py file to be present. So to fix this example, an __init__.py needs to be added under the lang directory as well. Unless you used the -e flag in your pip install command, you'll need to re-run pip install . afterwards, since the __init__.py has to exist before the PyPI package is installed.
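To make the find_packages() behaviour concrete without depending on setuptools, here is a simplified stand-in written with the standard library. It is not setuptools' actual code, only a sketch of the same idea: a directory counts as a package, and is descended into, only if it contains an __init__.py. The name find_regular_packages is made up.

```python
import tempfile
from pathlib import Path


def find_regular_packages(where):
    """Return dotted names of directories under `where` that contain __init__.py.

    A simplified stand-in for the idea behind find_packages(), not its real code.
    """
    found = []

    def walk(directory, prefix):
        for entry in sorted(directory.iterdir()):
            if entry.is_dir() and (entry / "__init__.py").is_file():
                name = prefix + entry.name
                found.append(name)
                walk(entry, name + ".")

    walk(Path(where), "")
    return found


# Recreate the example-5 layout with the fix from examples 3 and 4 applied.
root = Path(tempfile.mkdtemp())
(root / "translator" / "lang").mkdir(parents=True)
(root / "tests").mkdir()
(root / "translator" / "__init__.py").write_text("")
(root / "translator" / "lang" / "en.py").write_text("")

before = find_regular_packages(root)
print(before)  # ['translator']

# Adding lang/__init__.py turns lang into a regular package that gets discovered.
(root / "translator" / "lang" / "__init__.py").write_text("")
after = find_regular_packages(root)
print(after)  # ['translator', 'translator.lang']
```

setuptools also ships find_namespace_packages(), which is meant to pick up directories without an __init__.py. I stuck with find_packages() here, so treat that as a pointer rather than something these examples exercise.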
I hope that can save you hours of debugging import issues so you can focus on the more enjoyable parts of coding. Here are some best practices I've taken away from this journey.
- __init__.py files should only be used when needed by setuptools or when a package needs some initial setup on import.
- Tests should remain outside the source code package (this can help with things like Docker images, which only want prod code).
- pip install -e . will pick up new subpackages that have an __init__.py without needing to re-run pip install .
A good next step would be to look into importlib, which exposes the implementation of the import statement.
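As a small taste of importlib, the snippet below uses two of its standard functions: import_module, which does programmatically what an import statement does, and find_spec, which reports how a name would be resolved. The json module is used only because it ships with Python.

```python
import importlib
import importlib.util
import sys

# import_module("json") does the same work as the statement `import json`.
json_module = importlib.import_module("json")
print(json_module is sys.modules["json"])  # True

# find_spec describes how a name resolves: where it lives and whether it is a package.
spec = importlib.util.find_spec("json")
print(spec.origin)
print(spec.submodule_search_locations is not None)  # True, because json is a package
```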
Thanks for reading. If you enjoy this content check out my other articles on DEV.to, or tune into the Namespace Podcast which I co-host with another TDD, Python junkie like myself.