loading...
Cover image for Python Module vs. Package

Python Module vs. Package

bowmanjd profile image Jonathan Bowman ・4 min read

A Python module is a single file, and a package is a folder with potentially multiple files.

(In Python culture, a "package" may also refer to a distribution package, such as is bundled up by setuptools. But that is not the usage in the context of this article.)

The module

Here is a single module, a file I named hello.py:

def hello():
  return "Hello"

def hello_there():
  return "Hello, there!"
Enter fullscreen mode Exit fullscreen mode

As demonstrated above, this module can have multiple members, such as functions or classes.

Place the file in the current working directory, run Python, and you can import and use the functions like this:

>>> import hello
>>> hello.hello()
'Hello'
>>> hello.hello_there()
'Hello, there!'
Enter fullscreen mode Exit fullscreen mode

The package

A Python package, on the other hand, is a directory/folder that can potentially have multiple modules (files), and even subpackages (more directories, more files). The directory tree may look as simple as this:

hello
└── __init__.py
Enter fullscreen mode Exit fullscreen mode

That's a directory called hello with a single __init__.py file in it. Yes, it must be named __init__.py (and is often pronounced "dunder init", short for "double underscore init double underscore").

If you place the simple hello functions above in the __init__.py
file inside of the hello directory, instead of in a hello.py file (delete hello.py if you still have it), your imports will work the same:

>>> import hello
>>> hello.hello()
'Hello'
Enter fullscreen mode Exit fullscreen mode

When do I use just a module vs. a package?

This is really a question of namespaces and organizational hierarchy. If your needs are really simple, a single module will do.

For instance, in the above scenario, we didn't really gain anything by moving our functions into the __init__.py file in a separate directory. In fact, we could argue that it added an additional but unnecessary layer.

But if you want to have submodules (and you often will), then a package is in order.

I don't like to put much more than ten functions in a single module, simply for readability. Usually, when a single module becomes more complex, that is a sign that some segmentation (into multiple files) is necessary. The discipline of segmenting concerns has often allowed me to think through my code, eliminate repetition, and increase reusability.

Structuring packages

I will simplify things here; you might also read this article that thoughtfully reflects on the topic.

With packages, we can segment into submodules. In this case, let's create two submodules of hello so that the structure looks like this:

hello/
├── __init__.py
├── formal.py
└── informal.py
Enter fullscreen mode Exit fullscreen mode

In formal.py, the following function resides:

def greet():
    return "Greetings!"
Enter fullscreen mode Exit fullscreen mode

And in informal.py:

def hey():
    return "Hey, there!"
Enter fullscreen mode Exit fullscreen mode

The next steps are really a question of user interface; to be more descriptive, the developer interface. How do you want developers using your package to access the package members?

Explicit namespaces

If you want the developer using your package to acknowledge and use the same segments you created for organization, then leave the __init__.py file empty. In fact, you can remove it altogether. The following usage is possible:

>>> import hello.informal
>>> import hello.formal
>>> hello.informal.hey()
'Hey, there!'
>>> hello.formal.greet()
'Greetings!'
Enter fullscreen mode Exit fullscreen mode

Note that import hello will not import informal and formal; the developer must name those modules similarly to the above in order to access the member functions.

If you remove the __init__.py, this kind of package is called a namespace package and is natively supported in Python 3.3 and later. The behavior is more or less the same as if you had a blank __init__.py file.

Convenience API

Usually, however, I like to use __init__.py to provide a more convenient interface to the developer. This also allows me to restructure my code later without breaking the developer API.

If we keep the __init__.py, and place the following code in it:

from .formal import greet
from .informal import hey
Enter fullscreen mode Exit fullscreen mode

Then the developer can access the member functions with one import:

>>> import hello
>>> hello.hey()
'Hey, there!'
>>> hello.greet()
'Greetings!'
Enter fullscreen mode Exit fullscreen mode

While hello.informal.hey() and hello.formal.greet() also still work, you can decide whether or not to encourage the developer to use them in that way.

And, yes, you could just use from .informal import * and the like if you want all of your member functions and classes to be dumped into the same namespace. I prefer to explicitly define which functions and classes are included, for cleanliness and control.

I hope the next Python code you write is usable in the long term, and a source of pride for you!

Discussion

pic
Editor guide