When writing Python code, there are many reserved words such as None, del, if, else, etc. These are words that carry special meaning to the Python interpreter and as such cannot be used as variable names. The keyword import is a reserved word used to bring outside Python code into the current scope within a file, or at least that is what it is most commonly used for. Did you know, however, that you can import anything you want into Python?
Imagine you have a .json file of recipes that you want to use inside a Python file. How would you load it? Maybe your first instinct would be to do the following:
import json

if __name__ == "__main__":
    # Read the file and parse it into Python objects
    with open("recipes.json") as f:
        recipes = json.load(f)
Now you have the recipes as a Python dict to use how you want. Now imagine that you could write it as the following instead:
import recipes
How can this be? As it turns out, the Python import system is built so that you can modify the behavior of the import statement and choose what happens when an import runs.
If you want to see some examples of what is possible using the methods described in this post, you can check out a project I wrote called Toadstool that implements many of these techniques. There I show how you can load GraphQL queries, JSON files, config files, CSV files, and more directly from an import statement.
Now, let's take a deeper dive into the Python import system.
The Python Import System
The full detail of the import system is described here, but for now I will give a high-level overview. As the docs state:
The import statement combines two operations; it searches for the named module, then it binds the results of that search to a name in the local scope
So, you have two different points at which you can modify the behavior of the import statement.
- You can modify its search function, or how it maps the name after the import statement to an action.
- You can modify the binding function, or what action to take once the "module" is found.
Finders and Loaders
The first step is known as Searching while the second step is known as Loading. Without modifying the import system, the default finders (objects that perform the searching operation) and loaders are used. As with all things in Python, the finders and loaders are just objects. You can even inspect them for yourself.
❯ python3
>>> import sys
>>> sys.meta_path
[<class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib_external.PathFinder'>]
All loaded modules are cached in a dict called sys.modules, which maps module names to the corresponding module objects. When importing a module with import X, Python will check if X is already in sys.modules and only resort to using its finders and loaders if X is not already present.
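The caching behavior is easy to check for yourself. Here is a minimal sketch using the standard library json module as the example (any module would do):

```python
import sys

# Importing a module stores the module object in sys.modules under its name.
import json

print("json" in sys.modules)        # the cache has an entry for "json"
print(sys.modules["json"] is json)  # the entry is the module object itself

# A second import is served from the cache, so the same object is bound again.
import json as json_again

print(json_again is json)
```

All three lines print True.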
Here is a brief flowchart of what I described above.
Customizing the Import System
💡 There are two import hooks that can be used to extend the import system behavior: meta hooks and import path hooks. We will only be focusing on meta path hooks in this post but you can read more about import hooks here.
The Meta Path
As you saw from the snippet above, finders (which may double as loaders) are stored in sys.meta_path. When import X is called, if X is not found in sys.modules, then Python will iterate over the objects stored in sys.meta_path and call each one's find_spec method until one of them finds X. If a finder returns a spec, then Python will use the loader attached to that spec, calling its exec_module method to perform the loading step. Python's default sys.meta_path has three meta path finders: one that knows how to import built-in modules, one that knows how to import frozen modules, and one that knows how to import modules from an import path (i.e. the path based finder).
So to modify how import works you can simply modify the contents of sys.meta_path to include finders and loaders that work the way you want. That means that you must create an object that implements the find_spec and exec_module methods and add it to sys.meta_path. Please note that if you wish to supersede default module loading behavior (typically for modules that end in .py), then you will either have to delete the default finders from sys.meta_path or prepend your custom finders so that they are invoked before the default ones.
💡 Within sys.meta_path you can implement a single object to perform both searching and loading, but it is important to understand that the operations are separate and constitute two different steps of the import process.
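To make the two steps concrete, here is a minimal sketch of a single object that acts as both finder and loader. The GreetingImporter class and the greeting module name are made up for illustration; the module exists only in memory, not on disk.

```python
import sys
from importlib.machinery import ModuleSpec


class GreetingImporter:
    """Hypothetical importer that serves one in-memory module named `greeting`."""

    def find_spec(self, fullname, path, target=None):
        # Searching step: claim only the one name we know about.
        if fullname == "greeting":
            return ModuleSpec(fullname, self)
        return None  # let the next finder on sys.meta_path try

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        # Loading step: bind names into the new module's namespace.
        module.message = "hello from a meta path hook"


# Prepend so this finder is consulted before the default ones.
sys.meta_path.insert(0, GreetingImporter())

import greeting

print(greeting.message)
```

This prints hello from a meta path hook. Every other import still works, because find_spec returns None for any name it does not recognize and the default finders take over.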
Custom Searching
To implement custom search behavior, we must have a class that implements the find_spec method. The full signature of the method is find_spec(fullname, path, target=None). Here, path relates to the parent package: if path is None then it is a top-level import, otherwise path is the parent package's __path__ attribute (the list of locations to search for submodules). fullname is the fully qualified name of the module being imported, so for import A.B.C, find_spec is called for "A", then "A.B", and finally with fullname set to "A.B.C". target is a module object that the finder may use to make a more educated guess about what spec to return. find_spec should either return None if the finder could not find the module, or a ModuleSpec if it is found. What it means to "find the module" is up to your implementation.
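You can watch these arguments arrive with a finder that only records them. This is a sketch under a few assumptions: RecordingFinder, demo_pkg, and sub are hypothetical names, and the package is created in a temporary directory purely for the demonstration.

```python
import pathlib
import sys
import tempfile

calls = []


class RecordingFinder:
    """Hypothetical finder that records its arguments and never finds anything."""

    @classmethod
    def find_spec(cls, fullname, path, target=None):
        calls.append((fullname, path))
        return None  # defer to the default finders


# Build a throwaway package on disk: demo_pkg/__init__.py and demo_pkg/sub.py.
root = pathlib.Path(tempfile.mkdtemp())
(root / "demo_pkg").mkdir()
(root / "demo_pkg" / "__init__.py").write_text("")
(root / "demo_pkg" / "sub.py").write_text("VALUE = 1\n")

sys.path.insert(0, str(root))
sys.meta_path.insert(0, RecordingFinder)

import demo_pkg.sub

for fullname, path in calls:
    print(fullname, path)
```

The first recorded call is for demo_pkg with path set to None (a top-level import). The second is for demo_pkg.sub, and its path is the package's __path__, a list containing the demo_pkg directory.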
So let's say we want to implement a finder for .json files. Assuming sys, pathlib, and ModuleSpec (from importlib.machinery) have already been imported, it might look something like
class JsonLoader():
    """
    Used to import Json files into a Python dict
    """
    @classmethod
    def find_spec(cls, name, path, target=None):
        """Look for Json file"""
        package, _, module_name = name.rpartition(".")
        filename = f"{module_name}.json"
        directories = sys.path if path is None else path
        for directory in directories:
            path = pathlib.Path(directory) / filename
            if path.exists():
                return ModuleSpec(name, cls(path))
⚠️ Note that we use the @classmethod decorator here. This is necessary so that Python does not have to instantiate a new JsonLoader object to invoke find_spec, since the class JsonLoader itself is what is stored in sys.meta_path.
Line 8 simply extracts the part of the fullname that we care about (everything after the last . in the name). Line 9 is where we specify that we're only looking for files of the given name that end in .json. Line 10 is where we specify which directories to search over. This is either sys.path if no path is passed to find_spec, or the path itself. Again, this is just the behavior we are choosing to implement here; you can use the parameters that are passed to find_spec however you see fit. Lastly, lines 11-14 check for the existence of the .json file and, if it is found, return a new ModuleSpec with the name and a new instance of the JsonLoader class initialized with the path (we will use this in the next step). If no file is found, the method falls through and implicitly returns None. Nothing here opens the file, let alone binds new variables. That is what will be done by the loader. Because of how we are constructing this class, JsonLoader will also serve as the loader, which is why we passed cls(path) to ModuleSpec on line 14, but you could also pass an instance of any other class that implements the exec_module method.
Custom Loading
Now that we have identified the .json file, we have to determine what to do with it. For that, we will use the same class as before and just implement a new method, exec_module, since our finder handed a JsonLoader instance to ModuleSpec.
class JsonLoader():
    """
    Used to import Json files into a Python dict
    """
    def __init__(self, path):
        """Store path to Json file"""
        self.path = path

    @classmethod
    def find_spec(cls, name, path, target=None):
        """Look for Json file"""
        package, _, module_name = name.rpartition(".")
        filename = f"{module_name}.json"
        directories = sys.path if path is None else path
        for directory in directories:
            path = pathlib.Path(directory) / filename
            if path.exists():
                return ModuleSpec(name, cls(path))

    def create_module(self, spec):
        """Returning None uses the standard machinery for creating modules"""
        return None

    def exec_module(self, module):
        """Executing the module means reading the JSON file"""
        with self.path.open() as f:
            data = json.load(f)
        fieldnames = tuple(_identifier(key) for key in data.keys())
        fields = dict(zip(fieldnames, [to_namespace(value) for value in data.values()]))
        module.__dict__.update(fields)
        module.__dict__["json"] = data
        module.__file__ = str(self.path)
On line 5 we implement the class's __init__ method to save the path that was found by the find_spec method. The path is not passed to exec_module by default, so this is a way of maintaining that information. On line 24 we define the exec_module method, which is where we perform any bindings. Here we read the JSON file, identify all keys from the JSON file, convert those names to valid Python variable names if needed (the _identifier and to_namespace helpers are not shown here), and update the __dict__ of the module being loaded (which stores the module's attributes) to contain the values from the JSON file.
Customized Result
Now append JsonLoader to sys.meta_path and assume you have the following employees.json file:
{
"employee": {
"name": "sonoo",
"salary": 56000,
"married": true
},
"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{"value": "New", "onclick": "CreateDoc()"},
{"value": "Open", "onclick": "OpenDoc()"},
{"value": "Save", "onclick": "SaveDoc()"}
]
}
}
}
Now you can load employees.json and use it as follows:
import employees
>>> employees.
employees.employee employees.json employees.menu
print(employees.menu)
> {'id': 'file', 'value': 'File', 'popup': {'menuitem': [{'value': 'New', 'onclick': 'CreateDoc()'}, {'value': 'Open', 'onclick': 'OpenDoc()'}, {'value': 'Save', 'onclick': 'SaveDoc()'}]}}
You can even specify only importing certain keys from the file.
from employees import employee
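This binds only employee in the current scope and does NOT bind the menu key. Note that the whole employees.json file is still read and the resulting module is cached in sys.modules; only the binding is selective.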
Conclusion
This was just a sample of what can be done by using Python's import hooks (and remember, it only covered one of the two available hooks). You can customize this behavior to your heart's desire. If you want to see a more generalized approach you can check out the project I mentioned at the top at https://github.com/acalejos/toadstool/. The package can be installed via pip install toadstool if you want to try out some of the loaders.
If you need more dynamic import behavior within your code, you can also look to importlib, which is described as having three main purposes:
- Provide the implementation of the import statement (and thus, by extension, the __import__() function) in Python source code.
- Expose the components to implement import, thus giving users the ability to create their own importers.
- Contain modules exposing additional functionality for managing aspects of Python packages.
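As a small taste of importlib, here is a sketch of two common dynamic-import patterns: importing by a name that is only known at runtime, and loading a source file from an explicit path. The plugin_example file is created in a temporary directory just for this demonstration.

```python
import importlib
import importlib.util
import pathlib
import tempfile

# 1. Import by name, where the name is only known at runtime.
module_name = "json"
module = importlib.import_module(module_name)
print(module.dumps({"ok": True}))

# 2. Load a source file from an explicit path, bypassing sys.path.
plugin_path = pathlib.Path(tempfile.mkdtemp()) / "plugin_example.py"
plugin_path.write_text("ANSWER = 42\n")

spec = importlib.util.spec_from_file_location("plugin_example", str(plugin_path))
plugin = importlib.util.module_from_spec(spec)  # create the module from the spec
spec.loader.exec_module(plugin)                 # run the loading step by hand
print(plugin.ANSWER)
```

The second pattern walks through the same two steps described in this post: a spec is produced first, and then its loader's exec_module populates the module.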
As you can see, the topic of the Python import system goes very deep, so I would encourage you to explore it further and gain a better understanding of it.