I try to do all of my interactive Python development with either Jupyter notebooks or an IPython session. One of the main reasons I like these environments is the %autoreload
magic. What’s so special about %autoreload
and why does it often make development faster and simpler?
Why IPython and Jupyter?
Before going further, if you haven’t yet used both IPython and Jupyter, check out the ipython interactive tutorial first. It does a good job of explaining why using IPython is superior to the default Python interpreter. It has a host of useful features, but in this article I will only be talking about one feature (magics) and specifically one of those magics (%autoreload
). Jupyter notebooks, like IPython, support most of the same magics, so much of the tutorial will work in either an interactive IPython session or a Jupyter notebook session. One thing to note is that I’m talking about Python here, not other languages running in a Jupyter notebook.
What is a magic?
Magics are just special functions that you can call in your IPython or Jupyter session. They come in two forms: line and cell. A line magic is prefixed with one %
, a cell magic is prefixed with two, %%
. A line magic consumes one line, whereas a cell magic consumes the lines below the magic, allowing for more input. For this article, we’ll look at just one of the line magics, the %autoreload
magic.
Why autoreload?
The %autoreload
magic changes the Python session so that modules are automatically reloaded in that session before entering the execution of code typed at the IPython prompt (or the Jupyter notebook cell). What this means is that modules loaded into your session can be modified (outside your session), and the changes will be detected and reloaded without you having to restart your session.
This can be tremendously useful. Let me describe a typical scenario. Let’s say you have a Jupyter notebook that you’ve created and are enhancing, and you require data from several sources. You get the data by executing functions in modules you import at the beginning of your session, and those modules are Python code that you control. This will be a very typical use case for many users. Futhermore, let’s say in your notebook you load all the data into memory and this takes a full 5 minutes. You then start to work with the data and soon realize that you need slightly different data from one of the functions in one of the modules you control, so you need to add another parameter to query data differently. How do you
- Make this change
- Test this change
- Continue your work
In most cases you will open the underlying code in your editor or IDE, modify it, test it in another session (or with unit tests), then optionally install changes locally. But what about the notebook that already has some of the data already loaded? One way to continue your work is to restart your Jupyter kernel to pick up the changes you just made, reload all data into memory (taking 5 minutes at least), and then continue your work.
But there’s a better way, using autoreload
. In your Jupyter session, you first load the autoreload
extension, using the %load_ext
magic.
%load_ext autoreload
Now, the %autoreload
magic is available in your session. It can take a single argument that specifies how autoreload
ing of modules will behave. The extension also provides another magic, %aimport
, which allows for fine-grained control of which modules are affected by the autoreload. If no arguments are given to %autoreload
, then it will reload all modules immediately (except those excluded by %aimport
as seen below). You can run it once and then use your updated code.
The optional argument for autoreload
has three valid values:
- 0 – disable automatic reloading
- 1 – reload all the modules imported by
%aimport
every time before executing Python code that has been typed - 2 – reload all modules (except those excluded by
%aimport
) every time before executing Python code that has been typed
To regulate the modules affected by autoreload
, use the %aimport
magic. It works as follows:
- no arguments – lists the modules that will be imported or not imported
- with one argument – the module provided will be imported with
%autoreload 1
- with comma separated arguments – all modules in list will be imported with
%autoreload 1
- with a
-
before argument – that module will not be autoreloaded
For me, the most common way I use %autoreload
is to just include everything during my initial development work when I’m likely to be changing Python modules and notebook code (i.e. to run %autoreload 2
), and to not use it at all otherwise. But having the control can be useful, especially if you are loading a lot of modules.
Example
For a concrete example that you can use to follow along, make two Python files, auto.py
and auto2.py
, and save them alongside a Jupyter notebook with the imports below. Each of the Python files should have a simple function in them, as follows:
# in auto.py
def my_api(model, year):
# dummy result
return { 'model': model, 'year': year, }
# in auto2.py
def my_api2(model, year):
# dummy result
return { 'model': model, 'year': year, }
Now, let’s import both modules and inspect the API methods using the IPython/Jupyter help by appending a ?
to the function. You should see that imported module matches your code in the Python file.
import auto
import auto2
auto.my_api?
Signature: auto.my_api(model, year)
Docstring: <no docstring>
File: ~/projects/python_blogposts/tools/auto.py
Type: function
Now, in a separate editor, add a third argument (maybe have it take a third color
argument) to the auto.my_api
function. Save the file. Do we see it? Refresh the help cell to see.
No, not yet. Let’s turn on autoreload.
%autoreload 2
Now, when I inspect auto.my_api
, I see the new argument. It worked!
Now I can modify settings so that only the auto2
module is reloaded, not auto
. But first, let’s see the modules to reload and skip. By default, it includes all modules and skips none (because I used 2
as the initial argument).
%aimport
Modules to reload:
Modules to skip:
Let’s turn off auto
.
%aimport -auto
%aimport
Modules to reload:
Modules to skip:
auto
Now, if I modify the code in auto
, I shouldn’t see the changes in this session. Using %aimport
you can restrict which code is being reloaded.
Caveats
It’s important to note that module reloading is not perfect. You should not leave this on for production code, it will slow things down. Also, if you are live editing your code and leave it in a broken state, the most recent successfully loaded code will be the code running in your session, so it can make things confusing for you. This is probably not the way you want to modify large amounts of code, but when making incremental changes, it can work well.
To observe what broken code will look like, open the module that is being autoreloaded (auto2.py
) and add a syntax error (for example, maybe put in mismatched parens somewhere) and save the file, then execute the function from that module in a notebook cell. You should see autoreload
report a traceback of the syntax error in the cell. You’ll only see this error once, if you re-execute the cell it won’t show you the same error, but will use the version of the code last loaded.
Also, note that there are a few things that don’t work all the time, like removing functions from a module, changing a @property in a class to an ordinary method, or reloading C extensions. In those cases, you’ll need to restart your session. You can see more details in the docs.
Summary
If you’ve never used %autoreload
before, give it a try next time you have an IPython or Jupyter session with a lot of data in it and want to make a small change to a local module. Hopefully it will save you some time.
The post Using autoreload to speed up IPython and Jupyter work appeared first on wrighters.io.
Top comments (0)