How do you write tests?
Python has many frameworks for writing tests, but ultimately they come down to collecting functions whose names begin with test from an organized namespace. This article, being the sequel to "Python unit testing with Mock - Part One", isn't here to help organize your namespace (for the record, that's best done with py.test, which deserves its own glorious article).
This article will show you how to write tests, and understand when and how to use the mock module. It's one thing to just know "oh, that's how a mock works" but another entirely to know how to apply a mock to test your code properly. This article seeks to address that bit of knowledge.
How do I actually use mock now?
Having all the tools doesn't necessarily mean knowing how to use them! When I first read the documentation for mock, I found it baffling. I really didn't understand how to put it all together. Here are some practical-ish examples of how to read your code and decide how to craft a test.
Example #1
Let's touch upon an example similar to one we witnessed in the previous article. Here's a function that uses the requests library.
import requests

def get_example():
    # Fetch the page and report whether the server answered 200 OK
    r = requests.get('http://example.com/')
    return r.status_code == 200
In this function, r holds the response object returned by requests.get(), and we check the status_code property on that object. The requests library is what talks to the outside world, but our code that uses requests is what we want to test out.
What we need to do is replace the return of requests.get() with a mock. We're going to use a variety of the features of mock to replace it properly.
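from unittest import mock

@mock.patch('requests.get', autospec=True)
def test_get_example_passing(mocked_get):
    # Stand-in for the response object that requests.get() would return
    mocked_req_obj = mock.Mock()
    mocked_req_obj.status_code = 200
    mocked_get.return_value = mocked_req_obj
    assert get_example()
    # Verify the patched function was called, and with the expected URL
    mocked_get.assert_called()
    mocked_get.assert_called_with('http://example.com/')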
Let's walk through this to show what we're doing and why:
- mock.patch patches requests.get (meaning the get function in the requests library) with a mock, using autospec=True to match the signature of requests.get.
- This patched object is made available as the first argument, which we receive in mocked_get.
- We know we need to create a return to feed into r in the code we're testing, so we create mocked_req_obj. In the code, we only examine the status_code, so we declare that value - in this case, 200.
- We then assign mocked_req_obj to the return_value of mocked_get, which is what will be returned when the patched function gets called.
- Since we know the function effectively returns a boolean, we can call a simple assert on the function's return.
- To ensure the mock was called properly, we call the built-in assertion methods. assert_called ensures the mocked requests.get was actually called, and assert_called_with checks the args. We can do similar introspection on the returned mock as well.
Now this seems like a lot of test code (8 lines) for a small function (3 lines). It's often the case that a Python unit test has more lines of code than the code it's testing. We don't even have complete test coverage, because 200 isn't the only valid response! Test code should be faster to write than product code, since it doesn't involve the same level of thought or design.
This was only an example to show how to design a mock to replace an interface library. The principle here is to understand what part of your code talks to the outside world and do the minimum to replace it.
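To close the coverage gap mentioned above, here is a minimal sketch of a test for the non-200 case. It assumes a small stand-in namespace named requests (with a hypothetical _network_get function) so it can run without the real library installed; in a real project you would patch 'requests.get' exactly as in the passing test. The helper run_with_status is also made up for illustration.

```python
from types import SimpleNamespace
from unittest import mock


def _network_get(url, **kwargs):
    # Hypothetical stand-in: fails loudly if a test ever reaches the "network"
    raise RuntimeError("unit tests must not reach the network")


# Assumption: a stand-in namespace so the sketch runs without installing requests
requests = SimpleNamespace(get=_network_get)


def get_example():
    r = requests.get('http://example.com/')
    return r.status_code == 200


def run_with_status(status_code):
    """Call get_example() with requests.get patched to answer status_code."""
    with mock.patch.object(requests, 'get', autospec=True) as mocked_get:
        # Keyword arguments to Mock() become attributes on the mock
        mocked_get.return_value = mock.Mock(status_code=status_code)
        result = get_example()
        mocked_get.assert_called_once_with('http://example.com/')
    return result


def test_get_example_failing():
    # Anything other than 200 should make get_example() return False
    for status_code in (301, 404, 500):
        assert run_with_status(status_code) is False
```

Only the status code changes between cases, so one small helper covers as many responses as you care to list.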
Example #2
It's called a "mock" because it can pretend to be something real. We know our code, so now we can examine our code to determine how to use a mock. We have a good overview of the parts and usage, but implementation is everything.
When looking at your code and deciding how to test, you want to run a test that ensures the desired outputs and side effects happen when executing your code. Ideally, you want to ensure at least one, if not multiple, tests happen for every line of code. So, tests should cover each function and mocks should prevent those functions from hitting the outside world.
Here's an example of a module that writes to a database:
class DBWriter(object):
    counter = 0

    def __init__(self):
        self.db = DBLibrary()

    def commit_to_db(self, sql):
        self.counter += 1
        self.db.commit(sql)

    def save(self, string):
        sql = "INSERT INTO mytable SET mystring = '{}'".format(string)
        self.commit_to_db(sql)

    def drop(self, string):
        sql = "DELETE FROM mytable WHERE mystring = '{}'".format(string)
        self.commit_to_db(sql)
Now this module uses a made-up API module called DBLibrary. Let's assume DBLibrary is an external library that is already trusted to work. We know that such a library has a couple of features to note:
- It is outside of this code
- It may already have its own tests - even if it doesn't, refer to previous point
- It can change state in a database
Those are all perfectly qualifying characteristics for something ripe for being mocked away. Now that said, we need to write tests for DBWriter, which is OUR library. Let's look at the behaviors of this module. The effective save and drop actions do the following:
- Prepare the sql statement
- Write the statement to the database
- Increment the counter
All of these behaviors are elements to test, but we still want to protect the database from any writes. How do we do it?
Model #1 - Patch commit_to_db
The actual writing to the database happens in the commit_to_db function. We can replace this function with a mock and replicate the behavior. It does two things:
- Increments the counter
- Calls self.db.commit to commit to the database
Here's how you write that test:
from unittest import mock
from dbwriter import DBWriter

@mock.patch('dbwriter.DBWriter.commit_to_db', autospec=True)
def test_save(mock_commit):
    writer = DBWriter()

    def fake_commit(self, sql):
        # Replicate the one non-database behavior of commit_to_db
        self.counter += 1

    mock_commit.side_effect = fake_commit
    writer.save("Hello World")
    mock_commit.assert_called_with(
        writer, "INSERT INTO mytable SET mystring = 'Hello World'")
    assert writer.counter == 1
Note the namespace assumes DBWriter is in a file named dbwriter.py or a similarly organized module. You'll need to know your module's namespace for this patch to work.
The @mock.patch decorator replaces the commit_to_db function in place. We declare autospec=True, which matches the signature to ensure commit_to_db is called correctly.
We also add a side effect, which is something that happens in addition to returning a value. commit_to_db doesn't just write to a database, it also increments a counter. Adding this behavior as a side effect ensures this action isn't omitted when the mock is called. Since we're replacing the function with a mock, the actual code of that function never gets called.
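If side_effect is still fuzzy, here is a small sketch of how it behaves on a plain Mock, separate from the DBWriter code. The names remember, fake_commit and call_failing_commit are made up for illustration. A callable side effect runs with the same arguments as the mock, and returning mock.DEFAULT hands control back to return_value; an exception as side effect is simply raised.

```python
from unittest import mock

seen_sql = []


def remember(sql):
    # Runs on every call to the mock, with the same arguments
    seen_sql.append(sql)
    return mock.DEFAULT  # tells the mock to fall back to return_value


fake_commit = mock.Mock(return_value="committed", side_effect=remember)
result = fake_commit("DELETE FROM mytable")


def call_failing_commit():
    # An exception instance (or class) as side_effect is raised on call
    failing_commit = mock.Mock(side_effect=ConnectionError("database is down"))
    try:
        failing_commit("DELETE FROM mytable")
    except ConnectionError as exc:
        return str(exc)
    return None
```

The exception form is handy for testing how your own code reacts when the outside world misbehaves.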
Calling the mock in this way gives us the following benefits:
- Introspection into commit_to_db: we can call mock's introspection tools to ensure that the function is receiving the right arguments.
- Insulation from write actions: we don't bring up an equivalent action to self.db.commit in the test, which means we've removed the action that can potentially write to the database.
However, this comes with some drawbacks:
- Lost introspection into self.db.commit: we may be insulated, but we don't know if we're calling the function right.
- No exercise of the actual commit_to_db: we're rewriting the non-state-changing code in our unit test! If that line were something more complex, or if there was a ton of code in here, we'd have to reimplement it all!
Now let's try another approach:
Model #2 - Patch self.db.commit
@mock.patch('dbwriter.DBLibrary', autospec=True)
def test_save(mock_dblib):
    writer = DBWriter()
    writer.save("Hello World")
    # self.db is mock_dblib.return_value, so commit receives only the SQL string
    mock_dblib.return_value.commit.assert_called_with(
        "INSERT INTO mytable SET mystring = 'Hello World'")
Note that we're patching DBLibrary directly in the dbwriter namespace, which leaves that library untouched everywhere else. This is direct surgery.
Doing this replaces DBLibrary with a mock. Again, autospec=True ensures that calls to commit and any other implemented methods must match the real signatures of those methods.
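To see what that signature matching looks like in practice, here is a rough sketch. Since DBLibrary is made up, the class below is a hypothetical stand-in with a single commit(sql) method, and outcome is an illustrative helper. mock.create_autospec builds roughly the same kind of mock that patch(..., autospec=True) puts in place.

```python
from unittest import mock


class DBLibrary:
    """Hypothetical stand-in for the made-up external library."""

    def commit(self, sql):
        raise RuntimeError("unit tests must not reach the real database")


def outcome(action):
    """Return the name of the exception an action raises, or 'ok'."""
    try:
        action()
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__
    return "ok"


mock_dblib = mock.create_autospec(DBLibrary)
db = mock_dblib()  # the same object as mock_dblib.return_value

# Matches commit(self, sql), so the call is accepted and recorded
good_call = outcome(lambda: db.commit("DELETE FROM mytable"))
# One argument too many for the real signature
too_many_args = outcome(lambda: db.commit("DELETE FROM mytable", "extra"))
# The real class has no rollback method, so neither does the mock
unknown_method = outcome(lambda: db.rollback())
```

A plain Mock would have accepted all three calls silently, which is exactly the kind of false confidence autospec is meant to prevent.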
Here's what we get for testing this way:
- Insulation from the database: same as before, just in a different way.
- Introspection into how we use the DBLibrary module: notice we're calling assert_called_with and determining how we call the function that actually writes to the database, with the same protections.
- Full exercise of every line of code in our test: only external code is replaced by the mock.
Here's what we lose:
- Introspection into the parent function: We don't see how the function is called by its neighbors inside the module
Luckily, both tests are valid and can be done in parallel. If you're not sure which test to write, write both! You can never have too much test coverage.
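For anyone who wants to run both models side by side, here is a single-file sketch. It makes a few assumptions: DBLibrary is a hypothetical stand-in class, the helper names run_model_1 and run_model_2 are made up, and because everything lives in one file, Model #2 swaps the DBLibrary name in this file's globals instead of using @mock.patch('dbwriter.DBLibrary', autospec=True).

```python
from unittest import mock

EXPECTED_SQL = "INSERT INTO mytable SET mystring = 'Hello World'"


class DBLibrary:
    """Hypothetical stand-in for the made-up external library."""

    def commit(self, sql):
        raise RuntimeError("unit tests must not reach the real database")


class DBWriter:
    counter = 0

    def __init__(self):
        self.db = DBLibrary()

    def commit_to_db(self, sql):
        self.counter += 1
        self.db.commit(sql)

    def save(self, string):
        sql = "INSERT INTO mytable SET mystring = '{}'".format(string)
        self.commit_to_db(sql)


def run_model_1():
    """Model #1: replace commit_to_db and re-create its counter bump."""
    def fake_commit(self, sql):
        self.counter += 1

    with mock.patch.object(DBWriter, 'commit_to_db', autospec=True) as mock_commit:
        mock_commit.side_effect = fake_commit
        writer = DBWriter()
        writer.save("Hello World")
        mock_commit.assert_called_once_with(writer, EXPECTED_SQL)
    return writer.counter


def run_model_2():
    """Model #2: replace DBLibrary and let the real commit_to_db run."""
    mock_dblib = mock.create_autospec(DBLibrary)
    # Assumption: swapping the name in this file's globals stands in for
    # @mock.patch('dbwriter.DBLibrary', autospec=True) in a real project.
    with mock.patch.dict(globals(), {'DBLibrary': mock_dblib}):
        writer = DBWriter()
        writer.save("Hello World")
    mock_dblib.return_value.commit.assert_called_once_with(EXPECTED_SQL)
    return writer.counter
```

Both helpers end with the counter at 1, but only Model #2 got there by running the real commit_to_db.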
Example #3
I'm often asked about how to mock something that gets instantiated multiple times in an object. A class coming up will instantiate a module twice (e.g., two separate database connections). Just naïvely patching the base module may give you one instance that gets used twice, or just spit out default mocks that don't help you at all. It took me some research and experimentation to figure out how to do this.
So, here's some example code:
from some.library import AnotherThing

class MyClass(object):
    def __init__(self, this, that):
        self.this = AnotherThing(this)
        self.that = AnotherThing(that)

    def do_this(self):
        self.this.do()

    def do_that(self):
        self.that.do()

    def do_more(self):
        got_it = self.this.get_it()
        that_too = self.that.do_it(got_it)
        return that_too
self.this and self.that are both instances of AnotherThing but initialized with this and that respectively. They're both used throughout the code. We want to patch out AnotherThing so we can decidedly test our code out.
We can take two approaches here, with some caveats. Ultimately, the options you choose have to do with how you implement the code.
Approach #1: Patch this and that directly
This approach works if you know for a fact that __init__ for MyClass and the initialization of AnotherThing don't actively change state:
from unittest.mock import Mock
from some.library import AnotherThing

def test_my_class():
    my_obj = MyClass("fake this", "fake that")
    # spec_set takes the class itself, not a dotted-path string
    my_obj.this = Mock(spec_set=AnotherThing)
    my_obj.that = Mock(spec_set=AnotherThing)
    my_obj.do_this()
    my_obj.this.do.assert_called()
    my_obj.do_that()
    my_obj.that.do.assert_called()
In this approach, we instantiate and then patch the result in place. It's a straightforward approach that only really works when you know that your code will fit the pattern. It's unlikely that state changes happen in __init__ code, but entirely possible. It's more likely that instantiating AnotherThing will be slow and expensive, even if it isn't changing any external state. So while this may be less confusing, it's not necessarily better.
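A quick sketch of what spec_set buys you here, using a hypothetical stand-in for AnotherThing and a made-up helper raises_attribute_error. One thing worth knowing: spec and spec_set expect the class or an instance; a dotted-path string is treated as an ordinary str object, so the resulting mock is shaped like a string.

```python
from unittest.mock import Mock


class AnotherThing:
    """Hypothetical stand-in for some.library.AnotherThing."""

    def do(self):
        raise RuntimeError("expensive call we want to avoid in unit tests")


def raises_attribute_error(action):
    try:
        action()
    except AttributeError:
        return True
    return False


thing = Mock(spec_set=AnotherThing)
thing.do()  # allowed: AnotherThing really has a do method
thing.do.assert_called_once_with()

# Typos are caught on both read and write
typo_on_read = raises_attribute_error(lambda: thing.doo)
typo_on_write = raises_attribute_error(lambda: setattr(thing, "doo", 1))

# A string spec produces a mock shaped like str, which has no do method
string_spec = Mock(spec_set='some.library.AnotherThing')
string_spec_lacks_do = raises_attribute_error(lambda: string_spec.do)
```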
Approach #2: Mock Generator
Calling AnotherThing returns an instance of AnotherThing, so why not just change that action to spit out mocks?
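from unittest.mock import Mock, patch
from some.library import AnotherThing

@patch('yourcode.AnotherThing', autospec=True)
def test_my_class(mock_thing):
    def fake_init(*args):
        # Hand back a fresh mock, shaped like AnotherThing, per instantiation
        return Mock(spec_set=AnotherThing)
    mock_thing.side_effect = fake_init
    my_obj = MyClass("fake this", "fake that")
    # The patched class records how each instance was created
    mock_thing.assert_any_call("fake this")
    mock_thing.assert_any_call("fake that")
    assert my_obj.this is not my_obj.that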
mock_thing is the in-namespace patch to replace AnotherThing. We take advantage of side_effect to return a spec_set mock. Now we've patched AnotherThing using mock.patch to spit out mocks that look like the objects it creates. All my patching happens before instantiation, so there's no modification of the code object being tested here.
Additionally, the mocks permit introspection, so we get to test usage as well!
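Here is a single-file sketch of the mock generator that also exercises do_more. It rests on a few assumptions: AnotherThing is a hypothetical stand-in, the helper names are made up, and since there is no yourcode module in a one-file example, the AnotherThing name is swapped in this file's globals rather than patched with @patch('yourcode.AnotherThing', autospec=True).

```python
from unittest import mock


class AnotherThing:
    """Hypothetical stand-in for some.library.AnotherThing."""

    def __init__(self, name):
        self.name = name

    def get_it(self):
        raise RuntimeError("expensive call we want to avoid in unit tests")

    def do_it(self, value):
        raise RuntimeError("expensive call we want to avoid in unit tests")


class MyClass:
    def __init__(self, this, that):
        self.this = AnotherThing(this)
        self.that = AnotherThing(that)

    def do_more(self):
        got_it = self.this.get_it()
        return self.that.do_it(got_it)


def build_with_mock_generator():
    real_class = AnotherThing  # captured before the name is patched
    mock_thing = mock.create_autospec(AnotherThing)
    # Every "instantiation" hands back a fresh mock shaped like the real class
    mock_thing.side_effect = lambda *args: mock.Mock(spec_set=real_class)
    # Assumption: patching this file's globals stands in for
    # @patch('yourcode.AnotherThing', autospec=True) in a real project.
    with mock.patch.dict(globals(), {'AnotherThing': mock_thing}):
        my_obj = MyClass("fake this", "fake that")
    return mock_thing, my_obj


def check_mock_generator():
    mock_thing, my_obj = build_with_mock_generator()
    # The patched class remembers how each instance was requested
    mock_thing.assert_has_calls([mock.call("fake this"), mock.call("fake that")])
    assert my_obj.this is not my_obj.that
    # Program each instance separately, then check how they were chained
    my_obj.this.get_it.return_value = "it"
    my_obj.that.do_it.return_value = "done"
    result = my_obj.do_more()
    my_obj.that.do_it.assert_called_once_with("it")
    return result
```

Because this and that are separate mocks, each can be given its own return values and inspected on its own.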
To mock or not to mock?
There are times when it's necessary and appropriate to mock, and we've covered some examples of when it's absolutely necessary for the scope of a unit test:
- Calling an API
- Connecting to a database
- Using the command line
Those are some obvious examples to consider. Let's look at some less obvious ones, and a reason not to mock as well:
- Don't mock the filesystem. A great explanation why can be found at Moshe Zadka's blog. The filesystem is huge. Really huge. Trying to mock it out would be a huge awful mock with a lot of edges to replace. If you need a filehandle, make a fake one using StringIO, or just write to a temporary file using tempfile. Both are standard library and easy to implement.
- Acceptance tests are how we ensure our code works with the rest of enterprise imports and third-party libraries. Don't ignore that these libraries could change in unexpected ways. Write your unit tests, but write some acceptance tests for occasional runs.
- Integration tests by definition are there to test how your code integrates with the rest of the environment. You will want to occasionally test the real API call at some point.
- Mock slow libraries to speed up your unit tests. Some code is expensive to instantiate, but doesn't really need to come up fully for a test. Abstract this away.
- If you're writing tests, you're doing it right!
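As a small illustration of the filesystem advice, here is a sketch using io.StringIO for a fake filehandle and tempfile for a real but throwaway file. The functions count_lines and save_report are hypothetical code under test, invented for this example.

```python
import io
import os
import tempfile


def count_lines(handle):
    """Hypothetical code under test: works on any file-like object."""
    return sum(1 for _ in handle)


def save_report(path, lines):
    """Hypothetical code under test: writes lines to a real path."""
    with open(path, "w") as fh:
        fh.write("\n".join(lines))


def test_count_lines_with_stringio():
    # An in-memory filehandle, no disk and no mock required
    fake_file = io.StringIO("first\nsecond\nthird\n")
    assert count_lines(fake_file) == 3


def test_save_report_with_tempdir():
    # A real file in a directory that is removed when the block exits
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.txt")
        save_report(path, ["x", "y"])
        with open(path) as fh:
            assert fh.read() == "x\ny"
```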