mistermocha

Posted on May 10, 2017

Python unit testing with Mock - Part Two

#python #testing #mock #beginners

How do you write tests?

Python has many frameworks for writing tests, but ultimately they come down to collecting functions whose names begin with test from an organized namespace. This article, being the sequel to "Python unit testing with Mock - Part One", isn't here to help organize your namespace (for the record, that's best done with py.test which deserves its own glorious article).

This article will show you how to write tests, and understand when and how to use the mock module. It's one thing to just know "oh, that's how a mock works" but another entirely to know how to apply a mock to test your code properly. This article seeks to address that bit of knowledge.

How do I actually use mock now?

Having all the tools doesn't necessarily mean knowing how to use them! When I first read the documentation for mock, I found it baffling. I really didn't understand how to put it all together. Here are some practical-ish examples of how to read your code and decide how to craft a test.

Example #1

Let's touch upon an example similar to one we witnessed in the previous article. Here's a function that uses the requests library.

import requests

def get_example():
    r = requests.get('http://example.com/')
    return r.status_code == 200

In this function, r is the response object returned by requests.get, and we check its status_code attribute. The requests library is what talks to the outside world, but our code that uses requests is what we want to test.

What we need to do is replace the return of requests.get() with a mock. We're going to use a variety of the features of mock to replace it properly.

from unittest import mock

@mock.patch('requests.get', autospec=True)
def test_get_example_passing(mocked_get):
    mocked_req_obj = mock.Mock()
    mocked_req_obj.status_code = 200
    mocked_get.return_value = mocked_req_obj
    assert get_example()

    mocked_get.assert_called()
    mocked_get.assert_called_with('http://example.com/')

Let's walk through this to show what we're doing and why:

mock.patch patches requests.get with a mock (meaning the get function in the requests library) using autospec=True to match the signature of requests.get.
This patched object is made available as the first argument, which we receive in mocked_get.
We know we need to create a return to feed into r in the code we're testing, so we create mocked_req_obj. In the code, we only examine the status_code, so we declare that value - in this case, 200.
We then assign mocked_req_obj to the return_value of mocked_get, which is what will be returned when the patched function gets called.
Since we know the function effectively returns boolean, we can call a simple assert on the function's return.
To ensure the mock was called properly, we call the built-in assertion methods. assert_called ensures the mocked requests.get was actually called, and assert_called_with checks the args. We can do similar introspection on the returned mock as well.
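The introspection mentioned in the last step can be sketched roughly as follows. This is a minimal illustration rather than part of the original test: mocked_get here is a plain Mock standing in for the patched requests.get, and the json() call is a hypothetical extra used only to show that the returned mock records its own usage too.

```python
from unittest import mock

# A plain Mock standing in for the patched requests.get (assumption for this sketch)
mocked_get = mock.Mock(name='requests.get')
mocked_get.return_value.status_code = 200

response = mocked_get('http://example.com/')

# The patched function records how it was called
assert mocked_get.call_count == 1
assert mocked_get.call_args == mock.call('http://example.com/')

# The object it returned is itself a mock, so it records calls as well
assert response is mocked_get.return_value
response.json()  # hypothetical extra call, not used by get_example
response.json.assert_called_once_with()
```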

Now this seems like a lot of test code (8 lines) for a small function (3 lines). It's often the case that a Python unit test has more lines of code than the code it's testing. We don't even have complete test coverage, because 200 isn't the only valid response! Still, test code should be faster to write than product code, since it doesn't involve the same level of thought or design.

This was only an example to show how to design a mock to replace an interface library. The principle here is to understand what part of your code talks to the public and do the minimum to replace it.
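To close the coverage gap noted above (200 isn't the only possible response), one option is to drive the same mock with several status codes. The sketch below is self-contained so it runs without the requests library installed: the requests name is a stand-in namespace, and check_status is a hypothetical helper, neither of which comes from the original code.

```python
from types import SimpleNamespace
from unittest import mock

def _real_get(url):
    raise RuntimeError("network access is not allowed in unit tests")

# Stand-in for the requests module (assumption), so the sketch needs no network
requests = SimpleNamespace(get=_real_get)

def get_example():
    r = requests.get('http://example.com/')
    return r.status_code == 200

def check_status(status_code):
    """Hypothetical helper: run get_example against a mocked response."""
    with mock.patch.object(requests, 'get', autospec=True) as mocked_get:
        mocked_get.return_value = mock.Mock(status_code=status_code)
        result = get_example()
        mocked_get.assert_called_once_with('http://example.com/')
    return result

def test_get_example_passing():
    assert check_status(200)

def test_get_example_failing():
    # Anything other than 200 should make the function return False
    assert not check_status(404)
    assert not check_status(500)
```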

Example #2

It's called a "mock" because it can pretend to be something real. We know our code, so now we can examine our code to determine how to use a mock. We have a good overview of the parts and usage, but implementation is everything.

When looking at your code and deciding how to test, you want to run a test that ensures the desired outputs and side effects happen when executing your code. Ideally, at least one test, if not several, should exercise every line of code. So, tests should cover each function, and mocks should prevent those functions from hitting the outside world.

Here's an example of a module that writes to a database:

class DBWriter(object):
    counter = 0

    def __init__(self):
        self.db = DBLibrary()

    def commit_to_db(self, sql):
        self.counter += 1
        self.db.commit(sql)

    def save(self, string):
        sql = "INSERT INTO mytable SET mystring = '{}'".format(string)
        self.commit_to_db(sql)

    def drop(self, string):
        sql = "DELETE FROM mytable WHERE mystring = '{}'".format(string)
        self.commit_to_db(sql)

Now this module uses a made-up API module called DBLibrary. Let's assume DBLibrary is an external library that is already trusted to work. We know that such a library has a couple features to note:

It is outside of this code
It may already have its own tests - even if it doesn't, refer to previous point
It can change state in a database

Those are all perfectly qualifying characteristics for something ripe for being mocked away. Now that said, we need to write tests for DBWriter, which is OUR library. Let's look at the behaviors of this module. The effective save and drop actions do the following:

Prepare the sql statement
Write the statement to the database
Increment the counter

All of these behaviors are elements to test, but we still want to protect the database from any writes. How do we do it?

Model #1 - Patch `commit_to_db`

The actual writing to the database happens in the commit_to_db function. We can replace this function with a mock and replicate the behavior. It does two things:

Increments the counter
Calls self.db.commit to commit to the database

Here's how you write that test:

from unittest import mock

from dbwriter import DBWriter

@mock.patch('dbwriter.DBWriter.commit_to_db', autospec=True)
def test_save(mock_commit):
    writer = DBWriter()

    def fake_commit(self, sql):
        # Reproduce the counter increment that the real method performs
        self.counter += 1

    mock_commit.side_effect = fake_commit

    writer.save("Hello World")

    mock_commit.assert_called_with(writer,
        "INSERT INTO mytable SET mystring = 'Hello World'")
    assert writer.counter == 1

Note the namespace assumes DBWriter lives in a file named dbwriter.py or a similarly organized module. You'll need to know your module's namespace for this patch to work.

The @mock.patch decorator replaces the commit_to_db function in place. We declare autospec=True which matches the signature to ensure commit_to_db is called correctly.

We also add a side effect, which is something that happens when the mock is called, beyond returning a value. commit_to_db doesn't just write to a database, it also increments a counter. Adding this behavior as a side effect ensures the action isn't omitted when the mock is called. Since we're replacing the function with a mock, the actual code of that function never gets called.
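For reference, side_effect accepts more than a function. The short sketch below walks through the three forms the mock library supports, using a throwaway Mock named m that is not part of the DBWriter example.

```python
from unittest import mock

m = mock.Mock()

# 1. A function: called with the mock's arguments, and its return value is used
m.side_effect = lambda x: x * 2
doubled = m(3)

# 2. An exception class or instance: raised when the mock is called
m.side_effect = ValueError("boom")
try:
    m(3)
    raised = False
except ValueError:
    raised = True

# 3. An iterable: each call returns the next item
m.side_effect = [1, 2]
first, second = m(), m()
```

The test above uses the first form, since it needs to run real code (the counter increment) each time the mock is called.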

Calling the mock in this way gives us the following benefits:

Introspection into commit_to_db: we can call mock's introspection tools to ensure that the function is receiving the right arguments
Insulation from write actions: We don't bring up an equivalent action to self.db.commit in the test, which means we've removed the action that can potentially write to the database.

However, this comes with some drawbacks:

Lost introspection into self.db.commit: We may be insulated, but we don't know if we're calling the function right.
No exercise of the actual commit_to_db: We're re-writing the non-state-changing code in our unit test! If that line were something more complex, or if there was a ton of code in here, we'd have to reimplement it all!

Now let's try another approach:

Model #2 - Patch `self.db.commit`

@mock.patch('dbwriter.DBLibrary', autospec=True)
def test_save(mock_dblib):
    writer = DBWriter()
    writer.save("Hello World")
    # commit is called on the mocked instance, so only the SQL is passed
    mock_dblib.return_value.commit.assert_called_with(
        "INSERT INTO mytable SET mystring = 'Hello World'")

Note that we're patching DBLibrary where it's looked up, in the dbwriter namespace, so other modules that import the library are left untouched. This is direct surgery.

Doing this replaces DBLibrary with a mock. Again, autospec=True ensures that calls to commit and any other implemented methods must match the signatures of the real methods.
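To see what that signature checking buys you, here is a small sketch using mock.create_autospec, which is what patch uses under the hood when autospec=True is given. The DBLibrary class below is a hypothetical stand-in with an assumed commit(sql) method, since the real library in this article is made up.

```python
from unittest import mock

class DBLibrary(object):
    """Hypothetical stand-in for the external library."""
    def commit(self, sql):
        raise RuntimeError("would write to a real database")

mock_cls = mock.create_autospec(DBLibrary)
db = mock_cls()          # shaped like an instance of DBLibrary
db.commit("SELECT 1")    # matches commit(self, sql), so it is accepted
db.commit.assert_called_once_with("SELECT 1")

try:
    db.commit()          # missing the sql argument
    signature_enforced = False
except TypeError:
    signature_enforced = True

try:
    db.rollback          # not defined on DBLibrary
    unknown_attr_rejected = False
except AttributeError:
    unknown_attr_rejected = True
```

A plain Mock would have accepted both the bad call and the unknown attribute silently, which is how tests end up passing against code that would fail in production.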

Here's what we get for testing this way:

Insulation from the database: same as before, just in a different way
Introspection into how we use the DBLibrary module: notice we're calling assert_called_with and determining how we call the function that actually writes to the database, with the same protections
Full exercise of every line of code in our test: only external code is replaced by the mock

Here's what we lose:

Introspection into the parent function: We don't see how the function is called by its neighbors inside the module

Luckily, both tests are valid and can be done in parallel. If you're not sure which test to write, write both! You can never have too much test coverage.
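Here is one way both models could sit side by side in a single runnable file. It is a sketch under a few assumptions: DBLibrary is a hypothetical stand-in, the dblib namespace plays the role of the module that DBWriter looks the library up in, and patch.object is used instead of a dotted path so the example doesn't depend on a file named dbwriter.py.

```python
from types import SimpleNamespace
from unittest import mock

class DBLibrary(object):
    """Hypothetical stand-in for the external library."""
    def commit(self, sql):
        raise RuntimeError("unit tests must not reach the database")

# Plays the role of the namespace DBWriter looks DBLibrary up in (assumption)
dblib = SimpleNamespace(DBLibrary=DBLibrary)

class DBWriter(object):
    counter = 0

    def __init__(self):
        self.db = dblib.DBLibrary()

    def commit_to_db(self, sql):
        self.counter += 1
        self.db.commit(sql)

    def save(self, string):
        sql = "INSERT INTO mytable SET mystring = '{}'".format(string)
        self.commit_to_db(sql)

EXPECTED_SQL = "INSERT INTO mytable SET mystring = 'Hello World'"

def test_save_model_1():
    # Model #1: replace commit_to_db and check how save() calls it
    with mock.patch.object(DBWriter, 'commit_to_db', autospec=True) as mock_commit:
        writer = DBWriter()
        writer.save("Hello World")
        mock_commit.assert_called_once_with(writer, EXPECTED_SQL)

def test_save_model_2():
    # Model #2: replace the library and let all of DBWriter's own code run
    with mock.patch.object(dblib, 'DBLibrary', autospec=True) as mock_dblib:
        writer = DBWriter()
        writer.save("Hello World")
        mock_dblib.return_value.commit.assert_called_once_with(EXPECTED_SQL)
        assert writer.counter == 1
```

Note that only the second test can check the counter without re-implementing the increment, because the real commit_to_db runs.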

Example #3

I'm often asked how to mock something that gets instantiated multiple times in an object. The class coming up instantiates another class twice (e.g., two separate database connections). Just naïvely patching the base class may give you one instance that gets used twice, or just spit out default mocks that don't help you at all. It took me some research and experimentation to figure out how to do this.
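The "one instance that gets used twice" problem is easy to reproduce. In this small sketch, mock_cls is a plain Mock standing in for a patched class; every call to it hands back the same return_value object.

```python
from unittest import mock

# A plain Mock standing in for the patched class (assumption for this sketch)
mock_cls = mock.Mock(name='AnotherThing')

first = mock_cls("this")
second = mock_cls("that")

# Both "instances" are the very same object
same_object = first is second

# So a call made through one shows up on the other
first.do()
second.do.assert_called_once_with()
```

That shared object is why assertions about which connection did what can pass for the wrong reason.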

So, here's some example code:

from some.library import AnotherThing

class MyClass(object):
    def __init__(self, this, that):
        self.this = AnotherThing(this)
        self.that = AnotherThing(that)

    def do_this(self):
        self.this.do()

    def do_that(self):
        self.that.do()

    def do_more(self):
        got_it = self.this.get_it()
        that_too = self.that.do_it(got_it)
        return that_too

self.this and self.that are both instances of AnotherThing but initialized with this and that respectively. They're both used throughout the code. We want to patch out AnotherThing so we can test our code in isolation.

We can take two approaches here, with some caveats. Ultimately, the options you choose have to do with how you implement the code.

Approach #1: Patch `this` and `that` directly

This approach works if you know for a fact that __init__ for MyClass and the initialization of AnotherThing don't actively change state:

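from unittest.mock import Mock

from some.library import AnotherThing
from yourcode import MyClass

def test_my_class():
    my_obj = MyClass("fake this", "fake that")
    # Pass the class itself; a string here would spec the mock against str
    my_obj.this = Mock(spec_set=AnotherThing)
    my_obj.that = Mock(spec_set=AnotherThing)

    my_obj.do_this()
    my_obj.this.do.assert_called()
    my_obj.do_that()
    my_obj.that.do.assert_called()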

In this approach, we instantiate and then patch the result in place. It's a straightforward approach that only works when you know your code fits the pattern. State changes in __init__ code are unlikely, but entirely possible. It's more likely that instantiating AnotherThing is slow and expensive, even if it doesn't change any external state. So while this approach may be less confusing, it's not necessarily better.

Approach #2: Mock Generator

Calling AnotherThing returns an instance of AnotherThing, so why not just change that action to spit out mocks?

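from unittest import mock

from some.library import AnotherThing  # the real class, used only as a spec
from yourcode import MyClass

@mock.patch('yourcode.AnotherThing', autospec=True)
def test_my_class(mock_thing):
    def fake_init(*args):
        # Return a new mock for every instantiation
        return mock.Mock(spec_set=AnotherThing)
    mock_thing.side_effect = fake_init

    my_obj = MyClass("fake this", "fake that")
    # Constructor arguments are recorded on the patched class
    mock_thing.assert_has_calls([mock.call("fake this"), mock.call("fake that")])
    assert my_obj.this is not my_obj.that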

mock_thing is the in-namespace patch to replace AnotherThing. We take advantage of using side_effect to return a spec_set mock. Now we've patched AnotherThing using mock.patch to spit out mocks that look like the objects it creates. All my patching happens before instantiation, so there's no modification of the code object being tested here.

Additionally, the mocks permit introspection, so we get to test usage as well!
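As a rough end-to-end sketch of the mock generator, the following runs without any external library. AnotherThing here is a hypothetical stand-in whose constructor deliberately fails, the library namespace plays the role of the module MyClass imports from, and create_autospec(..., instance=True) is swapped in for Mock(spec_set=...) so that method signatures are checked too.

```python
from types import SimpleNamespace
from unittest import mock

class AnotherThing(object):
    """Hypothetical stand-in for some.library.AnotherThing."""
    def __init__(self, name):
        raise RuntimeError("expensive setup that unit tests should avoid")
    def do(self): ...
    def get_it(self): ...
    def do_it(self, value): ...

# Plays the role of the namespace MyClass looks AnotherThing up in (assumption)
library = SimpleNamespace(AnotherThing=AnotherThing)

class MyClass(object):
    def __init__(self, this, that):
        self.this = library.AnotherThing(this)
        self.that = library.AnotherThing(that)

    def do_more(self):
        got_it = self.this.get_it()
        return self.that.do_it(got_it)

def test_do_more():
    with mock.patch.object(library, 'AnotherThing', autospec=True) as mock_thing:
        # A fresh instance-shaped mock for every instantiation
        mock_thing.side_effect = (
            lambda *args: mock.create_autospec(AnotherThing, instance=True))
        my_obj = MyClass("fake this", "fake that")

    mock_thing.assert_has_calls([mock.call("fake this"), mock.call("fake that")])
    assert my_obj.this is not my_obj.that

    # Each mock can now be configured and inspected independently
    my_obj.this.get_it.return_value = "it"
    my_obj.that.do_it.return_value = "done"
    assert my_obj.do_more() == "done"
    my_obj.that.do_it.assert_called_once_with("it")
    my_obj.this.do_it.assert_not_called()
```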

To mock or not to mock?

There are times when it's necessary and appropriate to mock, and we've covered some examples of when it's absolutely necessary for the scope of a unit test:

Calling an API
Connecting to a database
Using the command line

Those are some obvious examples to consider. Let's look at some less obvious ones, and a reason not to mock as well:

Don't mock the filesystem. A great explanation why can be found at Moshe Zadka's blog. The filesystem is huge. Really huge. Trying to mock it out would be a huge awful mock with a lot of edges to replace. If you need a filehandle, make a fake one using StringIO, or just write to a temporary file using tempfile. Both are standard library and easy to implement.
Acceptance tests are how we ensure our code works with the rest of enterprise imports and third-party libraries. Don't ignore that these libraries could change in unexpected ways. Write your unit tests, but write some acceptance tests for occasional runs.
Integration tests by definition are there to test how your code integrates with the rest of the environment. You will want to occasionally test the real API call at some point.
Mock slow libraries to speed up your unit tests. Some code is expensive to instantiate, but doesn't really need to come up fully for a test. Abstract this away.
If you're writing tests, you're doing it right!
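For the filesystem advice in the list above, here is a brief sketch of both alternatives to mocking. The count_lines function is a hypothetical piece of code under test, invented only for this illustration; io.StringIO and tempfile are the standard library tools mentioned.

```python
import io
import os
import tempfile

def count_lines(filehandle):
    """Hypothetical function under test: counts non-empty lines."""
    return sum(1 for line in filehandle if line.strip())

def test_count_lines_with_stringio():
    # An in-memory file object, no disk access at all
    fake_file = io.StringIO("one\n\ntwo\n")
    assert count_lines(fake_file) == 2

def test_count_lines_with_tempfile():
    # A real file in a throwaway directory that is cleaned up afterwards
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "data.txt")
        with open(path, "w") as f:
            f.write("one\n\ntwo\nthree\n")
        with open(path) as f:
            assert count_lines(f) == 3
```

StringIO works when the code accepts a file object; the temporary file is the fallback when the code insists on opening a path itself.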