
Lessons learned from writing my first python package

Tobias Haindl (tobhai) ・4 min read

A short summary of the lessons I learned from writing my first Python packages: taggercore and taggercli.
More information about them can be found here:

Dependency management

Dependency management is crucial for any project.
I could not find a satisfying solution for separating package dependencies from dev dependencies until I stumbled across this Stack Overflow post.
I specified all package dependencies in setup.py.
The setup function provides an optional parameter called extras_require. This parameter can help with separating package dependencies from dev dependencies.

setup.py

setup(
    name="somepackage",
    version="1.0.0",
    install_requires=["boto3"],
    extras_require={"dev": ["pytest", "tox", "pytest-cov", "pytest-mock", "black"]}
)

As you can see above, I specified pytest, tox, black and two pytest plugins as my dev extras.
Now you can simply install the package dependencies including the dev extras by running pip install -e .[dev] (or in zsh: pip install -e ".[dev]").
If you do not need the dev dependencies, run pip install -e . instead.

Tox

extras_require shows its full power when combined with tox.
You only need to specify the key (e.g. "dev") of the extra you want to install. Tox makes sure to install all the dependencies in the environment.

tox.ini

[tox]
envlist = py38

[testenv]
extras = dev
commands = pytest 

In my opinion this method of dependency management is very convenient. It was important for me to keep the package dependencies separated from the dev dependencies.
How are you managing your Python dependencies?

Testing

Get to know mock assertions

At the start of the tagger project I was pretty new to writing tests in Python.
It took me quite a bit of time to understand the power of Python mock objects.
A mock object provides many useful assertions:

- assert_called
- assert_called_once_with
- assert_any_call
...

For a full list see the documentation.
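To make the assertions listed above concrete, here is a minimal sketch using a plain Mock from the standard library. The notifier object and its send method are made-up names for illustration and are not part of the tagger project.

```python
from unittest.mock import Mock

# Hypothetical collaborator; attribute access on a Mock returns another Mock.
notifier = Mock()

notifier.send("hello", retries=3)

# Passes because send() was called at least once.
notifier.send.assert_called()
# Passes because send() was called exactly once, with exactly these arguments.
notifier.send.assert_called_once_with("hello", retries=3)

notifier.send("goodbye")

# Passes because at least one of the recorded calls matches.
notifier.send.assert_any_call("hello", retries=3)

# assert_called_once_with now fails because two calls have been recorded.
try:
    notifier.send.assert_called_once_with("goodbye")
    called_once = True
except AssertionError:
    called_once = False

call_count = notifier.send.call_count
```

A failing assertion raises AssertionError, which is what makes these methods usable directly inside a test function.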

If you need to assert multiple calls you can use:
assert_has_calls
and pass in a list of unittest.mock.call objects, e.g.:

taggercore/.../test_region_tagger/test_should_split_resources

# shortened for readability
mocked_init_client.return_value.tag_resources.assert_has_calls(
            [
                call(
                    ResourceARNList=[
                        "some-arn-1",
                        "some-arn-2"
                    ],
                    Tags=expected_tags
                ),
                call(
                    ResourceARNList=[
                          "some-arn-3",
                          "some-arn-4"
                    ],
                    Tags=expected_tags
                )
            ],
            any_order=True
)
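The effect of any_order can be tried out in isolation. The following is only a sketch: client is a bare Mock standing in for the patched client from the test above, and the tag values are placeholders.

```python
from unittest.mock import Mock, call

# Stand-in for the mocked client (assumption: a bare Mock is enough here).
client = Mock()
expected_tags = {"Project": "CoolProject"}  # placeholder tag values

# The code under test happens to tag the second batch first.
client.tag_resources(ResourceARNList=["some-arn-3", "some-arn-4"], Tags=expected_tags)
client.tag_resources(ResourceARNList=["some-arn-1", "some-arn-2"], Tags=expected_tags)

expected_calls = [
    call(ResourceARNList=["some-arn-1", "some-arn-2"], Tags=expected_tags),
    call(ResourceARNList=["some-arn-3", "some-arn-4"], Tags=expected_tags),
]

# With any_order=True only the presence of each call matters.
client.tag_resources.assert_has_calls(expected_calls, any_order=True)

# Without it, the expected calls must appear in exactly this sequence.
try:
    client.tag_resources.assert_has_calls(expected_calls)
    order_matched = True
except AssertionError:
    order_matched = False
```

Setting any_order=True is useful when the batching order is an implementation detail the test should not pin down.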

Understand how patching works

Patching
Patching allows you to manipulate the behaviour of an object at runtime. It is a very powerful way to mock out dependencies during testing.

If you are coming from a Java background, like me, you might know the handy Mockito library, which can be used to create mock objects.
In Python this can be achieved with the patch function from unittest.mock. However, it is a bit tricky to get the patch target right.
I can highly recommend this PyCon talk about Demystifying the Patch Function.
The gist: "patch where the object is used - not where it is created"
Let me clarify this statement with an example:

The taggercli package uses classes and functions created in taggercore.
In taggercli/commands/tag you can find:
from taggercore.usecase import scan_region_and_global
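How would we mock this call?
We patch the name where it is used, taggercli.commands.tag.scan_region_and_global, and NOT where it is defined, taggercore.usecase.scan_region_and_global. The from-import binds its own reference to the function inside taggercli.commands.tag, so replacing the attribute on taggercore.usecase would leave that reference untouched.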

scan_mock = mocker.patch(
        "taggercli.commands.tag.scan_region_and_global",
        return_value=scanned_resources,
)

(mocker is a fixture from the pytest-mock plugin providing a nice API to the mock package)
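The difference between the two patch targets can be reproduced without the tagger packages. The sketch below builds two throwaway in-memory modules, demo_core and demo_cli (hypothetical names), and uses unittest.mock.patch directly instead of the mocker fixture.

```python
import sys
import types
from unittest.mock import patch

# "demo_core" defines the function, like taggercore.usecase does.
core = types.ModuleType("demo_core")
exec("def scan():\n    return 'real'\n", core.__dict__)
sys.modules["demo_core"] = core

# "demo_cli" imports it with a from-import, like taggercli.commands.tag does.
cli = types.ModuleType("demo_cli")
sys.modules["demo_cli"] = cli
exec("from demo_core import scan\n\ndef run():\n    return scan()\n", cli.__dict__)


def run_with_patch(target):
    """Call demo_cli.run() while the given target is patched."""
    with patch(target, return_value="mocked"):
        return cli.run()


# Patching where the function is defined does not reach demo_cli's own reference.
patched_at_definition = run_with_patch("demo_core.scan")
# Patching where the function is used replaces the name that run() looks up.
patched_at_use = run_with_patch("demo_cli.scan")
```

Here patched_at_definition is still "real", while patched_at_use is "mocked".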

If you are stuck debugging a wrongly patched mock object, the built-in dir() function can be helpful.
Called on a module, it lists the names bound in that module, which hints at how the functions and classes are accessed in your Python file.
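As a small illustration of the dir() hint, the sketch below inspects a throwaway module (demo_module is a made-up name) that uses both import styles.

```python
import types

# Throwaway module standing in for the file under test.
demo = types.ModuleType("demo_module")
exec("from json import loads\nimport os.path\n", demo.__dict__)

# Names bound in the module, ignoring dunder entries.
public_names = [name for name in dir(demo) if not name.startswith("_")]
```

The result contains loads and os: the from-import created a local name that could be patched as demo_module.loads, while import os.path only bound os, so its functions are reached through that module object.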

Embrace the power of pytest fixtures

Inject fixtures into tests

As you can see, especially in taggerlambda/test/test_lambda.py, I made extensive use of pytest fixtures.
I used them to provide simple "fake" data:

taggerlambda/test/test_lambda.py

@pytest.fixture(scope="module")
def tagging_result(regional_resources, global_resources) -> TaggingResult:
    successful_arns = [resource.arn for resource in regional_resources] + [
        resource.arn for resource in global_resources
    ]
    yield TaggingResult(successful_arns, {})

Now the fixture can be used in tests by adding its name to the test parameter list:

taggerlambda/test/test_lambda/test_lambda_in_tag_mode_env

def test_lambda_in_tag_mode_env(
        self,
        tagging_result # the object is injected by pytest automatically
        .....
    ):
    .....

    mocked_perform_tagging = mocker.patch("src.tagging_lambda.perform_tagging")
    mocked_perform_tagging.return_value = tagging_result

    ....

Inject fixtures into fixtures

Do you need a more complex object in your tests consisting of other subobjects?
This can easily be done by reusing a fixture in another fixture.
Take a look at the following example:
Each resource object can have a list of tag objects:

taggerlambda/test/test_lambda.py

@pytest.fixture(scope="module")
def tags() -> List[Tag]:
    yield [
        Tag("Project", "CoolProject"),
        Tag("Owner", "Fritz")
    ]

The tags fixture can then be reused in another fixture:

@pytest.fixture(scope="module")
def regional_resources(tags) -> List[Resource]:
    yield [
        Resource("some-arn-1", "someq", "queue", tags),
        Resource("some-arn-2", "someq2", "queue", tags)
    ]

Manipulate env variables

If you are using environment variables in the code under test, you can easily provide a fully configured environment via a pytest fixture.

@pytest.fixture(scope="function")
def env_for_tag_mode_env(monkeypatch):
    config = {
        "ACCOUNT_ID": "111111111111",
        "TAG_MODE": "ENV"
    }
    monkeypatch.setenv("ACCOUNT_ID", config["ACCOUNT_ID"])
    monkeypatch.setenv("TAG_MODE", config["TAG_MODE"])
    yield config

I'm yielding the config dict to let the assertion part of the test code know how the environment is currently configured.

The monkeypatch fixture is included in pytest and comes with useful methods like setenv and delenv. Changes made through it are undone automatically when the test finishes.
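For code that runs outside pytest, a similar temporary environment can be sketched with patch.dict from the standard library. This is an alternative technique, not the approach used in the tagger tests, and read_tag_mode is a hypothetical stand-in for the code under test.

```python
import os
from unittest.mock import patch

config = {"ACCOUNT_ID": "111111111111", "TAG_MODE": "ENV"}


def read_tag_mode():
    # Hypothetical code under test that reads its settings from the environment.
    return os.environ.get("TAG_MODE")


before = read_tag_mode()

# Inside the block os.environ contains the configured values.
with patch.dict(os.environ, config):
    inside = read_tag_mode()

# On exit the previous environment is restored.
after = read_tag_mode()
```

As with monkeypatch, the original environment comes back afterwards, so tests cannot leak settings into each other.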


That's all from me for now!
Thanks for reading my article. I hope you can benefit from my summary :)
