edA‑qa mort‑ora‑y

Posted on • Originally published at mortoray.com

A Failed Experiment with Python Type Annotations

I like Python, but wish it had static typing. The added safety would go a long way to improving quality and reducing development time. So today I tried to make use of type annotations and a static type-checker called mypy.

After a few basic tests, I was excited. But my glee turned to disappointment rather quickly. There are two fundamental issues that make it an unusable solution.

  • You can’t have self-referential classes in the type annotations, thus no containers
  • You can’t have inferred return type values, thus requiring extensive wasteful annotations

Let’s look at both problems.

Self-Referential

I’m trying to render some articles for Interview.Codes with my MDL processor. The parser uses a Node class to create a parse tree. This class contains Node children as part of a tree structure. Logically, that means I’d have functions like below.

from typing import Sequence

class Node:

    def add_sub(self, sub: Node):
        ...

    def get_subs(self) -> Sequence[Node]:
        ...

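mypy has no trouble understanding this, but unfortunately it’s not valid Python code. The annotations are evaluated while the class body is still running, before the name Node exists, so Python raises a NameError. You can’t refer to Node within the Node class.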

The workaround suggested is using a TypeVar.

from typing import Sequence, TypeVar

NodeT = TypeVar('NodeT', bound='Node')

class Node:
    def add_sub(self, sub: NodeT):
        ...

    def get_subs(self) -> Sequence[NodeT]:
        ...

This is ugly. I’m reminded of C++’s _t pattern. Part of my attraction to Python is the simplified syntax. Having to decorate classes like this makes it far less appealing. Plus, it’s boilerplate code that adds overhead for anyone trying to understand it later.

The limitation in Python comes from Node not yet being in the symbol table. It doesn’t make it into the symbol table until after the class is processed, meaning you can’t use Node within the class. This is a limitation of the compiler. There’s no reason this needs to be this way, except perhaps for backwards compatibility with screwy old code.
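
The binding order described above can be seen directly. This is a minimal sketch, not part of the original code: it reads Node as an ordinary expression inside the class body, which fails because the name is only bound once the body has finished running. In Python 3.6 and 3.7, method annotations are by default evaluated at that same point.

```python
# Minimal sketch: the class name is bound only after the class body has run.
try:
    class Node:
        # Evaluated while the body executes, before "Node" exists as a name.
        self_type = Node
except NameError as exc:
    error_message = str(exc)
else:
    error_message = None

# Show whatever the interpreter reported for the failed lookup.
print(error_message)
```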

Perhaps we can’t use the class name. But we could have a Self or Class symbol that refers to the enclosing class.

No Inferred Return Types

One of the great values of Python is not having to put types everywhere. You can write functions like below.

def get_value():
    return 123

Now, if you’re using TypeScript or C++, the compiler can happily infer the return type of functions. For unknown reasons, mypy chooses not to infer the return types of functions. Instead, if there is no type annotation, it assumes the function returns type Any.

This means I must annotate all functions with information the static type checker already knows. It’s redundant and messy.

You’re additionally forced to learn the names and structure of all types, including ones you could otherwise safely ignore.

def get_iter():
    return iter(sequence)

def get_closure(self):
    return lambda q : self.op(q)

Why should I have to know the type that iter returns to write this function? Or do you have any idea what type get_closure returns? I know how to use the return, and can even reason it’s a function, but I’d have no idea how to specify its type. Knowing the myriad of types isn’t feasible. You’ll end up spending more time trying to tweak types than using the code.
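
For reference, here is one way the two functions above could be annotated with the standard typing module. It is only a sketch under assumptions that are not in the original code: sequence is taken to be a list of ints, and op is a method on a hypothetical Worker class that maps an int to an int.

```python
from typing import Callable, Iterator, List

# Assumption: the sequence holds ints.
sequence: List[int] = [1, 2, 3]


def get_iter() -> Iterator[int]:
    # iter() over a list of ints yields ints, hence Iterator[int].
    return iter(sequence)


class Worker:  # hypothetical class, only here to give get_closure a home
    def op(self, q: int) -> int:
        # Assumption: op maps an int to an int.
        return q * 2

    def get_closure(self) -> Callable[[int], int]:
        # Callable[[int], int]: takes one int argument and returns an int.
        return lambda q: self.op(q)
```

With these assumptions, the caller only needs to know that get_closure returns something callable with a single int argument.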

This complexity helped drive the introduction of the auto keyword to C++. There are many situations where writing the type information isn’t workable. This is especially true when dealing with parametric container classes.

Inferring return types is an essential feature.

Avoiding it for now

These two problems repeat throughout my codebase. I’m okay when there’s a limitation that occasionally affects the code, but this is fundamental. To use type checking, I’d have to add the redundant class declarations to every container-like class. To use type checking at all, I’d have to annotate the return value of all functions.

Static type checking should not be a tradeoff and there’s no fundamental reason these limitations can’t be lifted. When these are fixed, I’ll happily come back and use type annotations.


Image Credit: Mari Carmen

Top comments (10)

Evan Oman • Edited

Adding on to @victorosilva's comment, the annotations import is only available on Python 3.7+. For Python 3.6 and below, the workaround to the self-referential issue is to put the class name in single quotes, as discussed here:

class Position:
    ...
    def __add__(self, other: 'Position') -> 'Position':
        ...
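
Applied to the Node class from the article, the quoted form might look like the sketch below. The _subs list and the method bodies are assumptions added for illustration; List and Sequence come from the standard typing module.

```python
from typing import List, Sequence


class Node:
    def __init__(self) -> None:
        # Hypothetical storage for child nodes (not in the article).
        self._subs: List['Node'] = []

    def add_sub(self, sub: 'Node') -> None:
        # 'Node' is kept as a plain string, so nothing is looked up
        # while the class body is still being executed.
        self._subs.append(sub)

    def get_subs(self) -> Sequence['Node']:
        return self._subs


root = Node()
root.add_sub(Node())
```

A type checker resolves the quoted name once the whole class is known, while the interpreter simply stores the string.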
 
edA‑qa mort‑ora‑y

Oh, I guess I'll try this approach first, since Ubuntu is on Python 3.6.8.

Evan Oman

I also highly recommend conda environments. It is really nice to have per-project Python version and dependency specifications without messing with the system Python.

miniscruff

Python has venv so you can just run python -m venv {path}.

Victor Silva • Edited

I'm not familiar with mypy, but the self-referential problem can be solved with PEP 563's postponed evaluation of annotations:

from __future__ import annotations

This import won't be needed in Python 4.

edA‑qa mort‑ora‑y

I will check this out, since it'd shift the balance in the favour of sticking with Python (my head is running towards C++, perhaps foolishly).

Dimitri Merejkowsky

Thanks for the detailed article and the nice examples - it allows for an interesting discussion.

Here's my take on the subject (disclaimer: I've been using mypy in all my "serious" projects for a little more than a year now).

First, maybe you missed the fact that mypy has a reveal_type function precisely so that you don't have to remember or guess all the types.


sequence = [1, 2, 3]

def get_iter():
    res = iter(sequence)
    reveal_type(res)
    return res

$ mypy --strict foo.py
foo.py:26: error: Function is missing a type annotation
foo.py:28: error: Revealed type is 'typing.Iterator[builtins.int*]'

I know, this is a bit awkward to use, but it's there if you need it.

Second, I'm not quite sure why mypy does not infer return types - maybe it's a bug, maybe it's a design decision. The question came up for Rust by the way and I think the rationale may apply to Python too.

Finally, if you're still wondering whether to give mypy a go, I have compiled a real-world list of changes caused by the transition to static typing in this rather long blog post. You may find it interesting :)

Cheers!

edA‑qa mort‑ora‑y

Thank you.

I'm going to give it another go soon. The issue about self-referential types is essentially fixed in an upcoming Python release, and I can import from future now.

The return types are still annoying, but the reveal types will help. I got another comment that mypy may add an option to allow inferred return types.

At the moment, while doing a refactoring, I'm leaning towards the view that typing all the return types is the lesser evil compared to hunting down type mismatches.

Blaine Osepchuk • Edited

Thanks for sharing this.

I've been meaning to experiment with Python type annotations for a while but, based on the pain you're experiencing, I'll stay away for a while longer.

Moritz Weber

I haven't used it yet, but MonkeyType could help with that repetitive return value typing.