I like Python, but wish it had static typing. The added safety would go a long way to improving quality and reducing development time. So today I tried to make use of type annotations and a static type-checker called mypy.
After a few basic tests, I was excited. But my glee turned to disappointment rather quickly. There are two fundamental issues that make it an unusable solution.
- You can’t have self-referential classes in the type annotations, thus no containers
- You can’t have inferred return type values, thus requiring extensive wasteful annotations
Let’s look at both problems.
Self-Referential
I’m trying to render some articles for Interview.Codes with my MDL processor. The parser uses a Node class to create a parse tree. This class contains Node children as part of a tree structure. Logically, that means I’d have functions like below.
class Node(object):
    def add_sub( self, sub : Node ):
        ...
    def get_subs( self ) -> Sequence[Node]:
        ...
mypy has no trouble understanding this, but it’s unfortunately not valid Python code. You can’t refer to Node within the Node class.
The workaround suggested is using a TypeVar.
from typing import Sequence, TypeVar

NodeT = TypeVar( 'NodeT', bound='Node' )
class Node(object):
    def add_sub( self, sub : NodeT ):
        ...
    def get_subs( self ) -> Sequence[NodeT]:
        ...
This is ugly. I’m reminded of C++’s _t pattern. Part of my attraction to Python is the simplified syntax. Having to decorate classes like this makes it far less appealing. Plus, it’s boiler-plate code adding overhead for later understanding.
The limitation in Python comes from Node not yet being in the symbol table. It doesn’t make it into the symbol table until after the class is processed, meaning you can’t use Node within the class. This is a limitation of the compiler. There’s no reason this needs to be this way, except perhaps for backwards compatibility with screwy old code.
Perhaps we can’t use the class name. But we could have a Self or Class symbol that refers to the enclosing class.
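For what it's worth, the comments below mention a quoted-name workaround, and a minimal sketch of it looks like this. Writing the class name as a string in the annotation defers the lookup, so the class body runs even though Node is not yet bound. The Node class here is an illustrative stand-in for the parser's class, and the _subs list is an assumption of mine rather than the real implementation.

```python
from typing import List, Sequence


class Node:
    """Illustrative tree node; not the actual MDL parser class."""

    def __init__(self) -> None:
        # 'Node' in quotes is a forward reference the type checker resolves later.
        self._subs: List['Node'] = []

    def add_sub(self, sub: 'Node') -> None:
        self._subs.append(sub)

    def get_subs(self) -> Sequence['Node']:
        return self._subs


root = Node()
root.add_sub(Node())

# At runtime the annotation is stored as the plain string, so nothing
# needs to look up the name Node while the class is being defined.
print(Node.add_sub.__annotations__['sub'])  # Node
print(len(root.get_subs()))                 # 1
```

It is still extra punctuation, but it avoids the separate TypeVar declaration.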
No Inferred Return Types
One of the great values of Python is not having to put types everywhere. You can write functions like below.
def get_value():
    return 123
Now, if you’re using TypeScript or C++ the compiler can happily infer the return type of functions. For unknown reasons, mypy chooses not to infer the return types of functions. Instead, if there is no type annotation it assumes the function returns type Any.
This means I must annotate all functions with information the static type checker already knows. It’s redundant and messy.
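To make the consequence concrete, here is a small sketch. With default settings mypy treats the result of the unannotated get_value as Any, so the bad call in misuse passes the type check and only fails at runtime. get_value_annotated and misuse are hypothetical names added for contrast.

```python
def get_value():
    # No annotation: mypy treats the return type as Any.
    return 123


def get_value_annotated() -> int:
    # Annotated: mypy knows callers receive an int.
    return 123


def misuse() -> str:
    # Any permits every attribute, so the checker has nothing to flag here,
    # even though an int has no upper() method.
    return get_value().upper()


try:
    misuse()
    failed = False
except AttributeError:
    failed = True

print(failed)  # True
```

Swapping get_value_annotated() into misuse should make mypy report the missing attribute before the code ever runs, which is exactly the information it already had available.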
You’re additionally forced to learn the names and structure of all types. Ones you could otherwise safely ignore.
def get_iter():
    return iter(sequence)
def get_closure(self):
    return lambda q : self.op(q)
Why should I have to know the type that iter returns to write this function? Or do you have any idea what type get_closure returns? I know how to use the return, and can even reason it’s a function, but I’d have no idea how to specify its type. Knowing the myriad of types isn’t feasible. You’ll end up spending more time trying to tweak types than using the code.
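For reference, here is roughly what the explicit annotations could look like using the typing module. This is a sketch under assumptions: the sequence holds ints, op maps an int to an int, and the Holder class is a hypothetical stand-in for whatever object owns get_closure.

```python
from typing import Callable, Iterator, List

sequence: List[int] = [1, 2, 3]


def get_iter() -> Iterator[int]:
    # Iterator[int] describes what callers get; the concrete iterator
    # class that iter() returns never has to be named.
    return iter(sequence)


class Holder:
    def op(self, q: int) -> int:
        return q * 2

    def get_closure(self) -> Callable[[int], int]:
        # Callable[[argument types], return type] describes the lambda.
        return lambda q: self.op(q)


print(list(get_iter()))             # [1, 2, 3]
print(Holder().get_closure()(21))   # 42
```

It works, but it does mean knowing that Iterator and Callable exist and how their parameters are spelled, which is the overhead in question.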
This complexity helped drive the introduction of the auto keyword to C++. There are many situations where writing the type information isn’t workable. This is especially true when dealing with parametric container classes.
Inferring return types is an essential feature.
Avoiding it for now
These two problems repeat throughout my codebase. I’m okay when there’s a limitation that occasionally affects the code, but this is fundamental. To use type checking, I’d have to add the redundant class declarations to every container-like class. To use type checking at all, I’d have to annotate the return value of all functions.
Static type checking should not be a tradeoff and there’s no fundamental reason these limitations can’t be lifted. When these are fixed, I’ll happily come back and use type annotations.
Image Credit: Mari Carmen
Oldest comments (10)
I'm not familiar with mypy, but the self-referential problem can be solved with PEP 563's from __future__ import annotations, which won't be needed in Python 4.
I will check this out, since it'd shift the balance in the favour of sticking with Python (my head is running towards C++, perhaps foolishly).
Adding on to @victorosilva's comment, the annotations import is only available on Python 3.7+. For Python 3.6 and below, the workaround to the self-referential issue is to put the class name in single quotes, as discussed here:
Oh, I guess I'll try this approach first, since Ubuntu is on Python 3.6.8.
I also highly recommend conda environments; it is really nice to have per-project Python version and dependency specifications without messing with system Python.
Python has venv so you can just run python -m venv {path}.
Thanks for sharing this.
I've been meaning to experiment with python type annotations for a while but, based on the pain you're experiencing, I'll stay away for a while longer.
Thanks for the detailed article and the nice examples - it allows for an interesting discussion.
Here's my take on the subject (disclaimer: I've been using mypy in all my "serious" projects for a little more than a year now)
First, maybe you missed the fact that mypy has a reveal_type function precisely so that you don't have to remember and/or guess all the types.
I know, this is a bit awkward to use, but it's there if you need it.
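A minimal sketch of how that might look, reusing get_iter from the article with an assumed module-level sequence. reveal_type is only understood by the type checker, so it is kept behind TYPE_CHECKING here to leave the script runnable. The exact wording of mypy's note varies by version, so it is not reproduced.

```python
from typing import TYPE_CHECKING

sequence = [1, 2, 3]


def get_iter():
    return iter(sequence)


if TYPE_CHECKING:
    # Seen only by mypy, which prints a note with the inferred type.
    # The body expression is revealed rather than get_iter(), because a
    # call to an unannotated function is reported as Any.
    reveal_type(iter(sequence))

print(list(get_iter()))  # [1, 2, 3]
```

The revealed type can then be copied into the return annotation.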
Second, I'm not quite sure why mypy does not infer return types - maybe it's a bug, maybe it's a design decision. The question came up for Rust by the way and I think the rationale may apply to Python too.
Finally, if you're still wondering whether to give mypy a go, I have compiled a real-world list of changes caused by the transition to static typing in this rather long blog post. You may find it interesting :)
Cheers!
Thank you.
I'm going to give it another go soon. The issue about self-referential types is essentially fixed in an upcoming Python release, and I can import from future now.
The return types are still annoying, but the reveal types will help. I got another comment that mypy may add an option to allow inferred return types.
At the moment, doing a refactoring, I'm leaning towards typing all the return types being the lesser of evils -- when compared to hunting down type mismatches.
I haven't used it yet, but MonkeyType could help with that repetitive return value typing.