
Cristina Ferrer for Cardamom Code

Posted with Dag Brattli • Originally published at cardamomcode.dev

Python Type Annotations (part 2)

Python Type Annotations is a tutorial in 3 parts: Part 1 | Part 2 (this post) | Part 3

Generics

Generics provide a powerful way to create flexible and reusable code components that can work with multiple data types while maintaining type safety.

They are a way to defer the specification of types until you actually need to use them in your code. In other words, generics allow you to write a function or a class that can work with any data type.

If Python is a dynamically typed language, and already works with any type, why do we need generics?

Sometimes you want functions to handle different types in a consistent way. Consistent here means that we are locking the types of two or more type parameters. If we set the type of one parameter to be int, all other uses of that type parameter must also be int. It's like telling Python, "Hey, whatever type you choose, stick with it throughout this function!"

For Python 3.12 and later versions, generic type parameters can be defined by using square brackets after the name of the function, for example [T]. This type parameter can then be used both in the function signature, and in the body of the function.

def identity[T](value: T) -> T:
    return value

This generic identity function locks the return type T to be the same as the input value type T. If we pass an int, the return type will then also have to be an int. Similarly, if we pass a str, the return type will be a str.

We can now use the function with any type such as int, str, etc:

a: int = identity(10)
b: str = identity("test")

Note: Before Python 3.12, generic type variables were defined using TypeVar from the typing module.

Similar to private variables in a Python module, we usually prepend a _ to the name of the type variable to indicate that it is private and not to be used outside the module.

from typing import TypeVar  # noqa: E402

_T = TypeVar("_T")


def identity_(value: _T) -> _T:
    return value

In the next example, we create a filter function that filters a list of a given data type [T]. The function takes two arguments: a predicate function that filters the list, and the source list to be filtered:

from collections.abc import Callable  # noqa: E402
from typing import reveal_type  # noqa: E402


def filter[T](predicate: Callable[[T], bool], source: list[T]) -> list[T]:
    return [x for x in source if predicate(x)]

If we pass in a list[int], it will expect a predicate function that takes an int and returns a bool. The result of the function will also be a list[int]:

xs: list[int] = [1, 2, 3]
ys = filter(lambda x: x > 2, xs)
reveal_type(ys)  # Type of "ys" is "list[int]"

Similarly, if we pass in a list[str], the predicate function will take a str and return a bool. The result of the function will also be a list[str]:

xs_: list[str] = ["a", "b", "c"]
ys_ = filter(lambda x: len(x) > 1, xs_)
reveal_type(ys_)  # Type of "ys_" is "list[str]"

Note that generic types are only really useful when the declared type parameters are used more than once. In the function below, the type parameter T is only used once, so the type is not locked to any other argument, or to the return type in the function.

def length[T](xs: list[T]) -> int:
    return sum(1 for _ in xs)

When a type variable is used only once, it is equivalent to using Any, as the function will accept any type. In such cases, the type checker cannot provide any meaningful validation.

from typing import Any  # noqa: E402


def length_(xs: list[Any]) -> int:
    return sum(1 for _ in xs)

Generic classes

We can also create generic classes by using class type parameters. This will make the class reusable with different types. In the following example, we create a Stack class that can be used with different types of data.

class Stack[T]:
    def __init__(self, initial_values: list[T] | None = None) -> None:
        self.items: list[T] = initial_values or []

    def push(self, item: T) -> None:
        self.items.append(item)

    def pop(self) -> T:
        return self.items.pop()


int_stack = Stack([1, 2, 3])
int_stack.push(1)
int_stack.push(2)

str_stack = Stack[str]()
str_stack.push("hello")
str_stack.push("world")

Note: Before Python 3.12, generic classes required defining a TypeVar and inheriting from Generic[_T] to declare a generic type parameter _T. This approach is still valid but in most cases no longer needed. The exception is if we want to specify variance, which is not possible with the new syntax. We will explore variance in Part 3.

from typing import Generic  # noqa: E402

_D = TypeVar("_D")


class Stack_(Generic[_D]):
    def __init__(self, initial_values: list[_D] | None = None) -> None:
        self.items: list[_D] = initial_values or []

    def push(self, item: _D) -> None:
        self.items.append(item)

    def pop(self) -> _D:
        return self.items.pop()

Generic Type Aliases

Type aliases in Python provide a way to create alternate names for complex types, making your code more readable and maintainable. They reduce redundancy and add semantic meaning to your type annotations.

In Python 3.12, you can use the new type keyword to define aliases:

type UserScores = dict[str, list[int]]

We can make type aliases generic by using square brackets after the name of the type alias similar to how we make generic functions and classes.

type PropertyBag[T] = dict[str, T]

# We can now use the generic type alias to create a more specific non-generic type
type StringBag = PropertyBag[str]
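
These aliases can then be used anywhere a regular annotation is expected; the variable names below are just hypothetical examples:

settings: PropertyBag[int] = {"timeout": 30, "retries": 3}
labels: StringBag = {"env": "production", "region": "eu-north"}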

Note: Before Python 3.12, type aliases were defined using the TypeAlias feature from the typing module.

from typing import TypeAlias, TypeVar, override  # noqa: E402

_B = TypeVar("_B")

PropertyBag_: TypeAlias = dict[str, _B]  # noqa: UP040

StringBag_: TypeAlias = PropertyBag_[str]  # noqa: UP040

How to restrict Generic Type Parameters

Generic types can be restricted to work with specific types or their subclasses:

  • Using a bound type parameter: It sets an upper bound for the type. This means the type can only be the specified type or a subtype of it. For example, to create a generic type that accepts float or any type the type checker considers compatible with float (such as int), you can use a bounded type parameter [T: float].

  • Using a constrained type parameter: It limits the type to a specific set of two or more types, without including their subclasses. For example, to create a generic type that accepts only int or str, you can use a constrained type parameter: [T: (int, str)]. Both forms are shown in the sketch after this list.
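
Here is a minimal sketch of both forms with the Python 3.12 syntax (the function names clamp and combine are hypothetical):

def clamp[T: float](value: T, limit: T) -> T:
    # Bound type parameter: T may be float or anything compatible with it, such as int
    return value if value < limit else limit


def combine[T: (int, str)](a: T, b: T) -> T:
    # Constrained type parameter: T must be exactly int or exactly str
    return a + b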


Note: Before Python 3.12, bound and constrained generic types were defined in the TypeVar definition.

from typing import TypeVar  # noqa: E402

# Bound type parameter
T = TypeVar("T", bound=float)

# Constrained type parameter
S = TypeVar("S", int, str)

Why to restrict Generic Type Parameters

At first, we might think that unions, overloads, or base classes could achieve the same functionality as restricted generic type parameters. However, as we will see, none of them behaves quite the same way.

To explore this, let's define a data model for some animals and their subclasses:

class Animal:
    """A base class for animals"""

    def __init__(self, name: str) -> None:
        self.name = name

    def feed(self) -> str:
        return "Animal is eating"


class Rabbit(Animal):
    """A subclass of Animal"""

    def run(self) -> str:
        return "Rabbit is running"

    @override
    def feed(self) -> str:
        return "Rabbit is eating"


class Fox(Animal):
    """A subclass of Animal"""

    def say(self) -> str:
        return "Ring-ding-ding-ding!"

    @override
    def feed(self) -> str:
        return "Fox is eating"


class RedFox(Fox):
    """A subclass of Fox"""

    def go_nuts(self) -> str:
        return "Going nuts"


fox = Fox("Tod")
rabbit = Rabbit("Harvey")
red_fox = RedFox("Foxy")

In the following examples, we'll use these classes to illustrate the differences between using unions, overloads, base classes, constrained generic types and bound generic types. The RedFox subclass will highlight potential problems when working with subclasses and generics.


Note: The @override decorator is used to indicate that a method overrides a base class method. This is useful for static type checkers to verify the method actually overrides a parent method.


Using unions

For the first example, let's create a function that takes either a Fox or a Rabbit as an argument, and returns Fox or Rabbit as the output type. We will then try to call some of the methods on the returned object to see what happens.

def care_for_animal(animal: Fox | Rabbit) -> Fox | Rabbit:
    animal.feed()
    ...
    return animal


care_for_animal(rabbit).run()  # Error: Cannot access attribute "run" for class "Fox"
care_for_animal(fox).say()  # Error: Cannot access attribute "say" for class "Rabbit"
care_for_animal(red_fox).go_nuts()  # Error: Cannot access attribute "go_nuts" for class "Fox"

In the example above, we can never be sure whether the output of care_for_animal will be a Fox or a Rabbit. Therefore, we need to use type narrowing, such as a match statement, to check the type of the result. This adds a lot of noise to our code, since every result must be type-checked before use.

result = care_for_animal(rabbit)
reveal_type(result)  # Type of "result" is "Rabbit | Fox"

match result:
    case Rabbit():
        result.run()
    case Fox():
        result.say()

As an alternative, we can try using overloads or a constrained generic type parameter.

Using overloads

Function overloads allow you to specify multiple type signatures for a single function. They're especially useful when a function can accept different parameter types or combinations of parameters. Overloads are explained in more detail in the next section.

Let's define different signatures for the care_for_animal function:

from typing import overload  # noqa: E402


@overload
def care_for_animal_overloads(animal: Rabbit) -> Rabbit: ...


@overload
def care_for_animal_overloads(animal: Fox) -> Fox: ...


def care_for_animal_overloads(animal: Fox | Rabbit) -> Fox | Rabbit:
    animal.feed()
    ...
    return animal


care_for_animal_overloads(rabbit).run()
care_for_animal_overloads(fox).say()
care_for_animal_overloads(red_fox).go_nuts()  # Error: Cannot access attribute "go_nuts" for class "Fox"

Using overloads is more verbose, but when we call the function with a Rabbit or a Fox, it returns a Rabbit or Fox respectively, allowing us to call their specific methods. However, subclasses like RedFox are treated as their parent class Fox, so we cannot call the go_nuts method as it's only available on the RedFox class.

To address these limitations, we can explore using a constrained generic type.

Using a constrained generic type

Let's make T a constrained type parameter that can only be Rabbit or Fox.

def care_for_animal_constrained[T: (Rabbit, Fox)](animal: T) -> T:
    animal.feed()
    ...
    return animal


care_for_animal_constrained(rabbit).run()
care_for_animal_constrained(fox).say()
care_for_animal_constrained(red_fox).go_nuts()  # Error: Cannot access attribute "go_nuts" for class "Fox"

This approach is less verbose, but it has the same limitation as using overloads. It requires knowing all restricted types upfront and won't work with other animals or even subclasses of the specified animal types. We still cannot call the go_nuts method on the red_fox object because the returned type is Fox and not RedFox.

To address this problem we need to use subclassing of some sort.

Using a base class

Let's try to use the Animal base class as the input type parameter instead of Rabbit and Fox.

def care_for_animal_subclassing(animal: Animal) -> Animal:
    animal.feed()
    ...
    return animal


care_for_animal_subclassing(rabbit).feed()

care_for_animal_subclassing(rabbit).run()  # Error: Cannot access attribute "run" for class "Animal"
care_for_animal_subclassing(fox).say()  # Error: Cannot access attribute "say" for class "Animal"
care_for_animal_subclassing(red_fox).go_nuts()  # Error: Cannot access attribute "go_nuts" for class "Animal"

With subclassing alone we can only feed the animals; calling methods that exist only on the subclasses will not work.

Let's try to use a bound generic type parameter instead.

Using a bound generic type parameter

Bound generic type parameters work with subclasses, so let's try to use a bound generic type parameter with the base class Animal.

def care_for_animal_bound[T: Animal](animal: T) -> T:
    animal.feed()
    ...
    return animal


care_for_animal_bound(rabbit).run()
care_for_animal_bound(fox).say()
care_for_animal_bound(red_fox).go_nuts()

Now the output type matches the input type!

Subclassing with a bound generic parameter solves most of the problems we had earlier. However, it still limits us to using the function only with subclasses of Animal, making it less flexible for other types of animals.
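
To illustrate this limitation, here is a hypothetical Hamster class that has a feed method but does not inherit from Animal. The type checker rejects it because it falls outside the Animal bound:

class Hamster:
    """Hypothetical: implements feed but does not inherit from Animal."""

    def feed(self) -> str:
        return "Hamster is eating"


care_for_animal_bound(Hamster())  # Error: Hamster does not satisfy the upper bound "Animal"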

Let's try using a protocol as the bound generic type.

Using a bound generic protocol

In Part 1, we explored protocols, which allow us to define a set of methods that a class must implement without requiring inheritance from a base class.

We can create a protocol that requires the feed method to be implemented, and use this protocol as a bound generic type parameter.

from typing import Protocol  # noqa: E402


class CareFor(Protocol):
    def feed(self) -> str: ...


def care_for_animal_protocol[T: CareFor](animal: T) -> T:
    animal.feed()
    return animal


care_for_animal_protocol(rabbit).run()
care_for_animal_protocol(fox)

The care_for_animal_protocol function now supports static duck typing. It will work with any type that implements the feed method, allowing us to use it with any animal, not just subclasses of Animal:

class Dog:
    def feed(self) -> str:
        return "Dog is eating"


care_for_animal_protocol(Dog()).feed()

care_for_animal_protocol(rabbit).run()
care_for_animal_protocol(fox).say()
care_for_animal_protocol(red_fox).go_nuts()

Using a class that does not implement the feed method will result in a type error:

class Robot: ...


care_for_animal_protocol(Robot())  # Error: "Robot" is incompatible with protocol "CareFor"

More about bound generic protocols

Bound generic parameters combined with protocols are particularly useful when we need to ensure that a generic function or method works with a range of compatible types that share certain characteristics or behaviors, without requiring those types to inherit from a common base class.

Here is another example. We can define a generic function that takes a list of values and returns the largest value. This function can work with any type that implements the __gt__ method, such as int, float, or custom classes that define it.

To achieve this, we will need to create a protocol that requires a type to be comparable using the greater-than operator >. This protocol will then be used as the bound of the generic type T used in the function:

from collections.abc import Iterable  # noqa: E402
from typing import Any, reveal_type  # noqa: E402


class IsGreaterThan(Protocol):
    def __gt__(self, value: Any, /) -> bool: ...


def largest[T: IsGreaterThan](xs: Iterable[T]) -> T:
    return max(xs)


xs: list[int] = [1, 2, 3]
result = largest(xs)
reveal_type(result)  # Type of "result" is "int"

ss: list[str] = ["a", "b", "c"]
result = largest(ss)
reveal_type(result)  # Type of "result" is "str"

We can also use the IsGreaterThan Protocol for a function that takes a list of comparable values. We can pass in either a list of int or a list of str. The return type will correspond to the input type, returning a list of int or str, respectively.

def sort_list[T: IsGreaterThan](items: list[T]) -> list[T]:
    return sorted(items)


numbers = [3, 1, 4, 1, 5, 9]
sorted_numbers = sort_list(numbers)
reveal_type(sorted_numbers)  # Type of "sorted_numbers" is "list[int]"

names = ["Alice", "Bob", "Charlie"]
sorted_names = sort_list(names)
reveal_type(sorted_names)  # Type of "sorted_names" is "list[str]"

Variadic Generics

In the previous section about Generics, we looked at how we can define generic classes using one or more type variables, such as T and U. However, in some cases, we might not know the number of type variables before the class instance or function is created.

Variadic generics allow us to define classes and functions that can take any number of type arguments, similar to tuple arguments such as *args, where both the number of elements and their types depend on the function caller.

Let's explore how we can use variadic generics in classes, functions and callables.

Classes with Variadic Generics

Variadic generics use the star * notation for type variables, such as *Ts, to indicate that the type variable can represent any number of type arguments. The lowercase s is a convention used to indicate that the type variable is plural, showing that it represents multiple type arguments, compared to the commonly used T that represents a single type argument.

In the following example, we create two instances of DataPoint, each with a different number of arguments and types.

from typing import reveal_type


class DataPoint[*Ts]:
    def __init__(self, *values: *Ts) -> None:
        self.values = values


first = DataPoint(True)
reveal_type(first)  # Type of "first" is "DataPoint[bool]"

second = DataPoint(1, "test", 2.3)
reveal_type(second)  # Type of "second" is "DataPoint[int, str, float]"

Note: Before Python 3.12, variadic generics were defined using TypeVarTuple instead of the *Ts syntax. It is imported from the typing module in Python 3.11, and from typing_extensions for versions earlier than 3.11:

from typing import TypeVarTuple  # noqa: E402

_Ts = TypeVarTuple("_Ts")  # Only needed for Python 3.11 and earlier
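
For completeness, here is a minimal sketch of how the DataPoint class above could look with the pre-3.12 syntax, using Generic together with Unpack (available in typing since Python 3.11, and in typing_extensions for earlier versions):

from typing import Generic, Unpack  # noqa: E402


class DataPoint_(Generic[Unpack[_Ts]]):
    def __init__(self, *values: Unpack[_Ts]) -> None:
        self.values = values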

Functions with Variadic Generics

Variadic generics can also be used to define functions that can take any number of arguments and types. This enables us to statically type the so-called *args argument.

Similarly to the previous example, here we define the generic make_datapoint function, which can take any number of arguments and types. The types are inferred from the provided arguments:

def make_datapoint[*Ts](*values: *Ts) -> DataPoint[*Ts]:
    return DataPoint(*values)


point1 = make_datapoint("temperature", 25.5)
point2 = make_datapoint(1, "test", 2.3)

reveal_type(point1)  # Type of "point1" is "DataPoint[str, float]"
reveal_type(point2)  # Type of "point2" is "DataPoint[int, str, float]"

Let's take a look at another example. We can define a head function that can take any number of arguments and types and returns the first element of the tuple. This is achieved by defining the type of the tuple argument as tuple[T, *Ts] where T is the first element and Ts the rest of the elements.

def head[T, *Ts](tup: tuple[T, *Ts]) -> T:
    return tup[0]


x = head((1, "test", 2.3))
reveal_type(x)  # Type of "x" is "int"

Similarly, we can define a tail function that can take any number of arguments and types and return all of them except for the first one.

def tail[T, *Ts](tup: tuple[T, *Ts]) -> tuple[*Ts]:
    return tup[1:]


y = tail((1, "test", 2.3))
reveal_type(y)  # Type of "y" is "tuple[str, float]"

Callables with Variadic Generics

Variadic generics can also be used to annotate callables.

In this example, we define a function apply that uses variadic generics to accept a callable fn and a variable number of arguments of type Ts. The apply function then calls the provided callable with the given arguments and returns the result.

from collections.abc import Callable  # noqa: E402


def apply[*Ts](fn: Callable[[*Ts], int], *args: *Ts) -> int:
    return fn(*args)


a = apply(lambda x, y: x + y, 10, 10)

In the next section, we will also see how to use Parameter Specification (ParamSpec) to define both *args and **kwargs.

Parameter Specification

While variadic generics are useful for handling a variable number of type arguments, they are limited to only handling positional arguments. Parameter Specification is a special way of annotating callables that captures both positional and keyword arguments. It is specifically designed to be used with decorators, allowing us to forward argument types between functions and decorators. We will explore this and other uses in this section.

Type Annotating Decorators

Before parameter specification became available, it was impossible to type-annotate decorators accurately. This is because the decorator function needs to forward the same type of parameters as the function it decorates, which can be positional, named, or both.

In the following example, we define a simple decorator logging_before_paramspec that logs the function call details before executing the function. However, this decorator does not maintain the types of the original function it decorates. Instead, it uses Callable[..., Any] for both the input and output types, which means it accepts and returns any callable. As a result, the type information of the original function is lost.

from collections.abc import Callable
from typing import Any, reveal_type


def logging_before_paramspec(func: Callable[..., Any]) -> Callable[..., Any]:
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        print(f"Calling {func.__name__} with {args} and {kwargs}")
        return func(*args, **kwargs)

    return wrapper


@logging_before_paramspec
def my_func_(a: int, b: str) -> float:
    return a + float(b)


reveal_type(my_func_)  # Type of "my_func_" is "(...) -> Any"

Now let's make use of ParamSpec to annotate the decorator, ensuring that the type information of the original function is preserved:

def logging[**P, R](func: Callable[P, R]) -> Callable[P, R]:
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"Calling {func.__name__} with {args} and {kwargs}")
        return func(*args, **kwargs)

    return wrapper


@logging
def my_func(a: int, b: str) -> float:
    return a + float(b)


reveal_type(my_func)  # Type of "my_func" is "(a: int, b: str) -> float"

ParamSpec is unique because it's not used like a regular type parameter. Instead, we declare a generic type parameter with a double star, e.g. **P. Using P within a function signature Callable[P, R] captures all of the arguments, both positional and keyword, of a callable. You can then use P in another function, or use P.args and P.kwargs to refer to the positional and keyword arguments respectively.


Note: ParamSpec became available in Python 3.10. It's still possible to use ParamSpec in earlier versions, such as 3.9, by importing it from typing_extensions.

from typing_extensions import ParamSpec  # noqa: E402

_P = ParamSpec("_P")
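
As a minimal sketch, the logging decorator above could then be written with the pre-3.12 syntax roughly like this (assuming an extra _R TypeVar for the return type):

_R = TypeVar("_R")


def logging_(func: Callable[_P, _R]) -> Callable[_P, _R]:
    def wrapper(*args: _P.args, **kwargs: _P.kwargs) -> _R:
        print(f"Calling {func.__name__} with {args} and {kwargs}")
        return func(*args, **kwargs)

    return wrapper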

Concatenating Type Parameters

Parameter specification can also be used to concatenate type parameters. This is useful when you want to define a function that takes a variable number of arguments and you need to add or remove some positional arguments between the functions.

In the following example, we use the Concatenate type to concatenate the type parameters of the decorated function with the type parameter of the decorator. This allows us to add one positional argument between the two, enabling us to inject a default value for the format argument of the date2str function.

from datetime import datetime  # noqa: E402
from typing import Any, Concatenate  # noqa: E402


def inject[T, **P, R](__arg: T) -> Callable[[Callable[Concatenate[T, P], R]], Callable[P, R]]:
    def decorator(func: Callable[Concatenate[T, P], R]) -> Callable[P, R]:
        def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
            return func(__arg, *args, **kwargs)

        return wrapper

    return decorator


@inject("%Y-%m-%d")
def date2str(format: str, dt: datetime) -> str:
    return dt.strftime(format)


dt = date2str(datetime.now())  # (function) def date2str(dt: datetime) -> str
print(dt)  # 2024-07-24

The inject decorator is defined with the type variable T, which represents the type of the argument to be injected (like a str format), the parameter specification **P, which captures the remaining parameters of the original function, and the type variable R, which defines the decorated function's return type.

The decorator function takes a callable with concatenated type parameters Concatenate[T, P] and returns a new function that only requires the remaining P arguments while preserving the original return type R.

The wrapper function inside the decorator adds the injected argument __arg as the first positional argument when calling the original function.

Concatenate is not often used in practice, but it's still a useful feature. Check out the overloads for curry_flip in the Expression library for a more practical example.

Overloads

Overloads are useful when you need to define functions that can take different combinations of arguments and perhaps different return types depending on the combinations of arguments.

We can define multiple type signatures by using the @overload decorator from the typing module.

Only the main function should contain the implementation, and it should be annotated to accept all cases (often using Any). The overloads should be specific and should not overlap.

In the following example, we define a pipe function that takes a value and up to three positional arguments, each of which is a callable. Each callable takes the output of the previous one as its input. The return type of the pipe function is the result of applying all the callables to the initial value.

from collections.abc import Callable
from functools import reduce
from typing import Any, overload


@overload
def pipe[_A](value: _A) -> _A: ...


@overload
def pipe[_A, _B](value: _A, fn: Callable[[_A], _B], /) -> _B: ...


@overload
def pipe[_A, _B, _C](
    value: _A,
    fn1: Callable[[_A], _B],
    fn2: Callable[[_B], _C],
    /,
) -> _C: ...


@overload
def pipe[_A, _B, _C, _D](
    value: _A,
    fn1: Callable[[_A], _B],
    fn2: Callable[[_B], _C],
    fn3: Callable[[_C], _D],
    /,
) -> _D: ...


def pipe(value: Any, *fns: Callable[[Any], Any]) -> Any:
    return reduce(lambda acc, fn: fn(acc), fns, value)

Note: The overload decorated definitions are only for the benefit of the type checker. At runtime, only the main function is used. The type checker will use the overload decorated definitions to check if the function is called with the correct arguments and return types.
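
To see how the type checker resolves these overloads, here is a hypothetical call that chains two functions; the inferred result type follows the return type of the last callable:

value = pipe(1, lambda x: x + 1, lambda x: str(x))
reveal_type(value)  # Type of "value" is "str"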


Let's take a look at another example. We can define different overloads for different keyword arguments. In this case, we can use * to explicitly mark the end of the positional arguments.

@overload
def keyword_arguments(value: int, *, name: str, age: int) -> str: ...


@overload
def keyword_arguments(value: int, *, city: str, location: int) -> str: ...


def keyword_arguments(value: int, **kwargs: Any) -> Any:
    """This function is used to demonstrate the use of keyword arguments."""

Why not use union types instead of overloads?

While it might be tempting to use Union types instead of overloads, it can be challenging to get it right. For example, if we have a function with two arguments and each argument can be of different types, Union types cannot specify which combinations of types are valid. This adds the need to validate the arguments before using them.

When using overloads, the type checker will catch invalid combinations of arguments, eliminating the need for manual validation.

Let's demonstrate this with an example. We'll define a function that takes two arguments and returns their sum. First, we'll annotate the function using Union types.

from typing import reveal_type  # noqa: E402


def process_with_union(arg1: int | str, arg2: int | str) -> int | str:
    match arg1, arg2:
        case int(), int():
            return arg1 + arg2
        case str(), str():
            return arg1 + arg2
        case _:
            raise TypeError("Invalid arguments")

There are several issues with the example above. When calling the function we can never be sure if the return type will be int or str:

ret1 = process_with_union(10, 10)
reveal_type(ret1)  # Type of "ret1" is "int | str"

ret2 = process_with_union("hello", " world")
reveal_type(ret2)  # Type of "ret2" is "int | str"

This means that if you try to use the return type directly, the type checker will give an error. To resolve this, you will need to use type narrowing before using the return values:

ret3 = ret1 + 10  # Error: Operator "+" not supported for types "int | str" and "Literal[10]"
ret4 = ret2 + "!"  # Error: Operator "+" not supported for types "int | str" and "Literal['!']"

if isinstance(ret1, int):
    ret5 = ret1 + 10
    reveal_type(ret5)  # Type of "ret5" is "int"

if isinstance(ret2, str):
    ret6 = ret2 + "!"
    reveal_type(ret6)  # Type of "ret6" is "str"

The type checker will also not warn us if we call the function with an invalid combination of arguments. We will only get a TypeError at runtime.

process_with_union("hello", 10)  # type checker does not warn of an error

Let's instead try to use overloads:

@overload
def process_with_overloads(arg1: int, arg2: int) -> int:
    """Process two integers"""


@overload
def process_with_overloads(arg1: str, arg2: str) -> str:
    """Process two strings"""


def process_with_overloads(arg1: Any, arg2: Any) -> Any:
    """Main function"""
    return arg1 + arg2

Now, we get a return type from the function that we can use directly without needing to narrow the type beforehand.

ret7: int = process_with_overloads(10, 10)
reveal_type(ret7)  # Type of "ret7" is "int"

ret8 = process_with_overloads("hello", " world")
reveal_type(ret8)  # Type of "ret8" is "str"

ret7 = ret7 + 10
ret8 = ret8 + "!"

And we also get a type error when calling the function with an invalid combination of arguments.

process_with_overloads("hello", 10)  # Error: No overloads for "process_with_overloads" match the provided arguments

Note: As the example above shows, docstrings can also be added to the overloads instead of the main function to document each overload. If you don't document each overload, the VS Code Python extension will fall back to showing the main function's documentation when hovering over the function name.


Final Notes

Using type annotations will help you write better code that is easier to understand, more maintainable, easier to refactor, and easier to test. It can ensure robustness by catching potential type errors during static type checking and improving the overall coding experience. Additionally, it will make it easier for AI tools to understand your code and assist with code completion, refactoring, adding new features and finding bugs.

However, typing in Python can be a double-edged sword:

  • New releases of mypy and Pyright will break your type checks, and you need to take this into your budget. You will spend time fixing type errors with every new release of the type checker.

  • Many common libraries, e.g. pandas, still do not support typing, so you might have to write stubs for these libraries yourself, or relax type checking when using them by adding # type: ignore comments or by lowering the type checking mode from strict to basic.

  • There will be both false positives and false negatives, a problem you do not have with languages such as Java, Scala, C#, or F#. Prefer writing simple code to reduce the number of false positives and false negatives and the time you spend fixing type errors.

  • Static type checkers are not able to analyze dynamic code that changes at runtime. While mocking is useful for testing, it can potentially bypass or confuse static type checkers.

However, the benefits of using type annotations in Python far outweigh the disadvantages. It will make you a better programmer and make your code more robust and maintainable. If you want to write high quality production grade code, then you should definitely use type annotations in Python!
