NVIDIA's cuda-python, the official Python bindings for the CUDA toolkit, recently added automatically generated .pyi stub files using stubgen-pyx. Their description of why:
"This allows IDE auto-completion to work (which is also used by IDE-integrated coding agents). This has also found 2 real bugs in our code already. The ability to catch a certain class of bugs with this will be really helpful going forward, especially since our linting abilities with cython-lint are a bit behind what they are in pure Python."
When you commit .pyi stubs alongside a Cython extension, you get an artifact your normal Python linting and type-checking pipeline can analyze. Inconsistencies between what the Cython source does and what the stub claims become visible. NVIDIA found two real bugs this way before they were reported.
I'm the author of stubgen-pyx, and this post is a technical walk-through of how those stubs are produced.
The problem with existing stub generators
When you compile a Cython module, the source disappears. What you get is a .so (or .pyd) file: a compiled extension with no type information readable by a language server. Tools like mypy's stubgen can generate stubs for these by importing the compiled binary and using runtime introspection.
The results are usually disappointing. Take this typed Cython module:
"""Mathematical utilities for scientific computing."""
cdef class Matrix:
    """A simple matrix class."""
    cdef int rows
    cdef int cols
    def __init__(self, int rows, int cols):
        """Initialize a matrix."""
        self.rows = rows
        self.cols = cols
    def shape(self) -> tuple[int, int]:
        """Get matrix dimensions."""
        return (self.rows, self.cols)
    cpdef scale(self, double factor):
        """Scale all elements."""
        pass
    cdef int _validate(self):
        """Internal validation (not exposed)."""
        return 0
def matrix_product(Matrix a, Matrix b) -> Matrix:
    """Compute matrix product."""
    return Matrix(a.rows, b.cols)
Running stubgen on the compiled binary produces:
import _cython_3_2_4
matrix_product: _cython_3_2_4.cython_function_or_method
class Matrix:
    def __init__(self, *args, **kwargs) -> None: ...
    def scale(self, *args, **kwargs): ...
    def shape(self, *args, **kwargs): ...
    def __reduce__(self): ...
    def __reduce_cython__(self, *args, **kwargs): ...
    def __setstate_cython__(self, *args, **kwargs): ...
The type annotations you wrote on every argument don't survive into the compiled binary in a form introspection can recover. You get missing docstrings and untyped *args, **kwargs everywhere.
stubgen-pyx takes a different approach: it never touches the compiled binary. It reads the Cython source directly, parses it using Cython's own compiler internals, and extracts the type annotations and documentation the author wrote. The output for the same module:
# This file was generated by stubgen-pyx
"""Mathematical utilities for scientific computing."""
from __future__ import annotations
class Matrix:
    """A simple matrix class."""
    def scale(self, factor: float):
        """Scale all elements."""
    def __init__(self, rows: int, cols: int):
        """Initialize a matrix."""
    def shape(self) -> tuple[int, int]:
        """Get matrix dimensions."""
def matrix_product(a: Matrix, b: Matrix) -> Matrix:
    """Compute matrix product."""
Using it
pip install stubgen-pyx
stubgen-pyx ./your_package
Or programmatically:
from stubgen_pyx import StubgenPyx
from stubgen_pyx.config import StubgenPyxConfig
config = StubgenPyxConfig(
    continue_on_error=True,  # don't abort on one bad file
    include_private=True,  # include _private functions
)
stubgen = StubgenPyx(config=config)
results = stubgen.convert_glob("src/**/*.pyx")
For library maintainers, the recommended pattern is to run stubgen-pyx as a step in your release CI and commit the generated .pyi files alongside your compiled extension. Users get full IDE support without generating stubs themselves.
The five-stage pipeline
The full flow from .pyx source to .pyi output:
- Preprocessing - normalize source so Cython's parser reports accurate line numbers
- AST parsing - feed to Cython's own parse_from_strings
- Visitor analysis - walk the AST collecting functions, classes, enums, imports
- Conversion - map raw AST nodes to intermediate PyiElement dataclasses
- Building + postprocessing - emit stub text, normalize types, trim imports
Stage 1: Preprocessing
The Cython compiler does not report accurate line numbers for certain node types, so the source is normalized first; without this workaround, lines could not be copied accurately from the source into the resulting .pyi file. This matters especially for carrying over function/class decorators and assignment statements. The preprocessor runs six sequential string transforms, each using Python's tokenize module:
- replace_tabs_with_spaces - leading tabs to 4 spaces
- remove_comments - strips all comment tokens, replacing with spaces
- collapse_line_continuations - backslash + newline to space
- remove_contained_newlines - removes newlines inside brackets/parens/braces, tracked with a token-based bracket stack
- expand_colons - splits def f(): return 0 onto proper new lines; only colons that open blocks
- expand_semicolons - same treatment for semicolons
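To make the bracket-newline step concrete, here is a minimal sketch built on the standard-library tokenize module. The function name join_bracketed_lines is hypothetical and a plain depth counter stands in for the bracket stack, so treat it as an illustration of the idea rather than stubgen-pyx's actual code.

```python
import io
import tokenize

OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}

def join_bracketed_lines(source: str) -> str:
    """Replace newlines that occur inside open brackets with a single space."""
    lines = source.splitlines(keepends=True)
    depth = 0  # how many brackets are currently open
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP:
            if tok.string in OPENERS:
                depth += 1
            elif tok.string in CLOSERS:
                depth -= 1
        elif tok.type == tokenize.NL and depth > 0:
            # NL is a non-logical newline; inside brackets it can be joined.
            row = tok.start[0] - 1
            lines[row] = lines[row].rstrip("\r\n") + " "
    return "".join(lines)

joined = join_bracketed_lines("f(a,\n  b)\nx = 1\n")
```

Working on tokens rather than raw characters means a bracket inside a string literal or a comment is never counted.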
One tricky detail: # type: int comments (PEP 484 style) are extracted before comments are stripped, and their line numbers are adjusted to account for lines removed by the bracket-newline collapsing step.
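The line-number adjustment can be sketched as follows. Both helper names are hypothetical, and the sketch assumes the collapsing step reports which rows had their trailing newline removed; the real bookkeeping in stubgen-pyx may differ.

```python
import bisect
import io
import re
import tokenize

TYPE_COMMENT = re.compile(r"#\s*type:\s*(.+)")

def collect_type_comments(source: str) -> dict:
    """Map 1-based line number to the text of each '# type:' comment."""
    found = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            match = TYPE_COMMENT.match(tok.string)
            if match:
                found[tok.start[0]] = match.group(1).strip()
    return found

def shift_lines(comments: dict, joined_rows: list) -> dict:
    """Renumber comments after the given rows were merged into the row below.

    joined_rows must be sorted; each entry is a 1-based row whose newline
    was removed, so every later row moves up by one.
    """
    return {
        row - bisect.bisect_left(joined_rows, row): text
        for row, text in comments.items()
    }

src = "x = [\n    1,\n]  # type: list\ny = 0  # type: int\n"
comments = collect_type_comments(src)
shifted = shift_lines(comments, [1, 2])  # rows 1 and 2 end inside the bracket
```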
Stages 2-3: Parsing and visitor analysis
After preprocessing, the source goes to Cython's own parse_from_strings, so stubgen-pyx receives the same AST a compilation would produce, with all type information intact.
Four visitor classes then walk the AST, all extending Cython's TreeVisitor:
- ModuleVisitor - top-level entry point, delegates to others
- ScopeVisitor - collects functions, classes, enums, assignments, cdef variables
- ClassVisitor - ScopeVisitor with in_class=True
- ImportVisitor - collects all import and cimport statements
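The delegation between a scope visitor and a class visitor can be illustrated with Python's standard ast.NodeVisitor standing in for Cython's TreeVisitor. The ScopeCollector class below is a hypothetical analogue operating on plain Python source, not the stubgen-pyx classes themselves.

```python
import ast

class ScopeCollector(ast.NodeVisitor):
    """Collects the functions and classes declared directly in one scope."""

    def __init__(self, in_class: bool = False):
        self.in_class = in_class
        self.functions = []
        self.classes = {}

    def visit_FunctionDef(self, node):
        # Record the function; its body is irrelevant to a stub, so stop here.
        self.functions.append(node.name)

    def visit_ClassDef(self, node):
        # Hand the class body to a nested collector flagged as in-class.
        inner = ScopeCollector(in_class=True)
        for stmt in node.body:
            inner.visit(stmt)
        self.classes[node.name] = inner

source = """
class Matrix:
    def shape(self): ...

def matrix_product(a, b): ...
"""
collector = ScopeCollector()
collector.visit(ast.parse(source))
```

After the walk, collector.functions holds only the module-level function, and the methods live on the nested collector stored under collector.classes["Matrix"].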
The key design decision is what gets collected and what gets ignored. ScopeVisitor.visit_CFuncDefNode checks node.declarator.overridable, a boolean that separates cpdef (Python-callable) from pure cdef (invisible to Python importers):
def visit_CFuncDefNode(self, node):
    if not node.declarator.overridable:
        return node  # pure cdef: not Python-visible, drop it
    self.cdef_functions.append(node)
    return node
Public cdef class attributes (cdef public int count) are also collected since those are exposed to Python callers.
ImportVisitor has one notable special case: it passes through if TYPE_CHECKING: blocks. Any imports your Cython file guards behind TYPE_CHECKING, a common pattern for avoiding circular imports, still get picked up and included in the stub.
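A rough equivalent on plain Python source, using the standard ast module, looks like this. The function names are hypothetical, and treating every other if block as opaque is an assumption of the sketch rather than a documented behaviour of ImportVisitor.

```python
import ast

IMPORT_NODES = (ast.Import, ast.ImportFrom)

def is_type_checking(test: ast.expr) -> bool:
    """True for 'TYPE_CHECKING' and for 'typing.TYPE_CHECKING'."""
    if isinstance(test, ast.Name):
        return test.id == "TYPE_CHECKING"
    if isinstance(test, ast.Attribute):
        return test.attr == "TYPE_CHECKING"
    return False

def collect_imports(source: str) -> list:
    """Collect top-level imports, looking inside 'if TYPE_CHECKING:' blocks."""
    found = []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, IMPORT_NODES):
            found.append(ast.unparse(stmt))
        elif isinstance(stmt, ast.If) and is_type_checking(stmt.test):
            found.extend(
                ast.unparse(inner)
                for inner in stmt.body
                if isinstance(inner, IMPORT_NODES)
            )
    return found

source = """
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from mypkg.core import Matrix
if DEBUG:
    import pdb
"""
imports = collect_imports(source)
```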
Stage 4: Signature extraction and the PyiElement intermediate representation
The Converter class turns each collected AST node into a PyiElement dataclass. This intermediate representation sits between the Cython AST and the output text: it holds a function or class's name, argument list, return type, docstring, and decorators in a neutral form that the postprocessing stages can work with without knowing anything about Cython's AST structure. This boundary is what makes it straightforward to add a new postprocessing pass without touching the parser.
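As an illustration of what such an intermediate representation might look like, here is a small dataclass for a function. The class name FunctionElement, its fields, and its render method are all hypothetical; the real PyiElement hierarchy is richer than this.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FunctionElement:
    """AST-independent description of one function in a stub."""
    name: str
    args: List[str] = field(default_factory=list)
    return_type: Optional[str] = None
    docstring: Optional[str] = None
    decorators: List[str] = field(default_factory=list)

    def render(self) -> str:
        lines = [f"@{decorator}" for decorator in self.decorators]
        returns = f" -> {self.return_type}" if self.return_type else ""
        header = f"def {self.name}({', '.join(self.args)}){returns}:"
        if self.docstring:
            # A docstring alone is a valid function body in a stub.
            lines += [header, f'    """{self.docstring}"""']
        else:
            lines.append(f"{header} ...")
        return "\n".join(lines)

element = FunctionElement(
    name="matrix_product",
    args=["a: Matrix", "b: Matrix"],
    return_type="Matrix",
    docstring="Compute matrix product.",
)
text = element.render()
```

Because the element holds only strings, a later pass can rewrite a type name or drop a decorator without ever seeing a Cython node.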
The core of conversion is get_signature, which handles both def and cdef/cpdef nodes. They have different declarator structures in the AST.
For argument types there are two sources. Python-style annotations (def f(x: int)) are read from arg.annotation.string.value. C-style Cython typing (cdef int x) is extracted from arg.base_type via extract_type_from_base_type, which traverses the base type node to reconstruct dotted module paths.
Positional-only and keyword-only markers are preserved: the signature builder emits / and * in the right places based on num_posonly_args and num_kwonly_args from the AST.
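The marker placement can be sketched with a small helper. The name format_params is hypothetical, and the sketch assumes the argument list is ordered positional-only first and keyword-only last, with no *args or **kwargs present (a *args parameter would replace the bare *).

```python
def format_params(names, num_posonly_args=0, num_kwonly_args=0):
    """Join parameter names, inserting '/' and '*' where the counts require."""
    split = len(names) - num_kwonly_args
    positional, kwonly = names[:split], names[split:]

    parts = list(positional[:num_posonly_args])
    if num_posonly_args:
        parts.append("/")  # everything before this is positional-only
    parts += positional[num_posonly_args:]
    if kwonly:
        parts.append("*")  # everything after this is keyword-only
        parts += kwonly
    return ", ".join(parts)

signature = format_params(["a", "b", "c", "d"], num_posonly_args=2, num_kwonly_args=1)
```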
Enums get special treatment. A cpdef enum (create_wrapper=True) becomes a proper class with int-typed attributes. A plain cdef enum becomes MyEnum = int, which is accurate since C enums are just integers at the Python boundary.
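The two outcomes can be sketched as a single rendering function. The name render_enum and the exact output layout are assumptions made for illustration; the stub text stubgen-pyx actually emits may be formatted differently.

```python
def render_enum(name, members, create_wrapper):
    """Render a Cython enum as stub text.

    create_wrapper=True models a cpdef enum, which is visible from Python
    as a class; False models a plain cdef enum, which is only an int there.
    """
    if not create_wrapper:
        return f"{name} = int"
    body = "\n".join(f"    {member}: int" for member in members) or "    ..."
    return f"class {name}:\n{body}"

wrapped = render_enum("Color", ["RED", "GREEN"], create_wrapper=True)
plain = render_enum("Color", ["RED", "GREEN"], create_wrapper=False)
```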
Stage 5: Postprocessing
The raw generated stub, a collection of PyiElement values serialized to text, goes through several passes, each operating on a Python ast.AST.
Type name normalization
Cython has its own type vocabulary that means nothing to Python type checkers. The normalizer maps every Cython-specific name to its Python equivalent via an ast.NodeTransformer:
| Cython type | Python equivalent |
|---|---|
| bint | bool |
| unicode | str |
| void | None |
| char, short, long, int8_t...uint64_t, Py_ssize_t, size_t... | int |
| double, longdouble | float |
| doublecomplex, floatcomplex... | complex |
So cpdef uint32_t compute(int64_t x) becomes def compute(x: int) -> int in the stub. All integer widths collapse to int. Callers who care about uint32_t vs int64_t distinctions won't see them, but Python has no equivalent types, so int is the honest annotation.
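A minimal version of this pass can be written with the standard ast module (ast.unparse needs Python 3.9 or later). The mapping below covers only a handful of names from the table, and the class name TypeNameNormalizer is hypothetical; it is a sketch of the technique, not the shipped normalizer.

```python
import ast

# Partial mapping, taken from the table above.
CYTHON_TO_PYTHON = {
    "bint": "bool",
    "unicode": "str",
    "void": "None",
    "char": "int", "short": "int", "long": "int",
    "int64_t": "int", "uint32_t": "int", "Py_ssize_t": "int", "size_t": "int",
    "double": "float", "longdouble": "float",
    "doublecomplex": "complex", "floatcomplex": "complex",
}

class TypeNameNormalizer(ast.NodeTransformer):
    """Rewrite Cython type names to their Python equivalents."""

    def visit_Name(self, node):
        python_name = CYTHON_TO_PYTHON.get(node.id)
        if python_name is None:
            return node
        if python_name == "None":
            # None is a constant in the Python AST, not a name.
            return ast.copy_location(ast.Constant(value=None), node)
        return ast.copy_location(ast.Name(id=python_name, ctx=node.ctx), node)

raw_stub = (
    "def compute(x: int64_t) -> uint32_t: ...\n"
    "def flag(v: bint) -> void: ...\n"
)
tree = TypeNameNormalizer().visit(ast.parse(raw_stub))
normalized = ast.unparse(ast.fix_missing_locations(tree))
```

This sketch renames every matching name in the file, which is acceptable for a stub because function bodies are empty and names appear almost exclusively in annotations.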
Import trimming
The stub begins with all imports from the original .pyx, but many will be cimports of C headers with no Python-side meaning. They exist to pull in C type definitions the Cython compiler needs but that don't exist as importable Python modules. The trimmer collects every name actually used in type annotations and signatures, then removes any import that doesn't feed a used name.
The cimport keyword itself is rewritten to import by PyiImport.__post_init__, via a simple regex substitution. A cimport that survives trimming becomes a standard Python import in the stub.
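Both steps can be approximated in a few lines with re and ast. The function names are hypothetical, and the sketch applies the keyword rewrite to the whole stub text before trimming, which is a simplification of how PyiImport handles it per import.

```python
import ast
import re

def cimport_to_import(text: str) -> str:
    """Rewrite the Cython-only keyword so the text is valid Python."""
    return re.sub(r"\bcimport\b", "import", text)

def trim_unused_imports(stub_source: str) -> str:
    """Drop imports whose bound names never appear elsewhere in the stub."""
    tree = ast.parse(stub_source)
    # Every bare name in the stub, e.g. "np" from the annotation "np.ndarray".
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    kept = []
    for stmt in tree.body:
        if isinstance(stmt, ast.ImportFrom) and stmt.module == "__future__":
            kept.append(stmt)  # never trimmed: it affects how the file is read
            continue
        if isinstance(stmt, (ast.Import, ast.ImportFrom)):
            # An alias binds its asname, else the first dotted component.
            stmt.names = [
                alias for alias in stmt.names
                if (alias.asname or alias.name.split(".")[0]) in used
            ]
            if not stmt.names:
                continue
        kept.append(stmt)
    tree.body = kept
    return ast.unparse(tree)

stub = cimport_to_import(
    "cimport numpy as np\n"
    "from libc.math cimport sqrt\n"
    "def norm(x: np.ndarray) -> float: ...\n"
)
trimmed = trim_unused_imports(stub)
```

Here the numpy import survives because np appears in an annotation, while the libc.math import disappears because sqrt is never referenced in the stub.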
Deduplication and sorting
Identical import statements are merged. Imports are sorted in isort-style order: from __future__ first, then stdlib, then third-party, then local.
What it doesn't handle
Memory views (double[::1], int[:, :]) fall back to untyped annotations in some cases. The base type extraction handles CSimpleBaseTypeNode and basic TemplatedTypeNode, but the multi-dimensional slice syntax is a more complex AST node.
Fused types (ctypedef fused numeric: int | double) have basic support. The stub includes the fused type name as-is.
Getting 95% of signatures correctly typed is more useful than attempting full coverage and producing incorrect stubs.
Architecture
The codebase is split into six subpackages (parsing, analysis, conversion, models, builders, postprocessing), each with a single responsibility. The PyiElement intermediate representation decouples the Cython AST from the output text, making it easy to add postprocessing passes or extend the type normalization without touching unrelated code.
If this was useful, you can sponsor me on GitHub or buy me a coffee to support my work. Feedback and contributions welcome at github.com/jon-edward/stubgen-pyx