# A .NET developer's guide to Python performance — and why the rules are different

*April 2026 · 6 min read*
I've spent most of my career in .NET. C#, the CLR, JIT compilation — these are things I know deeply. I'm proficient in TypeScript and JavaScript for full-stack and mobile work. Python, though, has always been at arm's length: for all its popularity, I never really needed to dig into its ecosystem.
That changed recently. And the first real thing it taught me was something I hadn't expected: don't write your own raw code if you can help it.
That's strange advice if you're coming from C#. In .NET, hand-crafted code and library code run through the same JIT compiler. Performance is often comparable, so you usually optimise based on other considerations — readability, semantics, maintainability. Python, it turns out, works very differently. Understanding why changes how you write it.
## How Python Actually Runs
The version of Python most people use is CPython — the reference implementation, written in C. When you run a .py file, CPython compiles it to bytecode and then interprets that bytecode at runtime. Crucially, it does not use JIT compilation like C# does.
C# runs on the .NET Common Language Runtime, which uses Just-In-Time (JIT) compilation. Your C# source gets compiled to Intermediate Language (IL), and when you run the program the CLR compiles that IL to native machine code, optimised at runtime. After the initial warm-up, it executes at hardware speed.
CPython is interpreting Python bytecode one instruction at a time. The .NET CLR is handing the CPU something close to its native language.
Worth noting: Python 3.11+ introduced an adaptive specialising interpreter — certain hot code paths get micro-optimised at runtime — and Python 3.13 added an experimental JIT, disabled by default. These are meaningful improvements, but CPython still lacks a production-ready general-purpose JIT. The gap with C# remains significant for CPU-bound work.
This is the root of why Python's raw execution speed is generally slower than C# for CPU-intensive tasks.
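You can watch the interpreter's unit of work directly with the standard-library dis module, which disassembles a function into the bytecode instructions CPython steps through one at a time (the exact instructions vary by Python version):

```python
import dis

def add_all(nums):
    total = 0
    for x in nums:
        total += x
    return total

# Prints one line per bytecode instruction: the loop becomes FOR_ITER,
# an add, a store — each dispatched individually by the interpreter
dis.dis(add_all)
```

Every one of those instructions is a round trip through the interpreter's dispatch loop, which is the overhead the rest of this post is about.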
## The C-Backed Library Trick
Here's where Python gets clever. Many of its built-in functions and popular libraries — sum(), sorted(), the entirety of NumPy — are not written in Python. They're implemented in C, compiled to native machine code, and called from Python. When you use them, you're escaping the interpreter overhead for the heavy work.
```python
# Slow: a plain Python loop
total = 0
for x in my_list:
    total += x

# Fast: drops into C for the iteration
total = sum(my_list)
```
Both do the same thing. But the second hands off to a C function that runs without the interpreter touching each iteration. For large lists, the difference is measurable.
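A rough sketch of how you might measure that yourself with the standard-library timeit module. Absolute numbers depend on your machine and Python version, but on large lists the built-in consistently wins:

```python
import timeit

my_list = list(range(1_000_000))

# Time the hand-written loop vs the C-backed built-in, 10 runs each
loop_time = timeit.timeit(
    "total = 0\nfor x in my_list:\n    total += x",
    globals={"my_list": my_list},
    number=10,
)
builtin_time = timeit.timeit(
    "sum(my_list)", globals={"my_list": my_list}, number=10
)

print(f"loop:  {loop_time:.3f}s")
print(f"sum(): {builtin_time:.3f}s")
```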
In C#, this asymmetry largely doesn't exist. A hand-rolled loop and Enumerable.Sum() from LINQ go through the same JIT pipeline. That said, LINQ isn't entirely free — it can introduce allocations, delegate overhead, and less predictable inlining. A for loop over an array or Span<T> is often faster in hot paths. But this is an optimisation consideration, not a fundamental runtime difference the way it is in Python.
## Vectorisation: The Real Unlock
The "use library functions" advice only scratches the surface. The deeper insight is vectorisation — and it's where the real performance leap happens.
Libraries like NumPy don't just run in C. They operate on entire arrays at once, using CPU-level Single Instruction Multiple Data (SIMD) instructions, avoiding Python-level loops entirely.
```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([10.0, 20.0, 30.0, 40.0])

# This doesn't loop in Python — it's a single C-level vector operation
result = a * b
```
Compare that to:
```python
# Each iteration is a Python instruction — slow
result = [x * y for x, y in zip(a, b)]
```
Even though both produce the same output, the NumPy version isn't just "faster C code" — it's a fundamentally different execution model. The Python interpreter barely participates. This is the aha moment that explains why Python dominates data science and scientific computing despite being a slow language at its core.
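If you want to see the gap for yourself, here's an illustrative benchmark along the same lines. The array size and run count are arbitrary, and the timings will vary with your hardware and NumPy build, but the ordering is robust:

```python
import timeit

import numpy as np

a = np.random.rand(100_000)
b = np.random.rand(100_000)

# One C-level vector multiply vs a Python-level loop over the same data
vec_time = timeit.timeit("a * b", globals={"a": a, "b": b}, number=10)
loop_time = timeit.timeit(
    "[x * y for x, y in zip(a, b)]", globals={"a": a, "b": b}, number=10
)

print(f"vectorised: {vec_time:.4f}s")
print(f"loop:       {loop_time:.4f}s")
```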
## Where Does JavaScript Sit?
Coming from .NET, you might assume JavaScript is similarly slow — a scripting language, interpreted, running in a browser. Historically that was fair. Modern JavaScript engines have largely changed the picture.
V8 (powering Chrome and Node.js) uses aggressive JIT compilation with multiple optimisation tiers. JavaScript code that runs frequently gets compiled to native machine code, similar in principle to the .NET CLR. TypeScript compiles to plain JavaScript before anything runs, so at execution time it is identical to JS — no performance difference there.
It's worth noting that JS JIT and C# JIT behave differently in practice. V8 uses speculative optimisation — it makes assumptions about your code and optimises aggressively, but can deoptimise if those assumptions are violated. C#'s CLR is more predictable. Both are fast; JS just requires more care to keep on the happy path.
The practical upshot: the "use library functions or your code will be slow" advice matters much less in JavaScript and TypeScript. A hand-written for loop in modern JS is not dramatically slower than an equivalent library call, because both go through the same JIT compiler.
Where JS does lag is in areas Python has addressed with C-backed scientific libraries. NumPy, Pandas, and similar tools have no real equivalent in the JavaScript ecosystem — or at least, nothing as mature or widely adopted. TensorFlow.js and WebAssembly-based libraries exist, but the gap is significant. If you need serious numerical computing, Python + NumPy will outperform vanilla JS even accounting for Python's slower interpreter, because the actual number crunching never touches the Python layer at all.
## A Quick Comparison
| Language | Execution Model | Raw Loop Speed | "Use Libraries" Advice |
|---|---|---|---|
| C# | JIT → native machine code (CLR) | Fast | For ergonomics, not speed |
| Python | Interpreted bytecode (CPython) | Slow | Critical for performance |
| JavaScript | JIT → native machine code (V8) | Good | For ergonomics, not speed |
| TypeScript | Compiles to JS → same as JS | Good | Same as JS |
*Simplified for general application code. Performance is context-dependent. PyPy — a Python implementation with JIT compilation — can significantly close the gap with C# for CPU-bound workloads.*
## What This Changes About How I Write Python
Knowing this shapes a few concrete habits.
**Reach for built-ins first.** sum(), map(), sorted(), Counter, list comprehensions — these exist not just for readability, but because they hand off work to C.
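As a small illustration, collections.Counter replaces a hand-rolled counting loop, and in CPython its counting pass runs in optimised C rather than the interpreter:

```python
from collections import Counter

words = ["spam", "egg", "spam", "ham", "egg", "spam"]

# Hand-rolled counting: every iteration runs in the interpreter
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# Counter does the same work, with the per-element loop in C
fast_counts = Counter(words)

print(fast_counts.most_common(2))  # → [('spam', 3), ('egg', 2)]
```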
**NumPy before hand-rolled loops.** Any numerical work over arrays belongs in NumPy: not because it's cleaner (it is), but because it vectorises the operation and keeps it out of the Python interpreter entirely.
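This applies even when the loop has a conditional in it. A transform that looks like it needs branching per element can usually be restructured with np.where, which evaluates the condition across the whole array in C. A sketch with made-up prices:

```python
import numpy as np

prices = np.array([12.0, 47.5, 3.2, 88.0, 19.9])

# Loop version: apply a 10% discount to prices over 20, one element at a time
discounted = []
for p in prices:
    discounted.append(p * 0.9 if p > 20 else p)

# Vectorised version: the condition and both branches are whole-array operations
vec_discounted = np.where(prices > 20, prices * 0.9, prices)

print(vec_discounted)
```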
**Don't be surprised by slow loops.** When Python feels slow in a tight loop, that's not a bug in your code — it's the nature of the interpreter. The usual fix is to restructure so the interpreter handles coordination while optimised C code does the heavy lifting.
For a .NET developer, this is a genuine mental shift. In C#, you trust the JIT to make most sensible code fast. In Python, you trust the ecosystem to provide fast tools — and your job is to wire them together well.
## The Bigger Picture
Python's slowness in raw execution isn't a flaw so much as a design trade-off. The language prioritises readability and developer speed, and it compensates for interpreter overhead by sitting on top of an enormous library of high-performance C code. Once you understand that, a lot of Python idioms that might seem arbitrary start to make perfect sense.
Coming from C# and JavaScript, the biggest adjustment isn't syntax or semantics. It's trusting the standard library more than you might be used to — and understanding that in Python, the fastest code is often the code you didn't write yourself.
*Written with a .NET background, a Python curiosity, and one too many slow for loops.*
Have thoughts? Leave a comment or find me on X/Twitter.