Arthur

Posted on • Originally published at pickles.news

Excel Is the Most Popular Programming Language, and It's Turing-Complete

The most popular programming language in the world is the one with column headers. By every reasonable count of users, more people write code in Excel formulas than in Python, JavaScript, and SQL combined. This isn't a fun fact about the size of Excel's user base. It's a question about the size of the term programmer.

The developer community has been quietly choosing not to count spreadsheet authors as programmers for thirty years. Microsoft made that choice harder to defend in 2020, and the gatekeeping persisted anyway. Worth asking why.

The reach

Estimates of Excel's global user base run from a billion to a billion and a half, depending on how you count. The most recent numbers from Microsoft put Microsoft 365 active users above 320 million, and the broader Excel install base (unlicensed copies, school deployments, Excel-for-the-web sessions) significantly exceeds that. By comparison, the self-identified developer population is small. Stack Overflow's annual survey reaches tens of thousands of respondents and estimates a global professional-developer population around 28 million.

The people writing code in formulas outnumber the people writing code in IDEs by roughly an order of magnitude. The gap is wider in some industries than others. The financial-services sector has run on Excel for decades — bank traders, equity researchers, audit teams, the entire mid-office. Their job is to write programs. They do not call it that. The dev-culture community does not call it that either, and the agreement is convenient for both sides.

Simon Peyton Jones, who spent over twenty years at Microsoft Research and is one of the people most responsible for the modern theory of functional programming languages, has described Excel as the world's most widely used functional programming language. He has also called it a "frustratingly weak" one. Both can be true.

What LAMBDA changed

In December 2020, Microsoft Research announced LAMBDA, a new function in Excel that lets a user define their own functions in pure formula language, with no VBA, no macros, and no escape hatch into a different runtime. The team behind it, led by Andy Gordon and Simon Peyton Jones at Microsoft's Calc Intelligence group, framed the work explicitly as making Excel Turing-complete.
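To make that concrete, here is a minimal sketch of what the feature looks like (FACT is an arbitrary name of my choosing, bound through Excel's Name Manager; recursion works because a named LAMBDA can refer to itself):

```excel
=LAMBDA(x, x * 2)(21)                               returns 42

FACT = LAMBDA(n, IF(n <= 1, 1, n * FACT(n - 1)))    bound via the Name Manager
=FACT(5)                                            returns 120
```

The first line is an anonymous function applied inline; the second is a user-defined recursive function written entirely in formula language, with no VBA in sight.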

What "Turing-complete" means is worth pausing on; the idea is the premise of this piece. In a 1936 paper, Alan Turing defined a hypothetical machine — a tape, a head that reads and writes symbols, a small set of rules — and argued that this minimal device could perform any computation that could ever be performed mechanically. A language or system is Turing-complete if it can simulate that machine. By a deeply unobvious but near-universally accepted claim called the Church–Turing thesis, anything Turing's machine can compute is everything that is, in principle, computable. So: a Turing-complete language can express any computation that any other programming language can. Anything you can compute in Python or C or Haskell, you can compute in it.

This is a higher bar than it sounds. Plenty of useful tools fail the test. Regular expressions don't pass it. Nor do basic SQL, HTML, CSS, or JSON. None of these are programming languages, even though people use them productively all day; they are descriptions, queries, or data structures.

The bar is also lower than working programmers usually treat it. Conway's Game of Life is Turing-complete, as Paul Rendell demonstrated by building a working Turing machine inside a Life grid in 2000. Magic: The Gathering is Turing-complete, as a 2019 paper by Churchill, Biderman, and Herrick proved by embedding a Turing machine into the game's rules. The x86 mov instruction, on its own, is Turing-complete, as Stephen Dolan showed in 2013. None of those is a sensible environment for writing payroll software; all of them sit, formally, on the same side of the line as Python.

That line is the entire question this piece is about. Before LAMBDA, you could plausibly argue Excel sat outside the line — short of a programming language, even if it was a powerful spreadsheet. After LAMBDA, that argument is gone. With LAMBDA you can encode the lambda calculus, which is Turing-complete by construction; therefore Excel is. (You can also build it the long way through the Rule 110 cellular automaton, which Matthew Cook proved Turing-complete in a 2004 paper, and which fits comfortably inside an Excel grid.) Excel sits on the same side of the formalism as Python and Haskell. Whether the cultural treatment catches up is the part that's still open. Peyton Jones put it plainly in 2021: "you could really write literally any program in Excel now."
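A taste of that lambda-calculus encoding, as a sketch: a Church numeral represents the number n as a function that applies f to x n times, and the encoding drops straight into LAMBDA (names again bound via the Name Manager):

```excel
TWO  = LAMBDA(f, LAMBDA(x, f(f(x))))
SUCC = LAMBDA(n, LAMBDA(f, LAMBDA(x, f(n(f)(x)))))

=SUCC(TWO)(LAMBDA(k, k + 1))(0)                     returns 3
</antml```

The last formula takes the successor of Church-two, then collapses it back to an ordinary number by feeding it increment-by-one and zero. Nothing here is Excel-specific cleverness; it's the textbook construction, which is exactly the point.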

LAMBDA didn't ship alone. The same team also shipped MAP, REDUCE, SCAN, MAKEARRAY, BYROW, and BYCOL — higher-order functions of exactly the kind a Haskell programmer would recognize, only addressed by cell reference instead of identifier.
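Their shapes, sketched (SEQUENCE(n) generates the array {1; …; n}):

```excel
=MAP(SEQUENCE(5), LAMBDA(x, x ^ 2))                 returns {1;4;9;16;25}
=REDUCE(0, SEQUENCE(10), LAMBDA(acc, x, acc + x))   returns 55
=SCAN(0, SEQUENCE(4), LAMBDA(acc, x, acc + x))      returns {1;3;6;10}
```

The same pipeline in Haskell would be map, foldl, and scanl; the semantics line up almost term for term.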

Why the gatekeeping persisted anyway

The cultural markers that make a piece of work feel like programming (terminals, syntax highlighting, version control, conferences with hoodies) do not apply to Excel. The IDE is a cell grid. Source control for spreadsheets is between bad and nonexistent. Code review for an .xlsx file is not an established practice; you cannot meaningfully diff two spreadsheets the way you can diff two source files. The community signals that say "this is engineering" do not fire on a .xlsx extension.

So the work doesn't count, even though the financial-services industry has been silently shipping critical Excel code for a long time and some of the casualties are documented. The most expensive single example, JPMorgan's 2012 "London Whale" loss, traces in part to a value-at-risk model where the spreadsheet divided by the sum of two hazard rates instead of their average. The understatement of risk allowed the underlying trade to grow unchecked. The total loss came in around $6.2 billion. The formula that did it was one cell.

The genetics field gave up trying to keep Excel from corrupting their data files and renamed 27 human genes in 2020 — MARCH1 becoming MARCHF1, SEPT1 becoming SEPTIN1 — because Excel auto-converted the original symbols to dates. Nobody renamed Excel.

These are not signs that Excel users are bad programmers. They are signs that Excel is a programming environment without the tooling we have built up around the practice of software engineering. The work was always engineering. The infrastructure around the work (the linters, the tests, the review, the pull requests) went elsewhere.

The asymmetry of the AI moment

There's a tell in the present moment that's worth naming. "Writing code by typing English at a chatbot" is now widely accepted as a kind of programming. "Vibe-coding" is a recognized verb in 2026 dev culture.

Writing code by typing formulas into spreadsheet cells is not similarly accepted, even though both produce executable behavior, both require a non-trivial mental model of how the underlying system evaluates the input, and both are routinely used to ship work that affects real outcomes. The cultural status of the input syntax is the only thing that differs. The industry has decided which kinds of non-text-editor programming count and which don't, on grounds it does not articulate.

The boundary was never technical

The question of who counts as a programmer was never a technical question. It was a social one: about credentials, about tooling, about which kinds of work the industry is willing to call engineering and which it isn't. LAMBDA didn't make spreadsheet authors into programmers. It removed the last argument that they weren't.

The financial analyst whose model moves a billion dollars on a Tuesday morning is doing exactly what the senior engineer at a software company is doing on a Wednesday afternoon. One of them gets called an engineer; the other one gets called an analyst, or a finance person, or a "power user," or some other word that locates the work outside the discipline. That distinction has done more to limit who learns to think computationally — who feels welcome at the conference, who applies for the job, who gets the title and the pay band — than any technical barrier ever did.

It costs the dev-culture community very little to widen the term. The cost of keeping it narrow is paid mostly by the people the community has been declining to count. After 2020, the technical case for the line is gone. What's left is the social case. That part is on us.
