I recently posted an article here on dev.to documenting my experiences testing small and large language models. I'm an engineering technician, not a software engineer, developer, programmer, or coder. Since dev.to is a platform for people in the software field, it got me wondering whether AI is really software, and whether I should be posting my AI articles here at all.
AI is so intertwined with software these days that it seems like an odd question. When we speak casually, we answer “yes”: large language models (LLMs) and small language models (SLMs) are built by engineers, shipped through software pipelines, and run by programs. Yet at the level of a systems programmer, compiler writer, or hardware designer, the question is far from trivial. What exactly is an AI model, what does it contain, and does it satisfy the technical definition of software?
For the purpose of hardware and systems design, software is executable logic—a sequence of instructions (machine code, bytecode, or interpreted source) that a processor can execute. It embodies control flow: branches, loops, calls, and returns. Anything that resolves to an instruction stream that a CPU or accelerator can run qualifies as software. Conversely, a file that merely stores data, even if it is bundled with an application, does not truly meet this definition.
In the purest sense, a trained small or large language model is typically distributed as a file with an extension such as .safetensors, .gguf, or .pth. Inside are large multidimensional arrays of numbers—weights and biases learned during training. These numbers parameterize a fixed mathematical function: they determine how strongly one neuron influences another, how each feature is weighted, and how signals propagate through the layers.
Crucially, the model file contains no control flow. There are no conditionals, loops, or instructions that say “if X then Y.” It is not an algorithm; it is a parameterization of an algorithm that lives elsewhere. Formats like safetensors are deliberately designed to store only raw data and metadata, explicitly forbidding embedded executable code to prevent remote‑code‑execution attacks. This design choice underscores the fact that models are intended to be inert data, not executable artifacts.
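To see just how inert the file is, here is a minimal sketch that dumps the header of a safetensors file using nothing but Python's standard library. The path model.safetensors is a placeholder for whatever model you have downloaded locally; the format itself is just a length-prefixed JSON header followed by a flat buffer of raw tensor bytes.

```python
import json
import struct

# Minimal sketch: inspect a .safetensors file with no ML framework at all.
# The format is an 8-byte little-endian header length, a JSON header that maps
# tensor names to dtype/shape/byte-offsets, and then a flat buffer of raw bytes.
# "model.safetensors" is a placeholder path for whatever model you have locally.
with open("model.safetensors", "rb") as f:
    (header_len,) = struct.unpack("<Q", f.read(8))
    header = json.loads(f.read(header_len))

for name, info in header.items():
    if name == "__metadata__":  # optional free-form metadata block
        continue
    print(f"{name}: dtype={info['dtype']}, shape={info['shape']}")

# Every entry is a named array of numbers. There are no opcodes, no branches,
# and no entry point. Nothing here is something a CPU could execute.
```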
Let's ask whether a model can be executed directly. A CPU cannot interpret a .gguf file; a GPU cannot run it without a driver; you cannot make the file executable (chmod +x) and launch it. To produce output, the model must be loaded into an inference engine: software, written in C++, Python, Rust, or another language, that knows the model's architecture, performs the tensor operations, schedules work, and manages memory.
All the logic that multiplies matrices, applies activation functions, and manages caches lives in this runtime, not in the model. The same model file can behave in dramatically different ways depending on the runtime, hardware, precision, or quantization scheme used. This dependency draws a clear line between data (the model) and the software (the inference engine) running that model.
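As a rough sketch of that relationship, the snippet below loads a model through llama-cpp-python, which is only one runtime among many; model.gguf is a placeholder for any locally downloaded GGUF file. Every matrix multiply happens inside the library; the file contributes only the numbers.

```python
# Sketch: the model file does nothing until a runtime interprets it.
# llama-cpp-python is just one possible inference engine, and "model.gguf"
# is a placeholder for any locally downloaded GGUF model.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf")  # the runtime parses the file and allocates the tensors
out = llm("Explain what a GGUF file contains.", max_tokens=64)
print(out["choices"][0]["text"])
```

Uninstall the library and that same file still sits on disk, a few gigabytes of numbers that can do nothing on their own.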
Neural networks blur the classic boundary between data and code. In conventional programs, behavior is encoded explicitly in conditionals and loops. In a neural net, behavior is encoded implicitly in numerical weights: tweaking millions of numbers changes the system's output much as rewriting thousands of lines of code changes a program's behavior.
Nevertheless, the weights describe what values to use, not how to compute them. The algorithm—the “how”—is fixed and external; the weights are merely coefficients inside that algorithm. That is why two different runtimes can load the same model and, up to numerical precision, produce the same results while employing completely different execution strategies. The software determines the execution; the model supplies the parameters.
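A toy illustration of that split, using a single dense layer in NumPy rather than any real architecture: the function below is the software, and it never changes; only the arrays passed into it do.

```python
import numpy as np

def forward(x, W, b):
    # The algorithm: a fixed matrix multiply followed by a fixed nonlinearity.
    # This function is the only place the "how" lives.
    return np.tanh(W @ x + b)

# The "model": nothing but numbers. Swapping these arrays changes the system's
# behavior without touching a single line of code.
W = np.array([[0.5, -1.2],
              [0.3,  0.8]])
b = np.array([0.1, -0.4])

print(forward(np.array([1.0, 2.0]), W, b))
```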
Much of the perception that these models are software stems from one of their most common applications: developer‑assistant tools like Claude Opus, Qwen, or Copilot. But generating source code is just one use of a general‑purpose statistical model. Whether a model writes Python, translates languages, predicts protein structures, or classifies images does not alter its internal structure. A model that outputs code is no more “software” than a CSV file that happens to contain code snippets.
Take a model file and compute its checksum—leave every byte untouched. Now change only the surrounding stack: swap one inference engine for another that reads the same format, move from CUDA to CPU, quantize the weights at load time instead of running at full precision, or switch from AVX2 to AVX‑512. The checksum confirms the model is byte-for-byte identical, yet latency, memory usage, and even numerical results can vary by orders of magnitude. The only thing that changed is the executable logic, which is further evidence that the model itself is NOT software.
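That experiment is easy to reproduce. A quick sketch using Python's hashlib, again with a placeholder file name:

```python
import hashlib

# Hash the model file before and after swapping runtimes, hardware, or
# instruction sets. The digest never changes, because nothing in the
# execution environment is part of the file itself.
def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest()

print(sha256_of("model.gguf"))  # placeholder path
```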
In practice, models are versioned, distributed, cached, deployed, and rolled back just like any other software component. They live in repositories, have compatibility constraints, and are monitored for regressions. But they are not software.
An AI model, in isolation, is data—a trained numerical artifact that encodes the parameters of a mathematical function. It contains no executable logic, control flow, or instructions. Only when an inference engine (software) interprets those numbers does the model become part of a software system.
This distinction matters for correctness, security, auditing, and formal reasoning. It reminds us that modern AI does not replace algorithms with magic; it replaces hand‑written rules with learned parameters that are still evaluated by traditional code.
So, having raised the question myself, I came to the conclusion that no, an SLM or LLM is not software; it is a trained set of numbers that becomes part of a software system only when interpreted by executable code.
So thanks to Jess and company for letting me post a non-software article here! Hope you found this interesting.
Ben Santora - January 2026