Alex Rosito

Posted on May 20 • Edited on Jul 11

I ran MNIST on an ESP32-C3 without TensorFlow, TFLite, or any ML runtime

#cpp #embeddedsystems #machinelearning #ai

I ran MNIST digit recognition on an ESP32-C3 — without TensorFlow, TFLite, or any ML runtime.
The neural network is compiled directly into a C header and executed as firmware.

📢 Repository Update

Note: The repository has been updated to scale up the network's capacity. I changed the architecture from the original 784 → 64 → 10 topology to a more robust 784 → 128 → 10 structure.

Why this change?

Better Accuracy: Doubling the hidden layer from 64 to 128 units increases the network's representational capacity, letting it capture finer details and strokes of the MNIST digits.
The Ultimate Stress Test: Doubling the hidden layer also doubles the weight count (from 50,816 to 101,632), which means roughly twice as many multiply-accumulate operations per inference and a larger static memory footprint. I wanted to push the bare-metal C++ pipeline further.
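
The growth in parameters can be checked with a few lines of compile-time arithmetic. This is only a sketch: `ParamCount` and `count_params` are names made up for illustration and are not part of Hasaki.

```cpp
#include <cstddef>

// Weights and biases of a fully connected inputs -> hidden -> outputs network.
struct ParamCount {
    std::size_t weights;
    std::size_t biases;
};

constexpr ParamCount count_params(std::size_t inputs, std::size_t hidden,
                                  std::size_t outputs) {
    return ParamCount{inputs * hidden + hidden * outputs, hidden + outputs};
}

constexpr ParamCount kOld = count_params(784, 64, 10);   // 50,816 weights, 74 biases
constexpr ParamCount kNew = count_params(784, 128, 10);  // 101,632 weights, 138 biases

// Doubling the hidden layer doubles the weights exactly; biases grow by 64.
static_assert(kNew.weights == 2 * kOld.weights, "weights should double");
```

At one byte per INT8 weight, that is about 99 KiB of raw weight data for the new topology, before any biases or scale factors.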

The Outcome

The ESP32-C3 swallowed the 128 hidden units without breaking a sweat. Dynamic allocation overhead remains at absolute zero (malloc/free are banned from the prediction cycle), and inference execution is just as snappy and stable as before.
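
To make the "no malloc" point concrete, here is a minimal sketch of a heap-free INT8 dense layer with ReLU. It assumes row-major weights and int32 biases and accumulators. The names `dense_relu_int8` and `g_hidden_acc` are illustrative and not taken from the generated header, and the rescaling step a real quantized pipeline needs between layers is left out.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kHidden = 128;  // hidden units in the 784 -> 128 -> 10 topology

// Scratch buffer with static storage duration: its size is fixed at compile
// time, so the prediction path never touches the heap.
std::int32_t g_hidden_acc[kHidden];

// y[j] = max(0, b[j] + sum_i w[j * n_in + i] * x[i]), accumulated in int32.
void dense_relu_int8(const std::int8_t* w, const std::int32_t* b,
                     const std::int8_t* x, std::int32_t* y,
                     std::size_t n_in, std::size_t n_out) {
    for (std::size_t j = 0; j < n_out; ++j) {
        std::int32_t acc = b[j];
        for (std::size_t i = 0; i < n_in; ++i) {
            acc += static_cast<std::int32_t>(w[j * n_in + i]) *
                   static_cast<std::int32_t>(x[i]);
        }
        y[j] = acc > 0 ? acc : 0;  // ReLU
    }
}
```

With 784 inputs the accumulator stays far below the int32 limit (784 × 127 × 127 is about 12.6 million), so no overflow handling is needed at this size.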

Model Specifications & Footprint

Here are the measured figures for the new network as deployed on the hardware:

Model Configuration

Parameter	Value
Framework	Trained with Hasaki v3.2.1
Architecture	784 → 128 → 10
Activations	ReLU + Softmax
Training setup	Adam optimizer, Dropout 0.3, L2 0.0001
Quantization	INT8
Training set	47,999 samples (MNIST)
Validation accuracy	~98.2%
Header size	399.8 KB
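
For readers unfamiliar with the "INT8" row, the sketch below shows symmetric per-tensor quantization, one common way to map float weights to 8-bit integers. This is an assumption for illustration: the post does not say which quantization scheme Hasaki uses, and `choose_scale`, `quantize`, and `dequantize` are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// ASSUMPTION: symmetric per-tensor scheme, not necessarily what Hasaki does.
// The scale maps the largest absolute value in the tensor to 127.
float choose_scale(const float* values, int n) {
    float max_abs = 0.0f;
    for (int i = 0; i < n; ++i) {
        max_abs = std::max(max_abs, std::fabs(values[i]));
    }
    return max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
}

// q = round(x / scale), clamped to [-127, 127].
std::int8_t quantize(float x, float scale) {
    long q = std::lround(x / scale);
    q = std::min(127L, std::max(-127L, q));
    return static_cast<std::int8_t>(q);
}

// Approximate reconstruction of the original float value.
float dequantize(std::int8_t q, float scale) {
    return static_cast<float>(q) * scale;
}
```

Each weight then costs one byte instead of four, at the price of a rounding error of at most half a scale step per value.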

Memory Footprint (ESP32-C3, INT8, 784-128-10)

Resource	Used	Available	Usage
RAM	40,108 B	327,680 B	12.2%
Flash	1,034,610 B	1,310,720 B	78.9%

> Note: Flash usage includes the full sketch plus the static INT8 model header.

From Perceptrons to a Cross-Platform NN CLI for Edge Inference

It started, like many questionable engineering projects, with curiosity about a perceptron.

I wasn’t trying to build anything serious. Just a small experiment to understand how far I could push a minimal neural network implementation without relying on any framework.

That experiment didn’t stay small for long.

Once the basic forward/backward pass worked, I started wondering whether the same approach could be useful outside of toy problems. Specifically, I had recently built a ratiometric thermometer system, and it quickly became clear that the analog measurements were noisy and sensitive to supply variation.

That’s where the idea clicked: instead of fitting everything with fixed equations, what if I used a lightweight neural network as an inference layer for correction?

Not training in the field — just inference on constrained hardware.

Building a Minimal Neural Network Toolchain

That idea turned into a CLI tool, not a library. Not a framework. A command-line tool.

The goal was simple:

Define a network architecture
Train it on a dataset
Export a standalone model
Run inference anywhere

No heavy dependencies. No runtime frameworks required on the target side, just portable inference.
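
The "export a standalone model" step can be pictured as writing the trained weights out as C source. The function below is a hypothetical sketch of that idea. The array naming and layout are assumptions, not Hasaki's actual output format.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Emit a C array definition for a block of INT8 weights.
// `name` and the output layout are hypothetical, chosen for illustration.
std::string emit_int8_array(const std::string& name, const std::int8_t* data,
                            std::size_t n) {
    std::ostringstream out;
    out << "static const int8_t " << name << "[" << n << "] = {";
    for (std::size_t i = 0; i < n; ++i) {
        if (i != 0) out << ", ";
        out << static_cast<int>(data[i]);  // cast so it prints as a number, not a char
    }
    out << "};\n";
    return out.str();
}
```

Calling `emit_int8_array("W1", w, 3)` on the values 1, -2, 127 yields `static const int8_t W1[3] = {1, -2, 127};`. Because every value costs several characters of source text, a header written this way is typically a few times larger than the weights it encodes, while the compiled array still occupies one byte per weight.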

Over time, this evolved into a cross-platform tool supporting Linux, macOS, and Windows, with a fully deterministic inference pipeline.

The Unexpected Benchmark: MNIST

At some point I needed a sanity check, so I ran MNIST through it.

~59,999 training samples
~1,000 test samples

Nothing exotic — just a standard benchmark to validate the pipeline end-to-end.

The interesting part wasn’t accuracy. It was stability.

Training, export, and inference all behaved consistently across platforms. No drift between environments, no dependency on external runtimes, and no hidden assumptions about floating-point behavior beyond what was explicitly defined.
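
One reason integer inference tends to reproduce exactly across platforms is that the predicted class can be picked without any floating-point math. The sketch below relies on softmax being monotonic, so the largest logit is also the largest probability. It is an illustration under that assumption, not a description of how Hasaki's generated code selects the class, and `argmax_int32` is a made-up name.

```cpp
#include <cstddef>
#include <cstdint>

// Index of the largest logit. Softmax preserves ordering, so this is also the
// index of the largest softmax probability. Ties resolve to the lowest index.
std::size_t argmax_int32(const std::int32_t* logits, std::size_t n) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < n; ++i) {
        if (logits[i] > logits[best]) best = i;
    }
    return best;
}
```

Integer comparisons give the same answer on every compiler and CPU, so floating point is only needed if the actual probabilities have to be reported.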

For a homegrown stack, that was the real milestone.

Why This Matters for Edge Systems

The original motivation was never image classification; it was edge inference under constraints:

limited memory
limited compute
unstable or variable power conditions
need for deterministic behavior

In that context, most modern ML stacks are overkill. Even lightweight frameworks often assume too much infrastructure.

So the focus shifted toward something more specific:

A minimal neural network runtime that could be compiled down and deployed without dependencies.

What the Tool Actually Is

Today, the project is essentially three things:

A minimal neural network training engine
A model exporter that produces standalone inference graphs
A CLI runtime for cross-platform execution

It is not meant to compete with full frameworks.

It is meant to answer a narrower question:

How small can a usable neural inference stack get while remaining practical?

Repo

GitHub
Codeberg

Want to train your own models?

Let your creativity flow and make your own projects.

→ Get Hasaki 刃先 Free: GitHub - Codeberg

Final thought

This started as an experiment with perceptrons and a thermometer. It ended as a small toolchain for compiling neural networks into embedded firmware.

And I'm no longer entirely sure where the boundary between "software" and "firmware" is.

DEV Community