Alex Rosito

I Put a Neural Network in a Thermometer — Then It Got Out of Hand

How it all started

About three years ago my wife asked me: "Honey, what do you think the temperature is today?"

I told her to check her phone. She gave me a look and said, "No — in here. In the apartment."

The next day I remembered we had a clock with a built-in thermometer somewhere. Found it, put batteries in it, problem solved. But I couldn't stop thinking: how do digital thermometers actually work? Temperature is continuous — how does a microcontroller read it?

That question sent me down a rabbit hole. I learned about thermistors. Then about Wheatstone bridges. Then about instrumentation amplifiers. A few weeks later I had a thermometer built around a thermistor, an AD620, and an ATtiny85 — a chip with 8KB of flash and no business running anything sophisticated.

The ATtiny85 reads the output of the analog front end, normalizes it (Vout/Vcc), and maps it to temperature. The thermistor follows the Steinhart-Hart equation, which is nonlinear. I modeled the full hardware system mathematically, generated synthetic data from the thermistor's datasheet, and fit a third-degree polynomial to the transfer curve. The fit correlated well with the synthetic data, and the thermometer worked well for years.
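
As a rough illustration of that polynomial approach, the sketch below normalizes a 10-bit ADC reading and evaluates a cubic with Horner's method. The coefficients and the names `adc_to_ratio` and `ratio_to_celsius` are placeholders invented for this example, not the values or code from the original firmware. It also assumes the ADC reference is Vcc, so the reading divided by full scale approximates Vout/Vcc.

```c
#include <stdint.h>

/* Placeholder coefficients for T(x) = C0 + C1*x + C2*x^2 + C3*x^3.
   Illustrative only: a real fit comes from the thermistor datasheet
   and the analog front end's transfer curve. */
#define C0 (-10.0f)
#define C1 (120.0f)
#define C2 (-90.0f)
#define C3 (40.0f)

/* Convert a 10-bit ADC count (0..1023) to a ratio in [0, 1].
   Assumes the ADC reference is Vcc, so the ratio approximates Vout/Vcc. */
static float adc_to_ratio(uint16_t adc)
{
    return (float)adc / 1023.0f;
}

/* Evaluate the cubic with Horner's method: three multiplies, three adds. */
static float ratio_to_celsius(float x)
{
    return ((C3 * x + C2) * x + C1) * x + C0;
}
```

Horner's form avoids computing powers of x separately, which matters on a chip with no hardware multiplier for floats. With the placeholder coefficients above, a ratio of 0.5 evaluates to 32.5.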


Fast forward to a stretch with no projects on the bench. I decided to shake off some rust and look into AI. I started with theory, implemented a perceptron from scratch (I still have the perceptron.cpp from back then), and kept going. What began as an academic exercise in understanding the mechanics of neural networks by writing the code myself ended up producing something I hadn't planned for: a tool. And once I had the tool, I remembered the thermometer.

What if I replaced the polynomial correction function with a neural network? I trained a small network on the same synthetic dataset, exported a C header, dropped it into the ATtiny85 firmware, and replaced the three-line polynomial with a single predict() call.

It worked.

What struck me wasn't just that it worked — it was what it meant. I had just run inference on a microcontroller that was never designed for it. An ATtiny85, sitting at the bottom of a parts drawer, running AI.

That experiment didn't stay small for long.


Over the following months it evolved into something more deliberate. I needed something that would train on the desktop and export a completely self-contained C header — no runtime, no framework, no dependencies on the other side. Just weights and a predict() function that compiles with any C99 toolchain.

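#include "model.h"

int main(void)
{
    float input[2]  = { 1.0f, 0.0f };
    float output[1];

    /* Run one forward pass; output[0] holds the prediction afterwards. */
    predict(input, output);
    return 0;
}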

That's the entire deployment story. The firmware doesn't know Hasaki exists.
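
To make "just weights and a predict() function" concrete, here is a hand-written sketch of what a self-contained header of this kind could contain. It is not Hasaki's actual output: the array names, the layout, and the weights are all assumptions made for illustration. It also uses a 2-2-1 network with ReLU in the hidden layer and a linear output, rather than the sigmoid activations used in the XOR commands in this post, so that the sketch needs no math library and produces exact XOR outputs.

```c
/* Hypothetical layout of a self-contained model header.
   Hand-picked weights for a 2-2-1 network that computes XOR exactly. */
static const float W1[2][2] = { { 1.0f, 1.0f },    /* hidden 0: x0 + x1     */
                                { 1.0f, 1.0f } };  /* hidden 1: x0 + x1 - 1 */
static const float B1[2]    = { 0.0f, -1.0f };
static const float W2[2]    = { 1.0f, -2.0f };     /* output: h0 - 2*h1     */
static const float B2       = 0.0f;

/* ReLU activation: pass positive values, clamp negatives to zero. */
static float relu(float x)
{
    return (x > 0.0f) ? x : 0.0f;
}

/* Forward pass: input[2] -> hidden[2] (ReLU) -> output[1] (linear). */
static void predict(const float *input, float *output)
{
    float hidden[2];
    for (int j = 0; j < 2; j++) {
        float sum = B1[j];
        for (int i = 0; i < 2; i++)
            sum += W1[j][i] * input[i];
        hidden[j] = relu(sum);
    }

    float sum = B2;
    for (int j = 0; j < 2; j++)
        sum += W2[j] * hidden[j];
    output[0] = sum;
}
```

Everything the firmware needs is constant data plus a couple of loops, which is why a file like this compiles with any C99 toolchain and carries no runtime.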

I called it Hasaki — 刃先 in Japanese, meaning the cutting edge of a blade. Because it runs on edge hardware. And because it's designed to be sharp and minimal.


What Hasaki does

Hasaki trains fully-connected feedforward networks and exports them as standalone C headers, with weights stored as float or quantized to INT8 or INT4. The workflow is three commands:

# Train
hasaki -d 2,4,1 -act sigmoid,sigmoid -a train \
       -f examples/xor.csv -e 500 -l 0.1 -o model.txt

# Validate
hasaki -d 2,4,1 -act sigmoid,sigmoid -a validate \
       -f examples/xor.csv -m model.txt

# Export
hasaki -d 2,4,1 -act sigmoid,sigmoid -a export \
       -m model.txt -o model.h -q float

No cloud. No SDK account. No subscription. If it compiles for your target, it runs.

The free edition handles up to 2 hidden layers, 64 neurons per layer, and float export, which is enough for sensor correction, simple classifiers, and prototyping. The Pro edition removes those limits and adds the Adam optimizer, dropout, L2 regularization, and INT8/INT4 quantization for tighter memory budgets.
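
For readers unfamiliar with why quantization tightens the memory budget, the sketch below shows one common scheme, symmetric per-tensor INT8 quantization. This post does not describe the scheme Hasaki uses, so treat this as a generic illustration; the function name `quantize_int8` is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Symmetric per-tensor INT8 quantization: q = round(w / scale),
   with scale chosen so the largest |w| maps to 127.
   Returns the scale, so floats can be approximated as w ~ q * scale. */
static float quantize_int8(const float *w, int8_t *q, size_t n)
{
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = (w[i] < 0.0f) ? -w[i] : w[i];
        if (a > max_abs)
            max_abs = a;
    }

    /* Avoid dividing by zero when every weight is zero. */
    float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;

    for (size_t i = 0; i < n; i++) {
        float r = w[i] / scale;
        /* Round half away from zero without needing the math library. */
        q[i] = (int8_t)(r >= 0.0f ? r + 0.5f : r - 0.5f);
    }
    return scale;
}
```

Each weight shrinks from four bytes to one, at the cost of a rounding error of at most half a scale step per weight. INT4 follows the same idea with a smaller range and two values packed per byte.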


The MNIST demo

To test whether the claim held up beyond a thermistor, I trained a 784→64→10 network on MNIST and deployed it on an ESP32-C3 Super Mini — one of the smallest, cheapest MCUs available. The exported INT8 header is 212KB. The ESP32-C3 has 400KB of SRAM. It fits.
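
A quick back-of-the-envelope check on the model size: a fully-connected layer has one weight per input-output pair plus one bias per output. The helper below (the name `count_params` is made up for this sketch) adds that up for any list of layer widths.

```c
#include <stddef.h>

/* Count weights and biases in a fully-connected network.
   dims lists layer widths, e.g. {784, 64, 10}; n is the number of entries. */
static unsigned long count_params(const unsigned long *dims, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i + 1 < n; i++)
        total += dims[i] * dims[i + 1]   /* weights between layer i and i+1 */
               + dims[i + 1];            /* one bias per neuron in layer i+1 */
    return total;
}
```

For 784→64→10 this gives 50,890 parameters, or roughly 50 KB if every weight and bias takes one byte (an assumption, since the post does not say how biases or scale factors are stored). The 212KB figure is the size of the header file, which would be larger than the compiled data if the values are written out as decimal text.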

No TensorFlow Lite. No ML runtime of any kind. The ESP32-C3 serves a web canvas where you draw a digit, and it classifies it — inference running entirely on the chip, from a C header.

The full project is at: https://github.com/AlexRosito67/hasaki-mnist-esp32


What it isn't

Hasaki handles fully-connected feedforward networks. No convolutions, no attention, no recurrent layers. If your problem needs those, this is the wrong tool.

It's also not a research platform. The quantization is static. The training is solid for small networks. For serious production deployments with certification requirements, you need TFLite Micro, Edge Impulse, or NanoEdge AI.

But for the microcontrollers at the bottom of the drawer — the ones that were never meant to run inference, are available by the thousands, and cost less than a cup of coffee — Hasaki is a reasonable place to start.

All of this because my wife just wanted to know the temperature in the apartment.


Hasaki Free is on GitHub: https://github.com/AlexRosito67/hasaki
