Bashar Hasan

Posted on Jun 2

My First Embedded System with a 40-Year-Old Machine

#arduino #cpp #iot

What happened?

Recently, I got an unusual task: bring an old medical machine back to life.

The machine had been built more than 40 years ago (older than me 🙃) and its original control mechanism was completely mechanical. Unfortunately, that mechanism had been seriously damaged, and replacing it wasn't really an option.

So the solution was to redesign the controller from scratch using an Arduino.

That’s where my embedded systems journey started...

The task...

The device itself was pretty unusual.

Its job was to move tissue samples through a sequence of chemical tanks, where each tank contained a different substance used during processing.

Our machine had 12 tanks in total.

The first 10 tanks required the tissue to remain submerged for one hour each. The last two tanks were different: they contained paraffin wax, and the samples had to stay in each of them for two hours.

Those final stages were especially sensitive because the paraffin had to be completely melted before the tissue could enter the tank. Otherwise, the sample could be damaged.
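To make the timing and the paraffin rule concrete, here is a minimal sketch of how they could be written down in code. Only the tank count and the dwell times come from the description above. The names (`dwellTimeMs`, `mayEnterTank`) and both temperature limits are hypothetical placeholders, not values from the real machine.

```cpp
#include <stdint.h>

// Only the tank count and the dwell times come from the article.
// Every name and every temperature below is a hypothetical placeholder.
constexpr uint8_t  TANK_COUNT     = 12;
constexpr uint8_t  FIRST_WAX_TANK = 10;  // zero-based: the last two tanks hold paraffin
constexpr uint32_t HOUR_MS        = 60UL * 60UL * 1000UL;

// Assumed safe window for the wax; real limits must come from the lab.
constexpr float WAX_MIN_C = 58.0f;  // below this the paraffin may not be fully melted
constexpr float WAX_MAX_C = 65.0f;  // above this the sample could overheat

constexpr bool isWaxTank(uint8_t tank) {
    return tank >= FIRST_WAX_TANK && tank < TANK_COUNT;
}

// One hour per reagent tank, two hours per paraffin tank.
constexpr uint32_t dwellTimeMs(uint8_t tank) {
    return isWaxTank(tank) ? 2UL * HOUR_MS : HOUR_MS;
}

// Sum of all dwell times (ignores the time spent moving between tanks).
uint32_t totalDwellMs() {
    uint32_t total = 0;
    for (uint8_t tank = 0; tank < TANK_COUNT; ++tank) {
        total += dwellTimeMs(tank);
    }
    return total;
}

// Gate for the sensitive stages: reagent tanks are always allowed,
// paraffin tanks only while the wax is inside the assumed window.
bool mayEnterTank(uint8_t tank, float waxTempC) {
    if (!isWaxTank(tank)) {
        return true;
    }
    return waxTempC >= WAX_MIN_C && waxTempC <= WAX_MAX_C;
}
```

With these numbers the dwell times add up to 10 × 1 h + 2 × 2 h = 14 hours.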

The original machine already had:

wax temperature sensors for the paraffin tanks
two heating elements
a vibration motor that shook the samples while they were inside the tanks
a bottom sensor to detect when the sample holder was fully submerged
a top sensor to detect when the mechanism was preparing to move the samples to the next tank
a motor controlling the movement mechanism

The movement itself was surprisingly primitive.

The main motor only knew how to perform a fixed sequence:
down → up → move → down → up...

There was no position control, no direction switching, and no speed adjustment. We could only turn the motor on or off and trust the mechanical system to complete the cycle correctly.
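With only an on/off motor, the two limit sensors are the only way to know where the holder is. The sketch below shows one way a single move could be tracked. It assumes that a move always passes the top sensor before the holder reaches the bottom of the next tank. The types `HolderPosition`, `MoveResult`, and `MoveTracker` are hypothetical and are not taken from the project's source.

```cpp
#include <stdint.h>

// Hypothetical types for illustration; not taken from the project's code.
enum class HolderPosition : uint8_t { Bottom, Top, Between, SensorFault };
enum class MoveResult : uint8_t { KeepRunning, Arrived, Fault };

// Turn the two raw sensor readings into one position value.
HolderPosition classify(bool bottomActive, bool topActive) {
    if (bottomActive && topActive) {
        // The holder cannot be fully down and fully up at once,
        // so at least one sensor must be damaged or miswired.
        return HolderPosition::SensorFault;
    }
    if (bottomActive) return HolderPosition::Bottom;
    if (topActive) return HolderPosition::Top;
    return HolderPosition::Between;
}

// Tracks one "up -> move -> down" pass while the motor is switched on.
struct MoveTracker {
    bool sawTop = false;          // set once the top sensor has been seen
    uint8_t movesCompleted = 0;   // number of finished moves

    // Call on every loop iteration while the motor runs.
    MoveResult update(HolderPosition pos) {
        switch (pos) {
        case HolderPosition::SensorFault:
            return MoveResult::Fault;        // caller should stop the motor
        case HolderPosition::Top:
            sawTop = true;
            return MoveResult::KeepRunning;
        case HolderPosition::Bottom:
            if (!sawTop) {
                return MoveResult::KeepRunning;  // still leaving the old tank
            }
            sawTop = false;
            ++movesCompleted;
            return MoveResult::Arrived;      // caller should stop the motor
        case HolderPosition::Between:
        default:
            return MoveResult::KeepRunning;
        }
    }
};
```

The caller would switch the motor off on `Arrived` or `Fault` and keep it running otherwise. Treating "both sensors active" as a fault is one cheap way to catch a damaged sensor.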

At first glance, it sounded simple.

Then came the real problems:

handling power outages
detecting damaged sensors
recovering from interrupted cycles
allowing operators to skip tanks safely
preventing samples from being destroyed by overheated or unmelted paraffin

And this wasn’t some hobby project.

It was a medical device.

If my code crashed, a patient could lose a tissue sample that had already gone through an entire week of processing across multiple machines 🤐

The Hardest Part... Defining the Task!

When I was less experienced, I often heard that programming is only a small percentage of a software developer's job, and that the biggest time sink is defining the actual task with stakeholders.

But why?

Because clients themselves often don't fully realize what they want. They say things like:

"Make it work on time, without bugs, and make it easy to maintain!"

Okay, but sometimes the client is asking for the wrong thing from an engineering perspective, and sometimes we, as developers, overcomplicate things and forget about the user experience.

In this project, I was part of a team of three engineers. I was the software developer, the second was an electrical engineer (I spent most of my time discussing technical nuances with him), and the third was a mechanical engineer.

Oh, how many times I heard:

"Keep it simple. Just remove that. It's easy. Ignore that sensor..."

Sometimes I defended those decisions, and sometimes I simply noted the possible consequences of a particular implementation.

After many meetings and countless hours spent searching the internet and asking LLMs for reliable descriptions of how these machines actually worked, the technical requirements were finally ready... 😊😯

Implementation... First Round :)

Here we go. I implemented the code using an FSM (Finite State Machine), and everything seemed to be working well.

Then I heard the electrical engineer say:

"Okay, we think it should handle power outages too..."

If you could have seen my face ):

The funny part was that I had asked several times about handling power loss, and the answer was always:

"We don’t need that. The power supply is backed by a UPS, so we don’t expect any interruptions."

Even more interesting, they were against adding EEPROM, which I considered necessary because I wanted to store the current position and state.

They said:

"No, no, that will make everything too complicated."

After a long discussion, I finally said:

"Okay, but I'll list the drawbacks of this solution. We'll rely heavily on the sensors, and after a restart we won't know how long the samples had already spent in the current tank or how long the machine was down."

And the response was:

"No, no, everything will be fine."

Rewriting the Code...

Did I really rewrite everything?

No.

I didn't have to, because I had implemented the system as an FSM. In simple terms, every state can only transition to specific other states and cannot randomly jump between them.

It's a bit like real life.

You can't be asleep and awake at the same time.
You can't be running a marathon while sitting perfectly still.

These are mutually exclusive states. If you somehow find yourself doing both, you’re either dreaming or you probably need to see a doctor. 😊
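To show what "can only transition to specific other states" can look like in code, here is a minimal sketch of an allowed-transition table. Apart from `S_IDLE`, the state names and the table itself are hypothetical; the real project uses a library for this.

```cpp
#include <stdint.h>

// Hypothetical state set; only S_IDLE appears in the real project.
enum State : uint8_t { S_IDLE, S_MOVING, S_DWELLING, S_RECOVERY, S_FAULT, STATE_COUNT };

// One bit per state, so a set of states fits in a single byte.
constexpr uint8_t maskOf(State s) {
    return static_cast<uint8_t>(1u << s);
}

// ALLOWED[from] is the set of states that may follow `from`.
constexpr uint8_t ALLOWED[STATE_COUNT] = {
    /* S_IDLE     */ maskOf(S_MOVING) | maskOf(S_FAULT),
    /* S_MOVING   */ maskOf(S_DWELLING) | maskOf(S_RECOVERY) | maskOf(S_FAULT),
    /* S_DWELLING */ maskOf(S_MOVING) | maskOf(S_IDLE) | maskOf(S_FAULT),
    /* S_RECOVERY */ maskOf(S_DWELLING) | maskOf(S_FAULT),
    /* S_FAULT    */ 0,  // no way out in software: a technician is needed
};

bool canTransition(State from, State to) {
    return (ALLOWED[from] & maskOf(to)) != 0;
}

// Applies the transition only if the table allows it.
// Returns false and leaves `current` untouched otherwise.
bool tryTransition(State &current, State next) {
    if (!canTransition(current, next)) {
        return false;
    }
    current = next;
    return true;
}
```

Any jump that is not in the table is simply refused, which is what keeps the states isolated from each other.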

In our case, if the machine ends up in an impossible state, the customer should contact us or call a technician.

And that's why I didn't have to rewrite everything.

Because the machine was built around isolated states, I could simply add a recovery mode. If the sample holder was down, everything was normal and the machine continued as usual. If it was up or somewhere in between, the software tried to figure out its position and recover safely from there.

One more point for FSMs. 😄

Example of My Mess...)

Let's look at one piece of the state machine: the transition entry for the verification state, followed by the functions it references.

// Finite State Machine Transition Configuration
{
    verifyingPredicate,              // Condition check
    S_IDLE,                          // Next state if false
    S_UNKNOWN_DIRECTION_RECOVERY,    // Next state if true
    verifyingProcess,                // State body logic
    verifyingActionChanged,          // UI/Side-effects hook
    VERIFICATION_DELAY_MS,           // State stabilization timeout
    PREDIC_TIMER                     // Timer configuration
},

First, this is just one entry in an array of transitions; the other states are defined the same way, one after another. Don't ask me how I ended up with this architecture. The layout actually comes from the Finite State library.

The predicate runs every cycle of the main loop. It decides where we go next: if it returns true, we move to S_UNKNOWN_DIRECTION_RECOVERY, otherwise we go back to S_IDLE.

verifyingProcess runs after the predicate decision. It’s basically the body of the state — all the actual work we want to do while we're in it. Nothing fancy, just the state logic itself.

Then we have verifyingActionChanged. This one is triggered when we enter or exit the state, so it’s perfect for side effects like UI updates. In this case, we only print a message on the LCD when we enter the state.

VERIFICATION_DELAY_MS is just a timing parameter (around 10 seconds here). And PREDIC_TIMER tells the system to temporarily ignore the predicate for that period — so we don’t switch states immediately. That delay gives the system time to check the tank ID and stabilize all sensors before making a decision.

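// Decides the next state: true -> recovery, false -> normal flow (S_IDLE).
bool verifyingPredicate(id_t id)
{
    // An active bottom limit switch means the sample holder is fully down.
    return !bottomLimit.isActive();
}

// State body: keep the software tank ID in sync with the physical position.
void verifyingProcess(id_t id)
{
    syncTankID(true);
}

// Entry/exit hook: tell the operator that verification is in progress.
void verifyingActionChanged(EventArgs e)
{
    if (e.action == ENTRY)
        lcdShowStatus(F("Initializing"), F("Wait 10 seconds"));
}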

At first glance, it looks pretty simple, but there’s a lot going on here.

The predicate is basically the decision point: if the bottom limit switch is active, we assume the sample holder is in the correct position and continue normal flow. If not, we drop into recovery mode and try to figure out what actually happened.

The process step syncs the internal tank ID, making sure the software state matches the physical position of the machine.

And finally, verifyingActionChanged() is just a small UI hook — when we enter this state, we show a status message on the LCD so the operator knows the system is doing a verification step.

It doesn’t look like much, but this is basically how the whole system is built: small states, each doing one thing, and a lot of safety logic hidden in simple checks like this.
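For readers who have never used a table-driven FSM, the sketch below shows how a predicate, two target states, a state body, an entry/exit hook, and a hold time can fit together. It is a simplified stand-in written for this explanation, not the Finite State library's actual implementation, and every name in it (`Step`, `MiniFsm`, `ST_VERIFY`, `bottomSwitchActive`, and so on) is hypothetical.

```cpp
#include <stdint.h>

typedef uint8_t StateId;

// One table row per state (hypothetical layout, loosely mirroring the entry above).
struct Step {
    bool (*predicate)(StateId);               // decides the next state
    StateId nextIfFalse;
    StateId nextIfTrue;
    void (*process)(StateId);                 // state body, run on every tick
    void (*onChange)(StateId, bool entering); // entry/exit hook
    uint32_t holdMs;                          // predicate is ignored this long after entry
};

class MiniFsm {
public:
    MiniFsm(const Step *steps, StateId initial, uint32_t nowMs)
        : steps_(steps), current_(initial), enteredAt_(nowMs) {
        notify(true);
    }

    // Call once per main-loop iteration with the current time in milliseconds.
    void tick(uint32_t nowMs) {
        const Step &s = steps_[current_];
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        const bool holding = (nowMs - enteredAt_) < s.holdMs;
        if (!holding && s.predicate) {
            const StateId next = s.predicate(current_) ? s.nextIfTrue : s.nextIfFalse;
            if (next != current_) {
                notify(false);          // exit hook of the old state
                current_ = next;
                enteredAt_ = nowMs;
                notify(true);           // entry hook of the new state
            }
        }
        if (steps_[current_].process) {
            steps_[current_].process(current_);
        }
    }

    StateId current() const { return current_; }

private:
    void notify(bool entering) {
        if (steps_[current_].onChange) {
            steps_[current_].onChange(current_, entering);
        }
    }

    const Step *steps_;
    StateId current_;
    uint32_t enteredAt_;
};

// --- Example wiring (all names and values are illustrative) ---
enum : StateId { ST_VERIFY, ST_IDLE, ST_RECOVERY };

bool bottomSwitchActive = false;  // stand-in for the real limit switch

bool holderNotAtBottom(StateId) { return !bottomSwitchActive; }

const Step STEPS[] = {
    /* ST_VERIFY   */ { holderNotAtBottom, ST_IDLE, ST_RECOVERY, nullptr, nullptr, 10000 },
    /* ST_IDLE     */ { nullptr, ST_IDLE, ST_IDLE, nullptr, nullptr, 0 },
    /* ST_RECOVERY */ { nullptr, ST_RECOVERY, ST_RECOVERY, nullptr, nullptr, 0 },
};
```

In this wiring the machine stays in `ST_VERIFY` for the first 10 seconds regardless of the switch, then moves to `ST_IDLE` if the holder is down and to `ST_RECOVERY` if it is not.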

Final Implementation... Or Not So Final?

I still remember testing that code on the real machine.

Fortunately, I had implemented two modes:

Normal mode, where the machine completed its cycle in about 14 hours.
Test mode, where the same cycle took only 3 minutes.

That allowed us to run many complete tests within a single hour.
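One simple way to get such a test mode is to divide every dwell time by a fixed factor. The sketch below assumes exactly that. The factor of 280 is derived from the two numbers above (14 hours versus 3 minutes), and the names `TEST_SPEEDUP` and `effectiveDwellMs` are hypothetical, so the real firmware may do this differently.

```cpp
#include <stdint.h>

constexpr uint32_t HOUR_MS = 60UL * 60UL * 1000UL;

// 14 hours is 840 minutes; squeezing that into 3 minutes means 840 / 3 = 280.
constexpr uint32_t TEST_SPEEDUP = (14UL * 60UL) / 3UL;

// Returns the dwell time the controller should actually wait.
// In test mode every real duration is shortened by the same factor,
// so the order of events and the relative timing stay unchanged.
constexpr uint32_t effectiveDwellMs(uint32_t realDwellMs, bool testMode) {
    return testMode ? realDwellMs / TEST_SPEEDUP : realDwellMs;
}
```

Under this assumption a one-hour dwell shrinks to roughly 12.9 seconds and a two-hour paraffin dwell to roughly 25.7 seconds, while normal mode is left untouched.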

And that's when we encountered electrical noise.

The processor would stop, restart, and occasionally get stuck in a loop.

Luckily, I had already implemented a watchdog timer (basically, if the software gets stuck for more than two seconds, it automatically restarts).
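The idea behind a watchdog fits in a few lines. The class below is a software model written only to illustrate the logic; `SoftWatchdog` is a hypothetical name. On an AVR-based Arduino (an assumption about the board), the hardware watchdog from `<avr/wdt.h>` does the same job with `wdt_enable(WDTO_2S)` in `setup()` and `wdt_reset()` in `loop()`.

```cpp
#include <stdint.h>

// Software model of a watchdog, for illustration only.
// A hardware watchdog resets the chip by itself; this model just reports expiry.
class SoftWatchdog {
public:
    SoftWatchdog(uint32_t timeoutMs, uint32_t nowMs)
        : timeoutMs_(timeoutMs), lastKickMs_(nowMs) {}

    // Called from the main loop to prove that the software is still alive.
    void kick(uint32_t nowMs) { lastKickMs_ = nowMs; }

    // True once more than timeoutMs has passed without a kick.
    // Unsigned subtraction stays correct when the millisecond counter wraps.
    bool expired(uint32_t nowMs) const {
        return (nowMs - lastKickMs_) > timeoutMs_;
    }

private:
    uint32_t timeoutMs_;
    uint32_t lastKickMs_;
};
```

As long as the main loop keeps calling `kick()`, nothing happens. If the loop hangs for longer than the timeout, `expired()` turns true, which is the point where real hardware would force a restart.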

The electrical engineer said he fixed the noise issue, so... we'll see. :)

We spent a huge amount of time testing different situations and edge cases:

Processor timeouts
Thermostats
Sensor failures
Tank identification
Recovery scenarios

As with most embedded development, and especially in medical-related systems, testing often takes more time than development itself.

The real world is much less predictable than the development environment.

So, in the end, I can say that I really enjoyed working on this project with the team. If I sounded a little annoyed in some parts of the article, that's just because I wanted to share the reality of the development process. 😄

As a small gift for reading this article, you can check out the GitHub repository with the full source code and all commits.
Bye :)

Source Code ^_^

Top comments (1)

Saveyourproject • Jun 27

This is my favorite kind of project. Reviving old gear is its own special kind of debugging. One thing that's saved me on vintage hardware: write down why each fix worked, which connector, which timing, before you move on. Those machines never document themselves and future-you forgets fast. What was trickier, the electrical side or the original firmware behavior?