Paul Okhrem on predictive maintenance in industrial operations: the hard part was not the ML model

#webdev #ai #programming #productivity

The ML model was ready in six weeks.

Getting it to actually change maintenance behavior took another eight months.

That's the gap nobody talks about when predictive maintenance projects get pitched. The data science part — training a model to detect early failure signals in sensor data — is genuinely the easier half. The hard part is everything that happens between "the model is accurate" and "a technician does something different because of it."

What the model actually does

For context: predictive maintenance models typically ingest time-series data from industrial equipment — vibration, temperature, pressure, acoustic signatures — and identify patterns that precede failure. A bearing that's about to fail vibrates differently than a healthy one. A motor running hotter than usual may be drawing more current than it should.

These signals exist in the data. Given enough historical examples of failures and their precursors, a well-trained model can surface them days or weeks before the failure would have been caught by scheduled maintenance or operator observation.
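To make "patterns that precede failure" concrete, here is a deliberately simple sketch: a rolling z-score that flags a vibration reading when it sits far above its recent baseline. This is a stand-in for illustration only. Production models are usually far richer, and the function names, window size, threshold, and sample readings below are all assumptions, not details from the deployment described here.

```python
from statistics import mean, stdev

def rolling_zscores(readings, window=5):
    """Z-score of each reading against the `window` readings that precede it."""
    scores = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        sigma = stdev(baseline)
        # A perfectly flat baseline has no spread; report 0.0 instead of dividing by zero.
        scores.append(0.0 if sigma == 0 else (readings[i] - mean(baseline)) / sigma)
    return scores

def first_anomaly_index(readings, window=5, threshold=3.0):
    """Index of the first reading whose z-score exceeds `threshold`, else None."""
    for offset, z in enumerate(rolling_zscores(readings, window)):
        if z > threshold:
            return window + offset
    return None

# Hypothetical vibration amplitudes; the last two readings jump upward.
vibration = [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 1.9, 2.1, 2.0, 3.5, 3.8]
print(first_anomaly_index(vibration))
```

The point of the sketch is the shape of the problem: a baseline, a deviation, and a threshold someone has to choose.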

Model accuracy, in most mature implementations I've seen, is not the limiting factor. Precision and recall numbers look good. The ROC curve is satisfying. The data science team presents and everyone nods.

Then the model goes live and nothing changes.

The operational integration problem

Maintenance teams in industrial environments operate under specific constraints that most ML projects underestimate.

Work order systems are not designed for probabilistic inputs. A maintenance work order typically answers the question: what needs to be fixed, when, and by whom. Predictive models generate outputs like "85% probability of failure within 14 days." These don't map cleanly to existing workflows. Someone has to decide: at what threshold does an alert become a work order? Who decides that? What happens if the technician disagrees?

In one implementation at a manufacturing facility, the model was generating alerts that the maintenance team simply wasn't acting on — not because they didn't trust the model, but because there was no defined process for what to do with a prediction that wasn't a confirmed fault. The model was right. The workflow didn't exist yet.
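One way to picture the missing workflow is as an explicit rule that turns a probability into either a work order or nothing. The sketch below assumes two hypothetical thresholds and a hypothetical `ConditionalWorkOrder` record. The numbers and the due-date rule are illustrative placeholders, and real thresholds would come from the maintenance team rather than from code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; in practice these are negotiated with the maintenance team.
INSPECT_THRESHOLD = 0.60
URGENT_THRESHOLD = 0.85

@dataclass(frozen=True)
class Prediction:
    asset_id: str
    failure_probability: float  # 0.0 to 1.0
    horizon_days: int           # window the probability refers to

@dataclass(frozen=True)
class ConditionalWorkOrder:
    asset_id: str
    priority: str       # "inspect" or "urgent"
    due_in_days: int
    description: str

def to_work_order(p: Prediction) -> Optional[ConditionalWorkOrder]:
    """Map a probabilistic prediction to a work order, or None if below threshold."""
    if p.failure_probability >= URGENT_THRESHOLD:
        priority, due = "urgent", max(1, p.horizon_days // 4)
    elif p.failure_probability >= INSPECT_THRESHOLD:
        priority, due = "inspect", max(1, p.horizon_days // 2)
    else:
        return None  # stays a logged prediction, not a work order
    description = (
        f"Model estimates {p.failure_probability:.0%} chance of failure "
        f"within {p.horizon_days} days; inspect and record findings."
    )
    return ConditionalWorkOrder(p.asset_id, priority, due, description)

order = to_work_order(Prediction("pump-7", 0.85, 14))
print(order)
```

Writing the rule down, even this crudely, forces the questions above into the open: who owns the thresholds, and what a technician is expected to do when one is crossed.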

Technician trust is not given; it's earned. Maintenance professionals have deep experience with their equipment. Many have been doing this for 20+ years. An algorithm telling them something is wrong when they can't hear, see, or feel it themselves is not immediately credible. Early in deployment, when the model flags something and the technician inspects and finds nothing obviously wrong, the default interpretation is "false alarm", even if the model was correctly identifying a developing fault.

This is why the first few validated catches matter enormously. The moment a technician acts on a model alert, finds the developing fault, and avoids an unplanned failure — that story spreads. Trust is built case by case, not by model accuracy metrics.

Planners need lead time, not just alerts. The value of predictive maintenance isn't just avoiding failure — it's enabling planned downtime instead of unplanned downtime. But that requires the alert to come early enough, in a format that lets planners schedule the work, source the parts, and coordinate the downtime window. A 48-hour warning on a component that requires a 2-week parts lead time isn't useful. The integration with procurement and planning systems has to be designed from the start.
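The lead-time arithmetic is simple enough to write down. The sketch below uses a hypothetical helper, `planning_slack_days`, that compares how much warning an alert gives against the time needed to source parts and schedule the downtime. The parameter names and the optional scheduling buffer are assumptions for illustration.

```python
def planning_slack_days(warning_days: float,
                        parts_lead_days: float,
                        scheduling_days: float = 0.0) -> float:
    """Days to spare after sourcing parts and scheduling; negative means too late."""
    return warning_days - (parts_lead_days + scheduling_days)

def is_plannable(warning_days: float,
                 parts_lead_days: float,
                 scheduling_days: float = 0.0) -> bool:
    """True when the alert arrives early enough to plan the work."""
    return planning_slack_days(warning_days, parts_lead_days, scheduling_days) >= 0

# The case from the text: a 48-hour warning against a 2-week parts lead time.
print(planning_slack_days(warning_days=2, parts_lead_days=14))
print(is_plannable(warning_days=2, parts_lead_days=14))
```

A negative slack means the alert can still prevent a surprise, but it cannot turn unplanned downtime into planned downtime.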

What the deployment actually required

By the time the predictive maintenance system was genuinely embedded in operations at that facility, the work had included:

Redesigning the alert routing so predictions fed directly into the CMMS (computerized maintenance management system) as conditional work orders, not separate notifications
Defining escalation thresholds with the maintenance team — not imposed by the data science team, but negotiated with the people who would act on them
Building a feedback loop where technicians could log what they found (or didn't find) when they investigated an alert, so model performance could be tracked and communicated back to them
Running a 3-month "shadow mode" where alerts were generated but not acted on, and outcomes were tracked, so the team could see how many failures the model would have predicted before they happened
Creating a simple dashboard that translated model outputs into language the maintenance team actually used — not confidence scores, but plain descriptions of what the model was detecting and why
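Two of the items above, the shadow-mode run and the technician feedback loop, reduce to bookkeeping that is easy to sketch. The example below assumes alerts and failures are logged as (asset, date) pairs and that feedback is a simple confirmed or not-confirmed flag per investigated alert. The function names, the 14-day lookback, and the sample records are hypothetical, not data from the facility.

```python
from datetime import date, timedelta

def shadow_mode_summary(alerts, failures, lookback_days=14):
    """Count failures that had an alert on the same asset in the preceding window.

    alerts:   list of (asset_id, alert_date)
    failures: list of (asset_id, failure_date)
    """
    window = timedelta(days=lookback_days)
    caught = 0
    for asset, failed_on in failures:
        # An alert counts only if it came strictly before the failure, within the window.
        if any(a == asset and timedelta(0) < failed_on - alerted_on <= window
               for a, alerted_on in alerts):
            caught += 1
    return {"failures": len(failures), "preceded_by_alert": caught}

def confirmed_alert_rate(feedback):
    """Share of investigated alerts where the technician confirmed a developing fault."""
    return sum(feedback) / len(feedback) if feedback else None

# Hypothetical shadow-mode log.
alerts = [("pump-7", date(2024, 3, 1)), ("fan-2", date(2024, 3, 10))]
failures = [
    ("pump-7", date(2024, 3, 9)),    # alert 8 days earlier: counted
    ("fan-2", date(2024, 4, 20)),    # alert 41 days earlier: outside the window
    ("mixer-1", date(2024, 3, 15)),  # no alert at all
]
print(shadow_mode_summary(alerts, failures))
print(confirmed_alert_rate([True, False, True, True]))
```

Numbers like these are what get communicated back to technicians: how many failures the model would have caught, and how often an investigated alert turned out to be real.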

None of this was data science. All of it was the difference between a model and a working system.

The broader pattern

This pattern — model accuracy ≠ operational impact — shows up across industrial AI implementations. The technical capability gets proven relatively quickly. The integration work takes longer, requires different skills, and is much less glamorous.

It also tends to be where budgets get cut. The model is done. Why are we still spending money on this?

Because the model is not the product. The changed behavior is the product. And changing behavior in complex operational environments requires understanding how those environments actually work — the workflows, the incentives, the trust dynamics, the planning constraints.

That's the work. The ML model is just the starting point.

Paul Okhrem writes about AI strategy and industrial operations. More at paul-okhrem.com