Most "Causal" ML Is Just Correlation with Extra Steps
Here's a take that'll ruffle some feathers: 90% of the production "causal inference" I've seen is regression with a fancier name. Teams slap on DoWhy, call estimate_effect(), and ship whatever number falls out, without understanding what the library actually does under the hood.
The result? Causal claims built on sand.
I'm not saying DoWhy is bad. It's genuinely excellent. But treating it as a black box defeats the entire purpose. The power of causal inference comes from making your assumptions explicit, and you can't do that if you don't understand which assumptions DoWhy is making for you.
So let's build a minimal causal inference engine from scratch, then reverse-engineer DoWhy's internals to see how the real thing works.
The Four-Step Pipeline That DoWhy Actually Runs
DoWhy follows a workflow that looks deceptively simple:
Model → Identify → Estimate → Refute
But each arrow hides substantial complexity. Here's what's actually happening.
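To make the four steps concrete before touching DoWhy itself, here is a minimal from-scratch sketch in plain Python. It is not DoWhy's implementation. The graph, the simulated data, the true effect of 2.0, and every function name (simulate, backdoor_set, stratified_effect, placebo_refute) are assumptions made up for illustration. Identification is simplified to "adjust for the observed parents of the treatment", which is one valid backdoor set, and estimation is simple stratification.

```python
import random
from statistics import mean

# Step 1: Model. A hypothetical causal graph, written as child -> parents.
# Z confounds both the treatment T and the outcome Y.
GRAPH = {"Z": [], "T": ["Z"], "Y": ["T", "Z"]}
TRUE_EFFECT = 2.0  # assumed ground truth baked into the simulation


def simulate(n=20000, seed=0):
    """Generate confounded data that follows GRAPH."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        t = 1 if rng.random() < (0.8 if z else 0.2) else 0  # Z pushes T
        y = TRUE_EFFECT * t + 3.0 * z + rng.gauss(0, 1)     # Z pushes Y too
        rows.append({"Z": z, "T": t, "Y": y})
    return rows


# Step 2: Identify. Simplification: the observed parents of the treatment
# block every backdoor path, so they form a valid adjustment set.
def backdoor_set(graph, treatment):
    return sorted(graph[treatment])


# Step 3: Estimate.
def naive_difference(rows):
    """Plain correlation: mean(Y | T=1) - mean(Y | T=0), no adjustment."""
    treated = [r["Y"] for r in rows if r["T"] == 1]
    control = [r["Y"] for r in rows if r["T"] == 0]
    return mean(treated) - mean(control)


def stratified_effect(rows, adjust_for):
    """Backdoor adjustment: size-weighted average of within-stratum differences."""
    strata = {}
    for r in rows:
        strata.setdefault(tuple(r[v] for v in adjust_for), []).append(r)
    total, used = 0.0, 0
    for group in strata.values():
        treated = [r["Y"] for r in group if r["T"] == 1]
        control = [r["Y"] for r in group if r["T"] == 0]
        if not treated or not control:
            continue  # no overlap in this stratum, so it tells us nothing
        total += (mean(treated) - mean(control)) * len(group)
        used += len(group)
    if used == 0:
        raise ValueError("no stratum contains both treated and control rows")
    return total / used


# Step 4: Refute. Swap the treatment for a coin flip. A sound estimator
# should now report an effect close to zero.
def placebo_refute(rows, adjust_for, seed=1):
    rng = random.Random(seed)
    placebo_rows = [dict(r, T=rng.randint(0, 1)) for r in rows]
    return stratified_effect(placebo_rows, adjust_for)


rows = simulate()
adjust_for = backdoor_set(GRAPH, "T")
print("adjustment set:", adjust_for)
print("naive difference:", round(naive_difference(rows), 2))
print("adjusted effect:", round(stratified_effect(rows, adjust_for), 2))
print("placebo effect:", round(placebo_refute(rows, adjust_for), 2))
```

In this simulation the naive difference should land well above the true effect (around 3.8 in expectation), because Z inflates it. The adjusted estimate should sit near 2.0 and the placebo near 0. In DoWhy the same four stages correspond to constructing a CausalModel and then calling identify_effect(), estimate_effect(), and refute_estimate().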