Anil Prasad

Posted on • Originally published at Medium

78% of PyTorch Models Never Reach Production. I Built the Fix.

After 28 years shipping AI at scale, I got tired of watching good models die on the way to production.

By Anil S. Prasad — Founder, Ambharii Labs | Head of Engineering & Product, Duke Energy | Top 100 Most Influential AI Leaders USA 2024


There is a number that has followed me across every organization I have worked in. UnitedHealth Group, Medtronic, Ambry Genetics, R1 RCM, Duke Energy. The number is 78.

Seventy-eight percent of PyTorch models built in research never make it to production.

This is not a data science problem. The data scientists are talented. The models are good. The math works.

The problem is everything around the model. The audit trail that regulators demand. The compliance framework that legal requires. The drift detection that ops needs at 3am. The fairness analysis that the board is now asking about. The explanation that a clinician, an underwriter, or a grid operator needs before they trust the output.

None of that is in PyTorch. And nobody was building it.

So I did.


Introducing TorchForge

TorchForge is an open source enterprise governance wrapper for PyTorch. You take any model you have already built and wrap it in four lines of code. What comes back is the same model, same weights, same architecture, with a full production governance layer running underneath it.

Two and a half percent overhead on a forward pass. That is all it costs.

from torchforge import ForgeModel, ForgeConfig

config = ForgeConfig(
    model_name="credit_risk_v2",
    version="1.0.0",
    enable_governance=True,
    compliance_framework="NIST_RMF_1.0"
)

model = ForgeModel(your_pytorch_model, config)
output = model(x)

# Audit trail: live.
# Drift detection: live.
# Compliance reporting: live.
# That's it.

No refactoring. No retraining. No new infrastructure team required.


What You Get Out of the Box

I want to be precise here because this is where most governance tools either overpromise or underdeliver.

NIST AI RMF 1.0 compliance tracking. Every inference is logged against the four core functions of the NIST AI Risk Management Framework: Govern, Map, Measure, Manage. The report generates automatically. When a regulator asks for your AI risk documentation, you export it in one command.

Real-time drift detection with automatic alerts. TorchForge monitors input distribution and output distribution on every inference pass. When drift exceeds configurable thresholds, it fires alerts to Slack, PagerDuty, or any webhook you point it at. No separate monitoring pipeline. No Evidently setup. No manual dashboards.
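TorchForge's drift internals are its own, but the statistical idea underneath input-distribution monitoring is simple enough to sketch. The following is an illustrative population stability index check in plain NumPy; the function name and thresholds are common conventions, not TorchForge APIs:

```python
import numpy as np

def population_stability_index(expected, observed, bins=10):
    """PSI between a reference sample (training data) and a live sample.
    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate, > 0.25 significant drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    observed_pct = np.histogram(observed, bins=edges)[0] / len(observed)
    # Floor empty buckets so the log term stays finite
    expected_pct = np.clip(expected_pct, 1e-6, None)
    observed_pct = np.clip(observed_pct, 1e-6, None)
    return float(np.sum((observed_pct - expected_pct) * np.log(observed_pct / expected_pct)))

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 5000)     # inputs seen at training time
live_ok = rng.normal(0.0, 1.0, 5000)       # live traffic, same distribution
live_shifted = rng.normal(0.8, 1.0, 5000)  # live traffic with a mean shift

print(population_stability_index(reference, live_ok))       # well under 0.1
print(population_stability_index(reference, live_shifted))  # well over 0.25
```

A per-feature loop of checks like this, compared against a configured threshold, is all a drift alert fundamentally needs.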

Bias and fairness analysis on every prediction. Demographic parity, equalized odds, and individual fairness metrics run as part of the inference pass. Not as a post-hoc audit you remember to do quarterly. On every prediction. Because bias does not wait for your audit schedule.
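To make the simplest of those metrics concrete: demographic parity compares positive-prediction rates across groups. This is a standalone illustrative sketch, not the TorchForge implementation:

```python
import numpy as np

def demographic_parity_gap(preds, groups):
    """Gap in positive-prediction rate across groups (0 means perfect parity).
    preds: binary predictions; groups: protected-attribute label per row."""
    preds = np.asarray(preds)
    groups = np.asarray(groups)
    rates = {str(g): float(preds[groups == g].mean()) for g in np.unique(groups)}
    return max(rates.values()) - min(rates.values()), rates

gap, rates = demographic_parity_gap(
    preds=[1, 1, 0, 1, 0, 0, 0, 1],
    groups=["A", "A", "A", "A", "B", "B", "B", "B"],
)
print(rates)  # {'A': 0.75, 'B': 0.25}
print(gap)    # 0.5
```

Running a check like this inside the forward pass, rather than in a quarterly notebook, is the design point being made above.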

Full audit trail from training to deployment. Every model version, every config change, every inference batch is logged with timestamp, input hash, output, confidence scores, and the governance metadata. Immutable. Queryable. Exportable.
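TorchForge's storage format is its own, but the "immutable" property in a log like this is typically achieved by hash-chaining records, which is easy to sketch. Everything below is an illustrative standalone example, not TorchForge code:

```python
import hashlib
import json
import time

def append_record(log, input_bytes, output, confidence):
    """Append a hash-chained audit record; editing any past record breaks the chain."""
    prev_hash = log[-1]["record_hash"] if log else "0" * 64
    record = {
        "timestamp": time.time(),
        "input_hash": hashlib.sha256(input_bytes).hexdigest(),
        "output": output,
        "confidence": confidence,
        "prev_hash": prev_hash,
    }
    record["record_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    log.append(record)

def verify_chain(log):
    """Recompute every hash; returns False if any record was altered."""
    prev_hash = "0" * 64
    for record in log:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if record["prev_hash"] != prev_hash or record["record_hash"] != expected:
            return False
        prev_hash = record["record_hash"]
    return True

audit_log = []
append_record(audit_log, b"batch-0001", output=[0.2, 0.8], confidence=0.8)
append_record(audit_log, b"batch-0002", output=[0.9, 0.1], confidence=0.9)
print(verify_chain(audit_log))  # True

audit_log[0]["confidence"] = 0.99  # tamper with history
print(verify_chain(audit_log))  # False
```

Queryable and exportable then follow for free: each record is plain JSON.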

One-click deployment to five targets. AWS, Azure, GCP, Oracle Cloud, and Kubernetes. The deployment module generates the Terraform, the Helm chart, and the GitHub Actions pipeline. Your ops team gets a clean artifact, not a Jupyter notebook printed to PDF.

A/B testing with gradual rollout. Define your champion and challenger models. Set a traffic split. TorchForge handles the routing, collects the performance metrics, and helps you decide when to promote. No feature flags library required.
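The routing half of that is just a weighted random choice. A minimal sketch (illustrative names, not the TorchForge API):

```python
import random

def route(traffic_split):
    """Pick a model name by cumulative weight. Weights should sum to 1.0."""
    r = random.random()
    cumulative = 0.0
    for name, weight in traffic_split.items():
        cumulative += weight
        if r < cumulative:
            return name
    return name  # guard against floating-point rounding at the top end

random.seed(42)
split = {"champion": 0.9, "challenger": 0.1}
counts = {"champion": 0, "challenger": 0}
for _ in range(10_000):
    counts[route(split)] += 1
print(counts)  # roughly {'champion': 9000, 'challenger': 1000}
```

Gradual rollout is then just a schedule that moves weight from champion to challenger as the collected metrics justify it.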


Why I Built This Now

Three things converged in 2025 that made this the right moment.

First, the regulatory pressure on AI is no longer hypothetical. The EU AI Act is in force. US federal agencies have issued AI governance guidance. State-level bills are passing. Every organization I talk to is scrambling to answer the same question: how do we prove our AI is trustworthy? TorchForge makes that question answerable.

Second, PyTorch won. It is the dominant research framework and increasingly the production framework. ONNX and TorchScript made serving easier. But nobody solved governance at the framework layer. Everyone solved it at the infrastructure layer, which means it is always bolted on, never built in.

Third, I kept meeting talented ML engineers who had the same story. They built something that worked. Leadership approved it. Then it went to compliance, and it sat there for six months because nobody could answer the audit questions. TorchForge is the answer you hand compliance the day the model is ready.


The Performance Story

I know what you are thinking. Governance overhead sounds expensive.

Here is what we measured in production-equivalent workloads:

| Operation | TorchForge | Pure PyTorch | Overhead |
| --- | --- | --- | --- |
| Forward pass | 12.3 ms | 12.0 ms | 2.5% |
| Training step | 45.2 ms | 44.8 ms | 0.9% |
| Inference batch | 8.7 ms | 8.5 ms | 2.3% |

Two and a half percent on a forward pass works out to 0.3 extra milliseconds per call in the benchmark above, and batched inference brings the per-prediction cost down further. In exchange you get full NIST compliance tracking, continuous drift detection, bias monitoring, and a complete audit trail.

That is not a trade-off. That is a deal.


Who This Is For

If you are a solo researcher building hobby projects, TorchForge is overkill. Use it anyway if you want to learn the patterns, but it is not built for you.

TorchForge is built for three audiences.

ML engineers at regulated companies. Healthcare, financial services, energy, insurance. If your model touches a human life, a financial decision, or critical infrastructure, you need this. The compliance cost of not having it is orders of magnitude higher than the 2.5% overhead.

ML platform teams at growth-stage companies. You are scaling from one model to fifty. You need standardization. TorchForge is the standard. Every team wraps their model the same way, and you get a unified governance view across the entire portfolio.

AI consultants and system integrators. When you deliver a PyTorch model to a client and it comes with TorchForge, you are delivering a production-ready artifact, not a prototype. That changes the conversation about what you charge and what the client owns.


The Open Core Model

TorchForge is MIT-licensed. The core is free and always will be. I believe governance tooling should be open because the alternative is that only large companies with large procurement budgets can ship trustworthy AI. That is a bad outcome for the field.

The enterprise platform, the autonomous correction agents, the multi-tenant dashboard, the SLA-backed support, the private deployment options, those are available through Ambharii Labs. If you need them, you know where to find me.

But the open core does everything I described above. The compliance tracking, drift detection, bias analysis, audit trail, deployment tooling, A/B testing framework. All of it. No license key required.


Try It in Three Minutes

pip install torchforge
import torch
import torch.nn as nn
from torchforge import ForgeModel, ForgeConfig

# Your existing model, unchanged
class YourModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 2)
        )
    def forward(self, x):
        return self.net(x)

# Wrap it
config = ForgeConfig(
    model_name="my_first_governed_model",
    version="1.0.0",
    enable_governance=True,
    compliance_framework="NIST_RMF_1.0"
)

model = ForgeModel(YourModel(), config)

# Run inference — governance is automatic
x = torch.randn(4, 128)
output = model(x)

# Export your compliance report
model.export_compliance_report("./compliance_report.json")

That is three minutes from install to your first NIST-compliant inference.

The live demo runs on Hugging Face Spaces at no cost to you. Go to huggingface.co/spaces/AmbhariiLabs/torchforge-demo and run a governed inference in your browser before you install anything.


What Comes Next

The roadmap for TorchForge is driven by what I see breaking in production, not by what sounds impressive in a conference talk.

Q2 2026: Federated learning support with differential privacy guarantees. For healthcare and financial services teams who cannot centralize training data.

Q3 2026: LLM governance extension. The same wrapper pattern applied to fine-tuned language models. Hallucination rate tracking, toxicity monitoring, prompt injection detection.

Q4 2026: Cross-framework support. The governance layer decoupled from PyTorch so it can wrap TensorFlow, JAX, and ONNX models with the same four-line interface.

Everything will stay open core. That is not a marketing promise. It is the design constraint I set before writing the first line of code.


Links

GitHub: github.com/anilatambharii/torchforge

PyPI: pip install torchforge

Live demo: huggingface.co/spaces/AmbhariiLabs/torchforge-demo

Enterprise: ambharii.com

Connect: linkedin.com/in/anilsprasad


Built by Anil S. Prasad — Founder, Ambharii Labs. 28 years of production AI across UnitedHealth Group, Medtronic, Duke Energy, Ambry Genetics, and R1 RCM. Co-Founder of the CDAIO Circle Tri-State Chapter. Stanford and BITS Pilani.

#HumanWritten #ExpertiseFromField #PyTorch #MLOps #EnterpriseAI #AIGovernance #OpenSource #NIST #ProductionAI


Cross-posting note: This article is published on Medium (@anilAmbharii). Canonical version lives at medium.com. If you found this on DEV.to, Substack, or LinkedIn, follow me there for more field notes from 28 years of production AI.
