Brad Micklea for KitOps

Posted on • Originally published at jozu.com

KitOps: A Practical Approach to Accelerating AI/ML Development to Production

Every company and executive is talking about accelerating their success with AI/ML, but there is a huge gulf between the aspiration and the reality. Large tech companies are spending their way across it: hiring data scientists, ML engineers, MLOps practitioners, and a host of other highly paid specialists, and putting them in new divisions or AI Centres of Excellence (AI CoEs).

This new CoE looks suspiciously like the software engineering organization that it sits beside. This shouldn’t be surprising because many of the problems plaguing AI/ML development are problems that faced software development years ago: how to create effective collaboration, how to reproduce experiments across different teams and environments, how to secure projects, how to know where IP came from...

The idea that companies need a whole new set of teams starting from the ground up to solve these issues for AI/ML is...odd. Why can’t we use the solutions learned from years of software development?

We can...and should.

The trick is identifying which parts of the existing infrastructure, tools, and processes can be adapted to work with AI/ML and which genuinely require a new approach.

AI/ML Project Packaging and Versioning

There is no standard for packaging and versioning the various artifacts needed to reproduce an AI/ML project:

  • Models in Jupyter notebooks or MLOps tools
  • Datasets in data lakes, databases, or file systems
  • Code in Git repositories
  • Metadata (such as hyperparameters, features, and weights) scattered across different storage systems

That’s insane. Every other area of technology has basic standards like these so that teams can use different tools while collaborating: containers, PDFs, TARs, and hundreds of other examples.

Without such a standard, the already challenging job of managing the lifecycle of AI projects is even harder and more likely to fail.

We created KitOps to solve this problem. It’s an open-source, standards-based packaging and versioning system designed for AI/ML projects. KitOps takes advantage of existing software standards, so the tools and processes your DevOps and SRE teams use with their containerized applications can also be used with AI/ML projects.

KitOps allows AI teams to package AI/ML models, datasets, code, and metadata into what’s called a ModelKit. This ModelKit is a versioned bundle that can be sent to staging, user acceptance testing (UAT), or production. For DevOps and SRE teams it means they can manage AI models like any other production asset, such as containers or microservices.
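To make the idea of a ModelKit concrete, here is a minimal sketch of the manifest that drives packaging. It assumes the Kitfile format (a YAML file named Kitfile in the project directory that `kit pack` reads). The field names shown are an assumption based on that format and should be checked against the KitOps documentation for your version; every name, path, and version below is a hypothetical placeholder.

```shell
# Write an illustrative Kitfile into the current directory.
# Field names are assumed from the Kitfile format; all values are placeholders.
cat > Kitfile <<'EOF'
manifestVersion: "1.0"
package:
  name: legalchat
  version: 0.1.0
  description: Chatbot tuned on legal documents
model:
  name: legalchat-tuned
  path: ./model
datasets:
  - name: validation
    path: ./data/validation.csv
code:
  - path: ./notebooks
EOF

echo "Wrote Kitfile listing the model, dataset, and code paths"
```

With a manifest like this in place, a single `kit pack` call bundles the listed files into one versioned ModelKit instead of leaving them scattered across tools.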

For companies that are using containers and enterprise registries, KitOps is a seamless fit. ModelKits use the OCI standard, because it doesn’t make sense to create a unique standard when an existing one works. With KitOps, DevOps and SRE teams manage AI artifacts like models, datasets, code, and metadata from the same registry that holds every other artifact.

Let’s dive into how KitOps can streamline AI/ML operations and solve some of the unique challenges faced by companies adopting machine learning.

Level 1: Simplifying the Handoff from Development to Production

With KitOps integrated into your CI/CD pipeline, deployment of AI models can be triggered manually by the model development team or run automatically whenever the model or its associated artifacts change. Here’s why this matters:

  • Unified Operations: All assets are in one place, making it easier for operations teams to test, deploy, audit, and manage AI workloads.
  • Compliance and Security: By keeping AI versioned packages in the same enterprise registry as other production assets, they’re easier to secure and audit, which is crucial for compliance purposes.
  • Vendor Independence: Using the OCI standard protects your company from vendor lock-in, giving you flexibility when negotiating with vendors and adapting your MLOps or serving infrastructure as your needs evolve.
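As a rough sketch of what that pipeline step could look like, the function below packs a project and pushes it under a tag derived from the revision being built. It uses only the `kit pack` and `kit push` commands shown later in this article. The function name, the repository, and the `GIT_COMMIT` variable are hypothetical, and the sketch assumes the CI runner is already authenticated against the registry.

```shell
# Pack the project in context_dir and push it under a revision-specific tag.
# Stops before pushing if packing fails.
publish_modelkit() {
  local context_dir="$1" repo="$2" revision="$3"
  local ref="${repo}:${revision}"
  kit pack "$context_dir" -t "$ref" && kit push "$ref"
}

# Example CI step (GIT_COMMIT stands in for your CI system's revision variable):
# publish_modelkit . registry.gitlab.com/chatbot/legalchat "$GIT_COMMIT"
```

Tagging by revision means every pipeline run leaves behind a ModelKit that can be traced back to the exact change that produced it.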

For companies dipping their toes into AI/ML, this is the first step with KitOps, but many find themselves wanting to extend its use.

Level 2: Adding Security to the Mix

Security is a top concern, especially when dealing with sensitive data or compliance requirements like GDPR or HIPAA. KitOps, in combination with the open-source ModelScan tool, allows your team to scan models for vulnerabilities before they’re promoted beyond development and package them in a tamper-proof ModelKit.

Security-conscious teams can create curated ModelKits by pulling trusted models from public repositories like Hugging Face, scanning them, and storing them as ModelKits in the enterprise registry. This helps ensure that only scanned, verified models are used within the organization. This level of security isn’t just important for peace of mind. It’s increasingly a requirement in industries subject to strict regulatory oversight.

Once a model passes scanning, it’s packaged and signed as a ModelKit, ensuring that it can’t be tampered with on its way to production.
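The scan-then-package gate can be sketched as a small shell function. It assumes ModelScan is installed and invoked as `modelscan -p <path>`, and that it exits non-zero when it finds issues or cannot complete the scan. The function name and arguments are hypothetical, and the signing step is left out because it depends on the signing tool your organization uses.

```shell
# Scan a model file, then pack and push only if the scan passes.
# Assumes `modelscan` exits non-zero on findings or scan failure.
scan_then_publish() {
  local model_path="$1" context_dir="$2" ref="$3"
  if ! modelscan -p "$model_path"; then
    echo "Scan did not pass for ${model_path}; not publishing ${ref}" >&2
    return 1
  fi
  kit pack "$context_dir" -t "$ref" && kit push "$ref"
}

# Example (hypothetical paths and tag):
# scan_then_publish ./model/model.pkl . registry.gitlab.com/chatbot/legalchat:scanned
```

Putting the scan in front of `kit pack` means an unscanned model never reaches the registry in the first place.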

Level 3: Full Lifecycle Management with KitOps

For companies looking to mature their AI operations or meet stringent compliance standards (such as those in the European Union), KitOps can be used throughout the entire AI project lifecycle. Instead of waiting until production, you can start using ModelKits in development.

This approach solves many common pain points:

  • Unified Storage: All AI/ML project artifacts are stored as OCI-compliant objects, ensuring a consistent and standardized process across the organization.
  • Collaboration Across Teams: Since data, AI/ML, and software teams work in different tools, using KitOps ensures that they can share and collaborate on artifacts securely and efficiently without compromising on their chosen toolset.
  • Tamper-Proof Artifacts: By storing development artifacts as ModelKits, you protect them from accidental or malicious tampering — an increasingly common concern in AI development.

Best of all, this process can be automated using the KitOps CLI, meaning your team won’t have to manually track every update or model change.
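One way automation can lean on this is to reference a ModelKit by its content digest instead of a tag. Tags can be moved, but an OCI digest identifies one specific set of bytes. The sketch below assumes `kit pull` accepts a `repository@sha256:...` reference; the function names are hypothetical.

```shell
# Build a digest-pinned reference of the form repo@sha256:<64 hex chars>.
pinned_ref() {
  local repo="$1" digest="$2"
  if ! printf '%s' "$digest" | grep -Eq '^sha256:[0-9a-f]{64}$'; then
    echo "Not a sha256 digest: ${digest}" >&2
    return 1
  fi
  echo "${repo}@${digest}"
}

# Pull exactly the content named by the digest, wherever tags point now.
pull_pinned() {
  local ref
  ref="$(pinned_ref "$1" "$2")" || return 1
  kit pull "$ref"
}
```

Recording the digest alongside each release gives auditors a fixed reference for exactly what was deployed.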

A Real-World Example: KitOps in Action

Let’s walk through how a typical company might use KitOps from development to production.

  1. Development: Data scientists work in Jupyter notebooks, but save their work as a ModelKit at every milestone using simple KitOps commands:

kit pack . -t registry.gitlab.com/chatbot/legalchat:tuned

kit push registry.gitlab.com/chatbot/legalchat:tuned

This ensures that all models, datasets, and code are versioned and stored in a secure, centralized location.

  2. Integration: The application team pulls the updated ModelKit:

kit pull registry.gitlab.com/chatbot/legalchat:tuned

They test service integration, paying attention to performance and any changes between model versions.

  3. Testing: Once integration is complete, an engineer validates the model by running it with the included validation dataset:

kit unpack registry.gitlab.com/chatbot/legalchat:tuned --model --datasets

  4. Deployment: When ready for production, the SRE team tags the ModelKit as challenger and pushes it into the CI/CD pipeline for deployment:

kit tag registry.gitlab.com/chatbot/legalchat:tuned registry.gitlab.com/chatbot/legalchat:challenger

kit push registry.gitlab.com/chatbot/legalchat:challenger

The deployment team monitors the new challenger model in production, and if it’s successful, they re-tag it as champion. If there are any issues, the previous model can be rolled back with minimal friction.
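The promotion and rollback described here could be scripted along the following lines, using only the `kit pull`, `kit tag`, and `kit push` commands from the steps above. The function names and the `previous-champion` tag are hypothetical conventions, and the sketch assumes a champion tag already exists in the repository.

```shell
# Promote the challenger: keep the outgoing champion under a rollback tag,
# then point the champion tag at the challenger.
promote_challenger() {
  local repo="$1"
  kit pull "${repo}:champion" &&
    kit tag "${repo}:champion" "${repo}:previous-champion" &&
    kit push "${repo}:previous-champion" &&
    kit pull "${repo}:challenger" &&
    kit tag "${repo}:challenger" "${repo}:champion" &&
    kit push "${repo}:champion"
}

# Roll back: point the champion tag at the previously saved ModelKit.
rollback_champion() {
  local repo="$1"
  kit pull "${repo}:previous-champion" &&
    kit tag "${repo}:previous-champion" "${repo}:champion" &&
    kit push "${repo}:champion"
}

# Example (repository from the walkthrough above):
# promote_challenger registry.gitlab.com/chatbot/legalchat
```

Because each step is chained with `&&`, a failure partway through stops the sequence before the champion tag is moved.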

Conclusion: Why KitOps Should Be On Your Radar

Adopting AI/ML doesn’t have to mean adding new teams and infrastructure. KitOps offers a structured, secure, and scalable solution for managing the entire lifecycle of AI projects. It integrates seamlessly with existing container infrastructure, uses open standards, and provides critical security and auditing features that will help your team stay compliant and efficient.
