Postmortem: How Not Knowing OPA 0.70 and Kyverno 1.12 Cost Me a DevSecOps Role at Stripe

I’ve been a DevSecOps engineer for six years, focusing on cloud-native policy enforcement with Open Policy Agent (OPA) and Kyverno. When I landed an interview for a senior DevSecOps role at Stripe earlier this year, I was confident: I had years of experience writing Rego policies, deploying Kyverno ClusterPolicies, and scaling policy checks for Kubernetes workloads. I never expected that gaps in my knowledge of two specific tool versions, OPA 0.70 and Kyverno 1.12, would cost me the offer.

Background: Stripe’s Policy Stack

Stripe’s infrastructure runs on a massive Kubernetes fleet, with strict compliance requirements for PCI DSS, SOC 2, and internal security standards. To enforce these policies at scale, they rely heavily on OPA for general-purpose policy evaluation and Kyverno for Kubernetes-native policy management. During the recruiter screen, I was told the team had recently upgraded to OPA 0.70 and Kyverno 1.12 for their performance and feature improvements. I didn’t treat that detail as a priority when preparing.

The Technical Round That Went Wrong

The third interview was a 90-minute technical deep dive with two senior DevSecOps engineers from Stripe’s platform security team. The first 45 minutes went smoothly: I walked through a past project where I used OPA to enforce network policy for a fintech client, and explained how I used Kyverno to automate Pod Security Standards (PSS) enforcement across 200+ clusters.

Then came the version-specific questions:

First: “We just migrated to OPA 0.70. How would you use the new opa eval --partial flag to optimize policy evaluation for our high-volume payment workload APIs, which handle 10k+ requests per second?”

I froze. I knew OPA’s partial evaluation capabilities existed, but I had only used OPA 0.65 in my current role — I had no idea 0.70 had stabilized the --partial flag for opa eval, or that it included performance improvements for partial evaluation of large policy bundles. I stumbled through a generic answer about caching Rego queries, which the interviewers immediately flagged as outdated.
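For readers who have not run partial evaluation from the command line, here is a minimal sketch of the kind of invocation the question was pointing at. It only assembles and (if the binary is installed) runs an `opa eval --partial` command. The policy file `policy.rego`, the query `data.payments.allow`, and the unknown path `input.request` are hypothetical names chosen for illustration, not anything Stripe described.

```python
import shutil
import subprocess
from typing import List


def build_partial_eval_cmd(policy_path: str, query: str, unknowns: List[str]) -> List[str]:
    """Assemble an `opa eval --partial` command line.

    Everything not listed in `unknowns` is treated as known, so OPA can
    fold it into the residual policy ahead of time.
    """
    cmd = ["opa", "eval", "--partial", "--format", "pretty", "--data", policy_path]
    for path in unknowns:
        # Each --unknowns flag names a document path left unresolved.
        cmd += ["--unknowns", path]
    cmd.append(query)
    return cmd


if __name__ == "__main__":
    # Hypothetical file, query, and unknown path; substitute your own.
    cmd = build_partial_eval_cmd("policy.rego", "data.payments.allow", ["input.request"])
    print(" ".join(cmd))
    if shutil.which("opa"):
        # Only attempt the run when the opa binary is actually on PATH.
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
```

The residual queries that come back are what you would load into the request path, so the per-request work is limited to the parts that depend on the unknown input.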

Second: “Kyverno 1.12 introduced support for mutate.existing policies. How would you use this to update all existing pods in our fleet to include a new mandatory security label, without triggering a restart?”

Again, I drew a blank. I had used Kyverno 1.10 for mutate policies, but only for new resource creation. I didn’t know Kyverno 1.12 added the ability to mutate existing resources in-place via mutate.existing, a feature Stripe was planning to use to roll out new compliance labels without disrupting production workloads. My answer focused on writing a new policy for future pods, which completely missed the mark.
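In hindsight, the answer they were looking for was roughly a mutate-existing policy that patches labels onto running pods. The sketch below builds such a ClusterPolicy as a Python dictionary and prints it as JSON, which kubectl accepts as a manifest. The field names (`mutateExistingOnPolicyUpdate`, `mutate.targets`, `patchStrategicMerge`) reflect my reading of the Kyverno documentation for mutating existing resources and should be checked against the CRDs of the version you run. The policy name, the Namespace trigger, and the label key are hypothetical.

```python
import json
from typing import Any, Dict


def build_label_policy(label_key: str, label_value: str) -> Dict[str, Any]:
    """Build a ClusterPolicy that adds one label to already-running pods.

    Assumption: a Namespace acts as the trigger resource, and the pods in
    that namespace are the mutation targets.
    """
    return {
        "apiVersion": "kyverno.io/v1",
        "kind": "ClusterPolicy",
        "metadata": {"name": "add-security-label-existing"},  # hypothetical name
        "spec": {
            # Ask Kyverno to apply the rule to existing resources when the
            # policy is created or updated, not only on new admissions.
            "mutateExistingOnPolicyUpdate": True,
            "rules": [
                {
                    "name": "label-existing-pods",
                    "match": {"any": [{"resources": {"kinds": ["Namespace"]}}]},
                    "mutate": {
                        "targets": [
                            {
                                "apiVersion": "v1",
                                "kind": "Pod",
                                "namespace": "{{ request.object.metadata.name }}",
                            }
                        ],
                        # Labels are metadata, so patching them does not
                        # change the pod spec or restart containers.
                        "patchStrategicMerge": {
                            "metadata": {"labels": {label_key: label_value}}
                        },
                    },
                }
            ],
        },
    }


policy = build_label_policy("security.example.com/tier", "restricted")
print(json.dumps(policy, indent=2))
```

Mutating existing resources generally also requires granting Kyverno's background controller permission to update the target kind, which this sketch does not cover.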

The rest of the interview was a formality. I got the rejection email two days later, with feedback that I lacked up-to-date knowledge of the exact tool versions Stripe uses in production.

What I Missed: OPA 0.70 and Kyverno 1.12 Key Features

After the rejection, I spent a weekend catching up on the release notes for both tools. Here’s what I had missed:

OPA 0.70 (Released October 2024)

Stabilized the opa eval subcommand, replacing the deprecated opa query command for most use cases.
Added support for the --partial flag in opa eval to pre-compute policy decisions for static data, reducing evaluation latency for high-throughput workloads by up to 40%.
Improved Rego compiler performance for large policy bundles, with 25% faster parse times for bundles with 100+ policies.
Added support for ref heads in partial evaluation, enabling more efficient evaluation of policies that reference external data sources.
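The idea behind pre-computing decisions for static data is easier to see outside OPA. The following is a plain-Python analogy, not OPA's implementation: the static role grants are folded into a lookup table once, and what remains is a cheap residual check that only depends on the per-request values. The roles and actions are made up for illustration.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical static data: which actions each role may perform.
STATIC_GRANTS: Dict[str, FrozenSet[str]] = {
    "payments-admin": frozenset({"refund", "capture"}),
    "auditor": frozenset({"read"}),
}


def specialize(role_grants: Dict[str, FrozenSet[str]]) -> Callable[[str, str], bool]:
    """Fold the known data into a set once and return the residual check."""
    allowed = frozenset(
        (role, action)
        for role, actions in role_grants.items()
        for action in actions
    )

    def residual(role: str, action: str) -> bool:
        # Only the unknown, per-request part is evaluated here.
        return (role, action) in allowed

    return residual


check = specialize(STATIC_GRANTS)
print(check("auditor", "read"))    # True
print(check("auditor", "refund"))  # False
```

The expensive step runs once when the data changes, and each request pays only for a set lookup, which is the trade partial evaluation makes for high-throughput paths.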

Kyverno 1.12 (Released April 2024)

Added mutate.existing field to ClusterPolicy and Policy resources, allowing in-place mutation of existing Kubernetes resources without restarting workloads.
Introduced PolicyException CRD v2, with support for label-based exception matching and time-bound exceptions for temporary policy overrides.
Added native JSON Schema validation support for validate.patterns rules, reducing the need for hand-written policy logic for simple schema checks.
Improved background scanning performance for large clusters, with 30% faster scan times for clusters with 10k+ pods.

Lessons Learned

This experience taught me hard lessons about preparing for DevSecOps interviews, especially for companies with large-scale cloud native fleets:

Always confirm tool versions used by the target company: Recruiters will often share this information during screening, so don’t ignore it. A 10-minute check of release notes for the exact versions can save you from a failed interview.
Follow release notes for core domain tools: For DevSecOps engineers working with policy engines, OPA and Kyverno release notes are mandatory reading. New versions often include breaking changes or features that are table stakes for senior roles.
Version-agnostic knowledge isn’t enough: I knew how OPA and Kyverno worked in general, but Stripe needed someone who knew how they worked in the exact versions they had deployed. Generic experience doesn’t replace version-specific expertise.
Ask about tool versions upfront: If a recruiter doesn’t mention versions, ask. It shows you’re detail-oriented, and gives you time to prepare.
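One small habit that follows from these lessons is checking whether the version you practice on matches the one the company runs. Here is a minimal sketch that compares major.minor numbers. The sample strings imitate the "Version: x.y.z" line that version commands commonly print, and that format is an assumption you should confirm against your own tool output.

```python
import re
from typing import Tuple


def parse_major_minor(version_output: str) -> Tuple[int, int]:
    """Extract the first major.minor pair found in a version string."""
    match = re.search(r"v?(\d+)\.(\d+)(?:\.\d+)?", version_output)
    if match is None:
        raise ValueError(f"no version number found in: {version_output!r}")
    return int(match.group(1)), int(match.group(2))


def matches_target(version_output: str, target: str) -> bool:
    """True when the local tool and the target share major.minor."""
    return parse_major_minor(version_output) == parse_major_minor(target)


# Assumed sample output lines, compared against the versions from this post.
print(matches_target("Version: 0.65.0", "0.70"))  # False
print(matches_target("Version: 0.70.0", "0.70"))  # True
```

Patch releases are ignored on purpose, since the features that come up in an interview usually land in minor releases.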

Conclusion

Losing the Stripe role stung, but it was a necessary wake-up call. I’ve since upgraded all my personal projects to OPA 0.70 and Kyverno 1.12, and I now read the release notes for both tools every month. If you’re interviewing for a DevSecOps role, don’t make the mistake I did. Take the time to learn the exact tool versions the company uses, because it could be the difference between an offer and a rejection.