DEV Community

Mohammed Ali Chherawalla

How Do You Get 3x Engineering Velocity at a Health Insurance Company Using AI?

The request sounds simple when it lands on your desk. The CEO wants the engineering team to move faster. The board has seen what AI can do. Competitors are shipping features that took you 6 months in what looks like 6 weeks. The ask is some version of "get us 3x faster using AI."

The problem is that "3x faster" is not one thing. It is a dozen different things that compound. And if you get the sequence wrong, you end up with engineers who have shiny AI tools and an organization that ships at exactly the same speed.

We have helped health insurance engineering teams at Wednesday Solutions work through this exact problem. Here is what 3x actually means when you break it down, and the order in which the gains stack.

Why "3x Faster" Is the Wrong Frame

When your CEO says "3x faster," they mean one of three things. Usually all three.

Ship features faster. A feature that takes 3 months should take 1 month.

Ship more features. The team that ships 4 features per quarter should ship 12.

Ship with fewer problems. The bugs, the outages, the rollbacks that eat up engineering time should drop significantly.

These are related but they are not the same. And they do not all respond to the same interventions. A team that writes code 3x faster but still waits 5 days for code review is not 3x faster. A team that deploys 3x more often but has 3x more production bugs is not better off. A team that automates testing but still spends 2 weeks planning each release has not solved the problem.

3x engineering velocity means the entire system moves 3x faster. From the moment a feature enters development to the moment a user is using it in production. Every phase in between has to speed up, or the slowest phase becomes the new bottleneck.

This is why the tool-first approach fails. Buying Copilot licenses makes the coding phase faster. But if coding was only 20% of the total cycle time, doubling its speed shaves just 10% off the whole cycle: roughly a 1.1x improvement overall. Not 3x.
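The arithmetic here is just Amdahl's law. A quick sketch, using the illustrative cycle-time fractions this article works with:

```python
# Overall speedup when only one phase of the cycle is accelerated
# (Amdahl's law). The fractions are illustrative, not measured.

def overall_speedup(phase_fraction: float, phase_speedup: float) -> float:
    """Speedup of the whole cycle when a phase taking `phase_fraction`
    of total time is made `phase_speedup` times faster."""
    new_total = (1 - phase_fraction) + phase_fraction / phase_speedup
    return 1 / new_total

# Coding is ~20% of cycle time; an AI assistant doubles coding speed.
print(round(overall_speedup(0.20, 2.0), 2))  # -> 1.11
```

Even making the coding phase infinitely fast would cap the overall gain at 1.25x. The only way to 3x is compressing every phase.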

Where the Time Actually Goes

Before you can get 3x faster, you need to know where the time is going. In most health insurance engineering teams we have worked with, the breakdown looks roughly like this:

Understanding the system and planning: 20 to 25% of total cycle time. Engineers reading code, tracing data flows, asking senior engineers questions, understanding how the new feature interacts with the existing claims processing logic.

Writing code: 15 to 20% of total cycle time. The part everyone thinks about. The part that AI coding assistants address. And one of the smallest chunks.

Code review: 15 to 20% of total cycle time. Mostly waiting. The code is done. It sits in a queue because the 3 people who can review it are busy. When they do review, half the feedback is mechanical issues that could have been caught automatically.

Testing: 20 to 25% of total cycle time. Manual test case creation. Manual test execution. Manual regression testing before each release. Debugging flaky tests. Maintaining test data. For health insurance platforms with thousands of edge cases in claims processing, this is often the single biggest time sink.

Deployment and stabilization: 10 to 15% of total cycle time. Scheduling deployment windows. Running through deployment checklists. Monitoring after deployment. Fixing things that break. Rolling back when necessary.

When you see it broken down like this, the path to 3x becomes clear. You cannot get there by only making the coding phase faster. You have to compress every phase. And you have to start with the phases that consume the most time relative to how much AI can compress them.
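To see how per-phase compression compounds, here is a back-of-the-envelope calculation using fractions consistent with the breakdown above and illustrative, assumed (not measured) per-phase speedups:

```python
# Sketch: overall speedup when every phase compresses. Fractions sum
# to 1.0 and fall within the ranges above; the per-phase speedups are
# assumptions for illustration only.

phases = {                 # (fraction of cycle, assumed speedup)
    "planning":   (0.25, 3),
    "coding":     (0.15, 2),
    "review":     (0.20, 6),   # e.g. 3-5 days -> 4-8 hours
    "testing":    (0.25, 10),  # e.g. 2-3 weeks -> hours
    "deployment": (0.15, 3),
}

new_total = sum(frac / speedup for frac, speedup in phases.values())
print(round(1 / new_total, 2))  # -> 3.75: past 3x once every phase moves
```

No single phase in this sketch needs a miracle. The 3x comes from every phase moving together.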

The Velocity Stack: Where to Start and in What Order

Layer 1: Automated Code Review (Weeks 1-2)

This is where every health insurance engineering team should start. Not because it produces the biggest gain, but because it produces the fastest gain with the lowest risk.

Automated review tools do not touch production. They do not change your code. They sit between your engineer's pull request and your senior engineer's review. They catch the mechanical issues: inconsistent naming, missing error handling, security vulnerabilities, style violations, common bugs. Engineers fix these before the human review starts.

The immediate effect: your senior engineers stop spending time on mechanical feedback. Review cycle time drops 50%. Pull requests that sat in a queue for 3 to 5 days now get through in 4 to 8 hours. Your senior engineers get 15 to 20% of their week back.

The compounding effect: those senior engineers are now available to work on the complex features, the architecture decisions, and the legacy modernization that has been sitting in the backlog. Their freed-up time is worth more than the review speed improvement itself.

For a health insurance engineering team, this means the 3 people who understand the claims processing system are no longer the bottleneck on every pull request. They still review. They review the things that matter: business logic correctness, architectural implications, regulatory considerations. The mechanical stuff is handled before it reaches them.

Layer 2: Automated Testing (Weeks 2-6)

This is where the biggest single gain lives for health insurance teams.

Manual testing of a claims processing platform is a nightmare. The combinatorial explosion of policy types, plan tiers, provider networks, co-pay structures, deductible calculations, and coordination of benefits means comprehensive testing is practically impossible by hand. Teams test the happy path and the top 20 edge cases. Everything else is a gamble.

AI-automated testing compresses this in two ways.

API-level testing tools sit at the network layer, capture real traffic patterns, and generate tests automatically. They cover scenarios that no human would have written test cases for because they test based on observed behavior, not hypothesized behavior.
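As a sketch of the idea, not any particular tool's API, captured traffic can be turned into regression cases where observed behavior becomes the expected behavior:

```python
# Sketch: turning one captured API request/response into a replayable
# regression test case. The record format is an assumption for
# illustration; real tools capture at the network layer and
# generalize across many observed requests.

def to_test_case(record: dict) -> dict:
    """Build a replayable regression test from one observed request."""
    return {
        "name": f"replay {record['method']} {record['path']}",
        "request": {k: record[k] for k in ("method", "path", "body")},
        # Observed behavior becomes the expected behavior.
        "expect": {"status": record["status"], "body": record["response"]},
    }

observed = {
    "method": "POST", "path": "/claims/adjudicate",
    "body": {"member_id": "M123", "claim_cents": 12500},
    "status": 200, "response": {"decision": "approved", "paid_cents": 10000},
}
print(to_test_case(observed)["name"])  # -> replay POST /claims/adjudicate
```

The point of the pattern: coverage grows with real traffic, so the suite includes the edge cases your members actually hit, not just the ones someone thought to write down.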

End-to-end testing tools use vision-based approaches to validate what users see. They do not break when a CSS class changes or a button moves. They test the experience, not the implementation. Flaky tests, which consume enormous maintenance time, effectively disappear.

The immediate effect: regression testing that took 2 to 3 weeks runs in hours. Test coverage expands from "the top 50 scenarios" to "every scenario the system has processed." Production bugs drop 60 to 75%.

The compounding effect: when your regression suite runs in hours, you no longer need to batch releases. The entire rationale for quarterly releases collapses. You can test and release individual features. This unlocks everything downstream.

Layer 3: Agent Skills (Weeks 4-8)

This is the investment that makes every other layer compound faster.

Agent skills are structured knowledge packs that teach AI tools how your specific system works. They contain your architecture decisions, business rules, data models, coding conventions, and constraints. Everything a senior engineer would know after 6 months on the team, packaged so that every AI tool operates with system-specific context.

For a health insurance platform, agent skills capture: how the claims adjudication pipeline works, the sequencing requirements for different claim types, the provider network data model, the co-pay and deductible calculation logic across plan tiers, the legacy system integration points, and the constraints that every engineer needs to know but nobody has written down in one place.
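There is no single standard format for agent skills, and formats vary by tool. As a purely hypothetical sketch, one skill entry might be structured like this (every field name and value here is an assumption for illustration):

```python
# Hypothetical agent-skill entry as structured data. Real skill packs
# are often markdown files with frontmatter; the content below is
# invented to show the shape, not taken from any real system.

claims_adjudication_skill = {
    "name": "claims-adjudication-pipeline",
    "summary": "How claims move from intake to a payment decision.",
    "rules": [
        "Eligibility check must run before benefit calculation.",
        "Coordination of benefits applies when a member has 2+ plans.",
        "Deductible accumulators are per plan year, per member.",
    ],
    "conventions": [
        "All money amounts are integer cents, never floats.",
    ],
    "integration_points": ["legacy-eligibility-service", "provider-network-db"],
}
```

The value is not the format. It is that the sequencing rules, calculation constraints, and integration points live in one machine-readable place instead of in three senior engineers' heads.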

The immediate effect on code generation: engineers stop spending 2 to 3 days understanding the system before each feature. AI-assisted development produces code that follows your team's conventions, handles your system's edge cases, and integrates correctly with your existing architecture. Development time per feature drops 50 to 60%.

The immediate effect on onboarding: a new engineer with access to agent skills contributes meaningfully in their first week instead of their first month. For a health insurance team that struggles with hiring and retention, this is transformative.

The compounding effect: agent skills make every other AI tool better. Code review is more accurate because the AI reviewer understands your system. Test generation is more relevant because the AI tester knows your edge cases. Code generation is more consistent because every engineer works from the same context. The gains from Layers 1 and 2 multiply.

Layer 4: AI-Assisted Deployment and Monitoring (Weeks 6-12)

Once your code is being reviewed faster, tested more thoroughly, and written with system-specific context, the next bottleneck is deployment.

Health insurance platforms cannot afford downtime. This makes engineering teams conservative about deployments. Releases are scheduled in advance, deployed during off-hours, and monitored manually. Every deployment carries anxiety because the blast radius of failure is large.

AI-assisted deployment changes the risk calculus. Monitoring tools learn the normal behavior patterns of your system. They detect anomalies that static thresholds miss. Automated rollback triggers when error rates spike. Recovery time drops from hours to minutes.
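A minimal sketch of the rollback-trigger idea, assuming a simple rolling-average check against a baseline; real platforms (Argo Rollouts, for example) run far richer automated analysis:

```python
# Sketch: fire a rollback when the recent average error rate spikes
# well above baseline. Thresholds and window size are assumptions.

from collections import deque

class RollbackGuard:
    def __init__(self, baseline_error_rate: float,
                 spike_factor: float = 3.0, window: int = 5):
        self.baseline = baseline_error_rate
        self.spike_factor = spike_factor
        self.samples = deque(maxlen=window)

    def observe(self, error_rate: float) -> bool:
        """Record a sample; return True when rollback should fire."""
        self.samples.append(error_rate)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data to judge yet
        avg = sum(self.samples) / len(self.samples)
        return avg > self.baseline * self.spike_factor

guard = RollbackGuard(baseline_error_rate=0.01)
for rate in [0.01, 0.02, 0.05, 0.06, 0.07]:
    if guard.observe(rate):
        print("rollback triggered")  # a real system calls its deploy API here
```

Averaging over a window instead of reacting to a single sample is what keeps the trigger from flapping on transient blips while still firing within minutes of a genuine regression.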

We stabilized a health insurance platform that was crashing 4 hours daily during peak season. Three weeks to zero downtime. Once the platform was stable, the team moved from scheduled monthly deployments to on-demand deployments. Each deployment was small, well-tested, and monitored in real time. The fear went away because the safety net was reliable.

The immediate effect: deployment frequency increases because each deployment is smaller and the team trusts the monitoring and rollback systems. Features reach production days after completion instead of weeks or months.

The compounding effect: faster feedback loops. When a feature goes live days after it is built instead of months, the team learns whether it works immediately. Rework drops because problems are caught while the context is still fresh. The engineer who built the feature is still thinking about it when the feedback comes in.

Layer 5: Process Codification and Continuous Improvement (Ongoing)

This is not a one-time step. It is the practice that makes 3x sustainable instead of temporary.

The DORA 2025 report, based on a survey of roughly 5,000 technology professionals, found that AI amplifies what is already there. If your processes are strong, AI multiplies the gains. If your processes are weak, AI amplifies the chaos.

This means your processes need to evolve alongside your AI adoption. As automated testing reveals new edge cases, document them. As code review patterns emerge, update the standards. As deployment failures happen (they will, rarely), conduct blameless postmortems and update the agent skills.

The teams that sustain 3x velocity are the teams that treat their processes as living documents, not static artifacts. They update their agent skills monthly. They review their automated test coverage quarterly. They measure the five DORA metrics (deployment frequency, lead time for changes, change fail rate, recovery time, rework rate) and course-correct when something slips.
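Baselining these metrics does not require special tooling. A sketch of computing three of them from deploy records, with hypothetical field names:

```python
# Sketch: deployment frequency, median lead time for changes, and
# change fail rate from a list of deploy records. The record fields
# are assumptions; pull the real data from your CI/CD system.

from datetime import datetime

deploys = [
    {"committed": datetime(2025, 1, 1), "deployed": datetime(2025, 1, 3), "failed": False},
    {"committed": datetime(2025, 1, 5), "deployed": datetime(2025, 1, 6), "failed": True},
    {"committed": datetime(2025, 1, 8), "deployed": datetime(2025, 1, 9), "failed": False},
]

period_days = 30
deploy_frequency = len(deploys) / period_days                        # deploys/day
lead_times = [(d["deployed"] - d["committed"]).days for d in deploys]
median_lead_time = sorted(lead_times)[len(lead_times) // 2]          # days
change_fail_rate = sum(d["failed"] for d in deploys) / len(deploys)

print(deploy_frequency, median_lead_time, change_fail_rate)
```

Run the same script monthly against fresh records and the trend lines tell you whether the stack is still compounding or a phase has silted up.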

What 3x Looks Like in Numbers

Across the health insurance engineering teams we have worked with, here is what the velocity stack typically produces over 6 months:

Development cycle time per feature: from 3 to 5 weeks down to 1 to 1.5 weeks. Driven by agent skills eliminating context loading and AI-assisted code generation.

Code review cycle time: from 3 to 5 days down to 4 to 8 hours. Driven by automated first-pass review.

Regression testing time: from 2 to 3 weeks down to 2 to 4 hours. Driven by AI-automated testing.

Deployment frequency: from quarterly to weekly or every two weeks. Driven by smaller releases, better testing, and AI monitoring.

Production bug rate: down 60 to 75%. Driven by comprehensive automated testing.

Recovery time: from hours to under 15 minutes. Driven by AI anomaly detection and automated rollback.

Engineer onboarding time: from 2 to 3 months to 1 to 2 weeks for meaningful contribution. Driven by agent skills.

Total cycle time from "feature starts" to "feature in production": from 8 to 12 weeks down to 2 to 3 weeks. That is the 3x.

The Two Things That Can Kill Your 3x

Killer 1: Adopting AI Without Codified Processes

If your code review standards live in one senior engineer's head, automated review cannot replicate them. If your testing strategy is "whatever the engineer thinks is important," automated testing cannot generate meaningful tests. If your deployment process is tribal knowledge, AI monitoring cannot know what normal looks like.

The prerequisite for every layer in the velocity stack is written-down, agreed-upon processes. Not heavy documentation. Not 50-page runbooks. A clear, concise rubric for what good looks like at each phase. One page per phase is enough. But it has to exist.

Killer 2: Rolling Out to Everyone at Once

The second most common failure mode is trying to move 200 engineers to AI-assisted development simultaneously. Different teams have different maturity levels. Different teams have different codebases. What works for the claims processing team may not work for the member portal team.

The DORA 2025 report identified seven distinct team archetypes. The team that is understaffed but has strong practices (archetype 4) will see immediate gains. The team that is battling organizational friction (archetype 3) will not, because their bottleneck is governance and process, not engineering speed. The team with low morale and broken systems (archetype 1) should not adopt AI at all until the fundamentals are fixed.

Start with one team. The disciplined, understaffed team. Get them to 3x. Document what worked. Use them as the internal case study. Then expand to the next team with the same approach, adjusted for that team's specific bottlenecks.

The Conversation With Your CEO

When your CEO asks for 3x, here is what you can tell them.

3x is achievable within 6 months for a pilot team. The investment is modest: AI tooling costs less per month than a single contractor. The real investment is 2 to 4 weeks of senior engineer time building agent skills and the discipline to roll out in phases.

The returns are measurable. Track the five DORA metrics before and after. You will see deployment frequency increase, lead times shrink, change fail rates drop, recovery times compress, and rework rates decline. These are not subjective improvements. They are numbers on a dashboard.

The risk is low if you sequence correctly. Start with automated code review, which does not touch production. Then automated testing. Then agent skills. Then deployment optimization. Each layer proves the previous one before adding complexity.

At Wednesday Solutions, we help health insurance engineering teams build this velocity stack. We start with a contained engagement: one team, one layer, measurable results. The proof point expands the conversation. We have a 4.8/5.0 rating on Clutch across 23 reviews, with insurance and financial services companies among our longest-running partnerships, because the results compound and the relationship deepens.

3x is not a slogan. It is a stack. Build it in the right order and the math works.


Frequently Asked Questions

What does 3x engineering velocity actually mean for a health insurance company?

It means the total time from when a feature enters development to when it is live in production drops to one-third of what it currently takes. For most health insurance teams, that means going from 8 to 12 week cycles down to 2 to 3 weeks. It is not about typing code faster. It is about compressing every phase: context loading, code review, testing, deployment, and stabilization.

Where should a health insurance engineering team start to increase velocity with AI?

Automated code review. It is the lowest-risk, fastest-return starting point. It does not touch production code. It catches mechanical issues before human review, freeing senior engineers to focus on architecture and business logic. Most teams see review cycle time drop 50% within the first sprint. That freed-up senior engineer time is worth more than the speed improvement itself.

How much of engineering cycle time is actually spent writing code at health insurance companies?

Typically 15 to 20%. The rest is context loading (20-25%), code review (15-20%), testing (20-25%), and deployment and stabilization (10-15%). This is why AI coding assistants alone do not produce 3x gains. They accelerate the smallest phase. To get 3x, you need AI across every phase, starting with the ones that consume the most time.

What are agent skills and how do they contribute to 3x velocity?

Agent skills are structured knowledge packs that teach AI tools how your specific health insurance platform works. They contain business rules, architecture decisions, data models, and constraints. They eliminate the 2 to 3 days engineers spend understanding the system before each feature, improve the relevance of AI-generated code, make automated code review more accurate, and reduce new engineer onboarding from months to weeks. They are the multiplier that makes every other AI tool more effective.

Can you get 3x velocity without changing your release process?

No. If you are on quarterly releases, even perfect AI adoption across code generation, review, and testing will not get you to 3x. The batching itself creates waste: longer feedback loops, larger blast radii, more rework. The velocity stack requires moving to smaller, more frequent releases. This is less risky, not more, because each release is small enough to understand, test, and roll back quickly.

How long does it take to achieve 3x engineering velocity at a health insurance company?

Six months for a pilot team, following the velocity stack in order. Automated code review in weeks 1 to 2. Automated testing in weeks 2 to 6. Agent skills in weeks 4 to 8. AI-assisted deployment in weeks 6 to 12. Each layer produces measurable gains independently, and the gains compound as layers stack. Expanding to additional teams takes another 3 to 6 months depending on team maturity.

What is the biggest risk of trying to achieve 3x velocity with AI?

Rolling out to the entire engineering org at once. Different teams have different maturity levels and different bottlenecks. A one-size-fits-all AI rollout to 200 engineers produces frustration and abandoned tools. Start with one disciplined team, get measurable results, then use that team as the internal proof point for expanding to others.

How do you measure whether the 3x goal is being achieved?

Track the five DORA performance metrics: deployment frequency, lead time for changes, change fail rate, recovery time, and rework rate. Baseline them before AI adoption. Measure monthly. If deployment frequency is increasing, lead times are shrinking, change fail rate is dropping, recovery time is compressing, and rework rate is declining, you are on track. If any metric is flat, investigate which phase of the SDLC is the current bottleneck.
