90% of developers now use AI tools at work. 80% say it makes them more productive. But when you look at organizational delivery metrics across health insurance companies, the numbers tell a different story. Individual engineers are faster. Teams are not shipping faster. Platforms are not more stable. Release cycles have not shortened.
This is the AI productivity paradox, and health insurance engineering teams are feeling it more than most.
We have spent the last two years helping insurance engineering teams close this gap at Wednesday Solutions. What follows is everything we have learned about where AI actually moves the needle across the software development lifecycle, and where it does not.
Why Health Insurance Engineering Teams Have It Harder
Health insurance is not a typical software environment. The engineering challenges are specific and compounding.
Your claims processing system was probably built 15 to 20 years ago. It still runs the business. Every new feature has to work with that system, not replace it. Your team spends more time navigating legacy constraints than writing new code.
You have seasonal pressure that most industries do not. Open enrollment periods and policy renewal windows create hard deadlines where the platform must perform. There is no "we will ship it next sprint." If the system crashes during peak season, revenue stops.
Your data volumes are massive. Millions of claims records, policy documents, provider networks. Every query, every API call, every report touches data that has to be accurate because people's healthcare depends on it.
And your engineering team is probably understaffed relative to the complexity they manage. Hiring is slow. Attrition is real. You are trying to do more with the same headcount.
This is exactly the environment where AI should help the most. But only if you adopt it the right way.
The One Thing That Determines Whether AI Works for Your Team
We worked with a $3 billion insurer whose core sales platform was down for four hours a day during peak season. Before we touched a single line of code, we needed to understand something: did this team have their processes written down?
This is the single biggest predictor of whether AI adoption will succeed in your engineering org.
When your processes are codified, meaning you have a written rubric for what good looks like in code reviews, testing, deployments, and documentation, AI can automate and accelerate those processes immediately. When your processes live in one senior engineer's head and everyone else follows instinct, AI amplifies the chaos.
The DORA 2025 report studied roughly 5,000 technology professionals and concluded the same thing: AI is an amplifier. Strong engineering practices plus AI equals multiplied gains. Weak engineering practices plus AI equals amplified chaos. Same tool. Same company. Completely different outcomes.
At a large insurance company we work with, the Head of Digital Technology recently introduced AI tools to his engineering team by framing it as a learning session, not a mandate. The message to the team was clear: this is about exploration, guardrails, and thoughtful experimentation. That framing matters. Rolling out AI as a top-down mandate to a team that does not have codified practices is a recipe for frustration.
So before you buy another AI license, ask yourself: if your best engineer left tomorrow, could the rest of the team maintain the same quality of output? If the answer is no, you have a process problem, not a tools problem. Fix that first.
Where AI Actually Moves the Needle in Health Insurance Engineering
Let us walk through each phase of the SDLC and show you exactly where AI creates real speed gains, with examples specific to health insurance.
Planning and Requirements
This is where most teams lose time in the least obvious way. A product manager writes requirements. An engineer reads them. They interpret them differently. Two weeks later, the wrong thing gets built.
AI does not replace planning. But it compresses the translation gap. The key is building what we call agent skills: structured knowledge packs that teach your AI tools how your specific system works. Architecture decisions, business rules, data models, coding conventions, deployment constraints. Everything a senior engineer would know after 6 months on the team, packaged so that every AI tool on your team operates with the same shared understanding.
For a health insurance platform, this means the AI understands that a claims adjudication workflow has specific sequencing requirements, that certain fields are mandatory for regulatory submissions, and that the provider network data refreshes on a specific cadence. The AI does not guess. It works from the same context your best engineer would have.
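To make the idea concrete, here is a minimal sketch of what such a knowledge pack could look like. Every name, rule, and field below is hypothetical and illustrative, not a real schema from any client system:

```python
# Hypothetical agent skill: structured system knowledge that an AI tool
# loads as context before generating or reviewing any code.
CLAIMS_PLATFORM_SKILL = {
    "architecture": "Adjudication runs as a sequential pipeline: "
                    "eligibility -> pricing -> benefits -> payment.",
    "business_rules": [
        "Eligibility must be verified before pricing; never reorder steps.",
        "Regulatory submissions require member_id, claim_id, and date_of_service.",
    ],
    "data_cadence": {
        "provider_network_refresh": "nightly at 02:00 UTC",
    },
    "conventions": {
        "error_handling": "Adjudication steps raise ClaimError subclasses.",
        "testing": "Every new rule ships with a table-driven test case.",
    },
}

def build_context(task_description: str) -> str:
    """Assemble the prompt context an AI tool would receive for a task."""
    rules = "\n".join(f"- {r}" for r in CLAIMS_PLATFORM_SKILL["business_rules"])
    return (
        f"System: {CLAIMS_PLATFORM_SKILL['architecture']}\n"
        f"Rules:\n{rules}\n"
        f"Task: {task_description}"
    )
```

The point is not the format. It is that every AI interaction starts from the same documented context instead of whatever the individual engineer happens to remember.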
Code Generation and Architecture
This is where the biggest speed gains happen, but also where the biggest mistakes get made.
Naive AI adoption means an engineer opens Copilot and starts autocompleting code. It works for boilerplate. It falls apart for anything that requires understanding your specific system. DORA 2025 found that AI generates boilerplate 2 to 4 times faster. That is real. But boilerplate is not what slows down health insurance engineering teams. What slows you down is the complexity of integrating with legacy systems, handling edge cases in claims processing, and making sure new code does not break a 15-year-old workflow.
Structured AI adoption looks different. You build agent skills that teach the AI how your system works, then decompose tasks sequentially: explain the function, identify edge cases, write tests, then refactor. You break problems down step by step so the AI maps the logical flow before writing a single line of code. You feed it your team's actual testing patterns so the output is consistent with your codebase, not generic.
We did this for a legacy codebase modernization project. 1,113 directories. 2,355 files across 6 projects. Legacy C code. Zero documentation. SOAP APIs. The first attempt at using AI naively with Copilot produced partial results. The second attempt, with structured AI powered by agent skills that understood the full codebase, delivered in two weeks. Same tool. Different approach. That is the gap between AI adoption and AI leverage.
Code Review
This is the silent bottleneck in most health insurance engineering teams.
Your senior engineers spend 30 to 40% of their time reviewing pull requests. They are the only ones who understand the legacy systems well enough to catch integration issues. They become the bottleneck. Everything waits for their review.
AI-assisted code review does not replace your senior engineers. It handles the first pass. Automated review tools catch issues that humans miss due to volume: inconsistent naming, missing error handling, security vulnerabilities, style violations. They recommend fixes that engineers can accept with a single click. Engineers self-correct before the human review even starts.
The result: your senior engineers spend their review time on architecture decisions and business logic correctness, not on catching missing null checks. The review cycle compresses from days to hours.
At Wednesday Solutions, first-pass pull request reviews are now handled by automated AI tools. It reduces the back-and-forth significantly and lets our senior engineers focus on the decisions that actually matter.
Testing
This is where health insurance teams leave the most time on the table.
Manual testing for a claims processing platform is exhausting. The number of edge cases is enormous. Policy types, provider combinations, co-pay structures, deductible calculations, out-of-network scenarios. Writing test cases for all of this manually takes weeks. Running them takes days. And tests break constantly because the data changes.
AI-automated testing changes this completely. Vision-based testing tools take screenshots of your application, evaluate what the user is trying to do, and remove the flakiness that plagues traditional end-to-end tests. No engineer needs to write the end-to-end tests manually. API testing tools sit at the network layer, sniff all API requests, and automatically generate API tests using AI. Both approaches completely automate the test pipeline.
For a health insurance company processing millions of claims, this means your test coverage goes from "we test the happy path and hope for the best" to "we test every edge case, every time, automatically." That is where the 75% reduction in bugs comes from. Not from writing better code. From testing everything.
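The edge-case explosion the paragraphs above describe is combinatorial, which is exactly why it resists manual test writing. A sketch of enumerating the scenario matrix, with purely illustrative dimensions rather than a real claims schema:

```python
from itertools import product

# Illustrative claim-scenario dimensions; a real platform would have
# many more, which is why hand-writing every case takes weeks.
DIMENSIONS = {
    "policy_type": ["HMO", "PPO", "EPO"],
    "network": ["in", "out"],
    "deductible_met": [True, False],
}

def scenario_matrix() -> list[dict]:
    """Every combination of dimension values becomes one test scenario."""
    keys = list(DIMENSIONS)
    return [dict(zip(keys, combo)) for combo in product(*DIMENSIONS.values())]
```

Three dimensions already yield 12 scenarios; add co-pay structures and out-of-network rules and the matrix grows into the hundreds, which is where generated tests pay off.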
Deployment and Operations
Health insurance platforms cannot afford downtime. When your sales platform goes down during open enrollment, you lose revenue by the minute.
We stabilized an insurance platform that was down for four hours a day during peak season. Phase 1 was a 3-week stabilization sprint that brought the platform to zero downtime. Revenue was protected immediately. Phase 2 was a year-long rebuild into a modular architecture, including replacing a $500,000 per year PDF processing setup with open-source alternatives, all engineered to work with a 40-year-old technology ecosystem.
AI plays a role here in monitoring, alerting, and incident response. When your deployment pipeline is AI-assisted, you catch failures faster, roll back faster, and recover faster. The DORA 2025 report identifies recovery time as one of the five key performance metrics. Top-performing teams recover in under one hour. Most health insurance teams take days.
Not Every Team Is Ready for AI
This is the part most AI consultants will not tell you. Some of your teams should not adopt AI yet.
The DORA 2025 report identified seven team archetypes. Two of them are directly relevant to health insurance engineering.
The first is what we call the "Smart People Trapped in Old Systems" archetype. Talented engineers. Legacy systems. Everything takes too long. This describes most health insurance engineering teams. AI can help here, but slowly. You need to decouple from legacy constraints first. If your engineers are spending 70% of their time fighting the system and 30% writing code, AI will make the 30% faster. That is a 15% improvement, not a 3x improvement. You need to fix the 70% first.
The second is the "Quality-First, Sometimes Too Slow" archetype. Calm. Reliable. Understaffed. These teams often see the fastest AI gains because their bottleneck is not bad practices, it is not enough hands. AI-assisted testing, code review, and documentation removes the speed tax from quality. We have seen these teams go from quarterly releases to monthly within 90 days of structured AI adoption.
Almost 40% of engineering teams fall into the top two performing archetypes. That means they are already positioned to compound their gains with AI. But you need to know which archetype each of your teams falls into before you roll out AI adoption. A one-size-fits-all rollout fails.
What 3x Actually Looks Like
Let us be specific about what "3x faster" means in practice for a health insurance engineering team.
It does not mean your engineers type code 3 times faster. It means:
Your release cycle compresses. If you are shipping quarterly, you move to monthly. If you are shipping monthly, you move to bi-weekly. This happens because testing is automated, code review is faster, and deployment pipelines are AI-assisted.
Your bug rate drops. We have seen 75% fewer production bugs in teams that adopt AI-automated testing. Every bug that does not make it to production is a support ticket that does not get filed, an investigation that does not happen, and a hotfix that does not disrupt the next sprint.
Your senior engineers get unblocked. When code reviews and testing no longer bottleneck on your three most experienced engineers, those engineers start working on the platform modernization, the claims processing optimization, and the provider network improvements that have been sitting in the backlog for two years.
Your recovery time shrinks. When something does break, AI-assisted monitoring catches it in minutes, not hours. Automated rollback gets you back online before your users notice.
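The rollback logic behind that claim is simple to sketch. Here `healthy` and `rollback` are hypothetical hooks into your monitoring and deployment tooling, and the thresholds are illustrative:

```python
import time

# Sketch of an automated-rollback watch loop: poll a health check after
# a deploy and roll back on consecutive failures, no human in the loop.
def watch_deployment(healthy, rollback, checks: int = 5,
                     interval: float = 1.0) -> bool:
    """Return True if the deploy stays healthy, False after a rollback."""
    failures = 0
    for _ in range(checks):
        if healthy():
            failures = 0
        else:
            failures += 1
        if failures >= 2:  # two consecutive failed checks trigger rollback
            rollback()
            return False
        time.sleep(interval)
    return True
```

The value is in the seconds-scale loop: a machine notices and reverts the bad deploy faster than a human could open the dashboard.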
The DORA data backs this up. Top-performing engineering teams (the top 16%) deploy on-demand, multiple times per day. The top 9% have lead times under one hour. These are not startup numbers. These are achievable at enterprise scale with the right practices and the right tools.
Seven Steps to Get Started
You do not need to transform your entire engineering org overnight. Start contained. Here is the sequence that works.
Step 1: Audit Your Processes
Before you touch any AI tools, write down your current processes for code review, testing, deployment, and documentation. If they are not written down, they do not exist in a form that AI can amplify. This is the prerequisite. Skip it and everything else fails.
Step 2: Pick One Team
Choose the team that fits the "Quality-First, Sometimes Too Slow" archetype. Good practices. Not enough people. This team will see results fastest and become your internal case study.
Step 3: Give AI Full Context
Build agent skills for your codebase. Architecture decisions, business rules, data models, constraints. Package everything a senior engineer knows into structured knowledge that your AI tools can access. Without this, you get generic output that needs heavy editing. With this, you get output that fits your system.
Step 4: Automate Code Review First
This is the lowest-risk, highest-impact starting point. Automated first-pass reviews do not change your code. They do not touch production. They just catch issues earlier and free up your senior engineers. Every team we have worked with sees immediate time savings here.
Step 5: Automate Testing Next
Start with API tests. They are the easiest to automate and have the highest coverage impact. Then move to end-to-end tests. Within one sprint, you should have significantly higher test coverage than you had before.
Step 6: Measure What Matters
Track the five DORA metrics: deployment frequency, lead time for changes, change fail rate, recovery time, and rework rate. If these are moving in the right direction, your AI adoption is working. If they are flat, something in your process needs to change before AI can help.
Step 7: Expand to the Next Team
Once your pilot team has results, use them as the proof point for the next team. CTO-to-CTO references work in the market. Team-to-team references work inside your org. Let the results do the selling.
The Question You Should Be Asking
The question is not "should my health insurance engineering team use AI?" That ship has sailed. 90% of your engineers are already using it individually.
The question is: how do you close the gap between individual AI adoption and organizational speed?
That gap is where the 3x lives. Not in the tools. In the way the tools connect to your processes, your architecture, and your team structure.
At Wednesday Solutions, we have helped insurance engineering teams close this gap by starting with a contained engagement: stabilize a platform, modernize a pipeline, automate a testing process. Not an org-wide transformation. A contained problem with measurable results that becomes the proof point for everything that follows. We have a 4.8/5.0 rating on Clutch across 23 reviews, with insurance and financial services companies among our longest-running engagements.
The 3x is real. But it starts with getting the foundations right.
Frequently Asked Questions
How long does it take for a health insurance engineering team to see results from AI adoption?
Teams with codified processes (documented code review standards, testing rubrics, deployment checklists) typically see measurable improvements within one sprint of structured AI adoption. The first wins usually come from automated code review and testing. Teams without codified processes need 4 to 6 weeks to document their standards first. Skipping that step is the most common reason AI adoption fails.
What is the biggest mistake health insurance companies make when adopting AI in engineering?
Rolling it out to the entire org at once. AI adoption works when you start with one team that has good engineering practices but not enough people. That team gets results, becomes the internal proof point, and the approach spreads organically. A top-down mandate to 200 engineers without a pilot team almost always produces frustration and abandoned tools.
Does AI-generated code create security and compliance risks for health insurance companies?
It can, if you adopt AI naively. Generic AI tools produce generic code that does not account for your data handling requirements or security policies. Structured adoption with agent skills means the AI operates within your constraints from the start. Automated code review catches security issues before they reach production. The net effect is usually fewer security gaps, not more, because automated review is more consistent than manual review across high volumes.
What are agent skills and why do they matter for health insurance engineering?
Agent skills are structured knowledge packs that teach AI tools how your specific system works. They contain your architecture decisions, business rules, data models, coding conventions, and constraints. Without them, AI produces generic output that needs heavy editing. With them, every AI tool on your team operates with the same context your best senior engineer has. For health insurance, this means the AI understands claims processing workflows, regulatory data requirements, and provider network structures before it writes a single line of code.
Can AI help with legacy system modernization in health insurance?
Yes, and this is one of the highest-impact use cases. We modernized a legacy codebase with 1,113 directories, 2,355 files, legacy C code, zero documentation, and SOAP APIs. Naive AI produced partial results. Structured AI with agent skills that understood the full codebase delivered in two weeks. The key is giving AI the context it needs to understand the legacy system, not just asking it to write new code.
What DORA metrics should health insurance engineering leaders track to measure AI impact?
The five that matter: deployment frequency, lead time for changes, change fail rate, recovery time, and rework rate. Top-performing teams deploy on-demand multiple times per day with lead times under one hour. Most health insurance teams start far below that. Track these before and after AI adoption. If they are moving in the right direction, your approach is working. If they are flat, the problem is in your processes or team structure, not your tools.
How much does it cost to implement AI across a health insurance engineering team?
The tools themselves are relatively inexpensive. AI coding assistants, automated review tools, and testing platforms combined usually cost less per month than a single contractor. The real investment is in building agent skills for your codebase (typically 1 to 2 weeks of senior engineer time) and running the pilot with one team (one to two sprints). The ROI comes from compressed release cycles, fewer production bugs, and unblocked senior engineers. Most teams see payback within 90 days.
Should we build our own AI tools or use off-the-shelf solutions?
Buy. This is not your core business. Health insurance companies should be building claims processing, provider networks, and member experiences. AI tooling is commodity infrastructure. Use off-the-shelf coding assistants, automated review tools, and testing platforms. The competitive advantage comes from the agent skills you build on top of them, which are specific to your system and your business rules. That is where your investment should go.