Everyone is talking about "vibe coding" and generative AI in the context of software engineering these days - and rightfully so, as the impact on the industry is nothing short of paradigm-shifting. However, there is a misconception that generative AI provides non-technical people with a "build me this" button and that neither IT experts nor complex development processes are really needed anymore. Nothing could be further from the truth, in my opinion - and in this article, I will try to show how AI augmentation (*) really *works in real life*, or at least how it works when it comes to software engineering.
(*) I'll be using the terms AI-augmented and AI-assisted interchangeably in this text. The reason is that although I DO think the term AI-augmented does a (slightly) better job of explaining the concept I'm trying to convey here, it is (at least to me) a mouthful to pronounce, so I'm using the term "AI-assisted" as a "verbally friendly" alternative.
The majority of people in the IT industry that I've discussed this topic with broadly fall into one of two distinct camps (*): one that partially or fully rejects generative AI when it comes to software development, for various reasons that usually boil down to the known propensity of large language models to hallucinate, and the other that embraces it, accepts its quirks and, as a consequence, typically 10x's their productivity.
The author of this text is (no spoilers there) a member of the latter group, and the point of this article is to try to show how generative AI can augment a real-world software engineering process. The target audience is pretty much anyone interested in or curious about vibe coding or AI-aided code generation, but I'm especially keen to reach those who are still on the fence about introducing AI augmentation in their organization or trying it in earnest themselves. After reading this, my hope is that they will be able to make a more informed decision on this important subject.
(*) The division between those groups is not as black and white as depicted here, as there is non-trivial overlap between the two - but generally speaking, they are distinct from each other in their general attitude towards generative AI.
Introduction
Let's start by defining those two main terms from the article's title. The first one is "software engineering process", which we can define as:
A process of designing, building and ultimately delivering a software solution for a given idea / problem.
That leaves the term "AI augmentation", which, in the context of this text, can be defined as:
Enhancement of a particular process via integration of AI agents/assistants with the goal of significantly increasing its delivery speed while decreasing its resource requirements.
Simple, right? So now, with that sorted out, let's join those terms together and see how AI augmentation of the software engineering process would work IRL, and what it is all about.
AI augmentation
If we take a high-level look at the software engineering process, we can observe that:
- A number of process-specific artifacts are created throughout it.
- The creation of those artifacts requires expert resources.
- The majority of the project's overall time and resource budget is spent on creating those artifacts.
All of that points to the artifacts as a very enticing integration point for AI augmentation of the software engineering process - so that is exactly where we'll "plug our AI in".
Such artifacts include, for example, the high-level project concept, documented functional and technical requirements, UX/UI design proposals, testable product increments, the product backlog and its items, individual value-adding iteration goals, test scenarios, end-to-end test scripts - the list goes on.
We can also observe that, besides the fact that the creation of the mentioned artifacts requires experts in various fields (system architects, business analysts/functional designers, UX/UI designers, developers, etc.), it also requires a framework within which those artifacts can be created in a controlled and deterministic fashion, in an environment where all prerequisites for each are satisfied and honored.
For example, in order to create a high-level project concept or a functional or technical solution, a discovery process needs to be performed by the appropriate experts - and only after that can the relevant artifacts be created.
This distinction between the process artifacts and the process framework itself is an important one to bear in mind, because it will guide our AI augmentation and manage our expectations for introducing AI into the overall engineering process. In particular:
The role of AI in the AI augmented software engineering process is to increase delivery speed while decreasing resource requirements for the individual artifacts within the software engineering process and NOT to eliminate the need for experts within the process or to replace the process itself.
After all, it is called "AI augmented" and not "AI replaced".
AI augmented software engineering
Any engineering process can, generally speaking, be broken down into three distinct phases: design, implementation and testing/delivery.
Note that those three phases can overlap and can be performed either for the entire project scope (a.k.a. "waterfall") or in small, incremental "value adding" chunks - which is typically what agile/scrum is all about. The point here is that those three phases are ubiquitous regardless of the methodology used.
In this chapter, we'll go over each of those phases and list the important artifacts produced within it, how they can be AI-augmented, which domain experts are required for each (*), as well as what time and resource savings can be expected from the augmentation itself. We will not go into deep detail, as the goal here is to give the reader a general idea of how AI augmentation works in the real world, not to be a comprehensive or definitive resource on the subject.
(*) The role of the domain experts is the same for every artifact - namely, to Review, Complete and Integrate the work of the AI agent/model (often in multiple iterations). Therefore, whenever a certain domain expert is mentioned in the remainder of the document, it is implied that their role is the one I've just described.
Phase 1: Design
The design process consists of three distinct components (functional, visual and technical), each with its own artifacts which can be significantly augmented with AI, for example:
Artifact: Project concept
- Augmentation technique/approach: The draft of the problem/idea description - including the project context and organizational requirements, known risks and mitigation approaches (a.k.a. the project concept) - can be quickly created by an LLM based on a transcript of the discovery session/interview, meeting notes, etc. Furthermore, the LLM can be given additional instructions to make sure it avoids a certain set of solutions/approaches, or exclusively uses some others, as dictated by your organization's standards and tech stack preferences (a minimal scripting sketch follows this list).
- Required domain expert: Business analyst/Functional designer (or Technical Product Owner)
- Performance increase: In my personal experience, the time and resources required to complete such an artifact come to under one hour, where without AI augmentation it would typically take 2–5 hours.
- Typical tools: any modern AI assistant with RAG capabilities, such as ChatGPT, Anthropic's Claude, DeepSeek V3, etc.
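As an illustration, here is a minimal sketch of how that drafting step could be scripted against an LLM API - the model name, prompt wording, file paths and the tech stack constraint are my own placeholder assumptions, not a prescription:

```typescript
import OpenAI from "openai";
import { readFile, writeFile } from "node:fs/promises";

// Assumes the OPENAI_API_KEY environment variable is set.
const client = new OpenAI();

async function draftProjectConcept(transcriptPath: string): Promise<string> {
  const transcript = await readFile(transcriptPath, "utf-8");

  const response = await client.chat.completions.create({
    model: "gpt-4o", // placeholder - use whatever model your organization has approved
    messages: [
      {
        role: "system",
        content:
          "You are a business analyst. From the discovery session transcript, " +
          "draft a project concept: problem description, project context, " +
          "organizational requirements, known risks and mitigation approaches. " +
          // Organizational constraint baked into the prompt (hypothetical example):
          "Do not propose technologies outside of: TypeScript, PostgreSQL, AWS.",
      },
      { role: "user", content: transcript },
    ],
  });

  return response.choices[0].message.content ?? "";
}

// The draft still goes to a human domain expert for review, completion and integration.
const draft = await draftProjectConcept("discovery-session-transcript.txt");
await writeFile("project-concept-draft.md", draft);
```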
Artifact: Functional solution/requirements
- Augmentation technique/approach: When writing the functional requirements, the AI assistant can provide a first draft of the actors, actor goal lists and associated acceptance criteria, use cases, business rules, etc. (and review it against the CRUD method for functional completeness, or any other verification method your organization uses), based on the concept document and any supplemental information available at the time. That draft is then typically iteratively expanded until it covers the entire functional scope of the system under design. Additionally, the assistant/model can be made to follow your organization's internal standards and, say, produce use cases following a particular style or write acceptance criteria using Gherkin (see the example after this list), etc.
- Required domain expert: Business analyst/Functional designer (or Technical Product Owner)
- Performance increase: In my personal experience, a fully fledged functional solution document, a.k.a. functional requirements (including diagrams where needed), for a mid-sized project (i.e. a backend with some non-trivial business logic and admin UI, some integrations with internal and 3rd party APIs, and a fully responsive web client) that would previously take around 20–30 hours typically takes no more than 5–10 hours with AI augmentation (not counting client reviews and in-document communication in both cases).
- Typical tools: any modern AI assistant with RAG capabilities, such as ChatGPT, Anthropic's Claude, DeepSeek V3, etc.
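For illustration, an acceptance criterion drafted in Gherkin by the assistant might look like this (the feature and steps are invented for the example):

```gherkin
Feature: User login

  Scenario: Successful login with valid credentials
    Given a registered user with email "user@example.com"
    And the user is on the login page
    When the user submits their email and a valid password
    Then the user is redirected to their dashboard
    And a session cookie is issued
```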
Artifact: Technical solution/requirements
- Augmentation technique/approach: When writing the technical solution document (a.k.a. tech requirements), the same applies as described above for the functional solution - i.e. the AI assistant/agent, directed by a tech lead (a.k.a. system architect, etc.), can generate a system architecture and tech stack proposal which is in sync with your organization's internal standards, and that proposal can then be further refined until it is ready for client review.
- Required domain expert: Tech lead (a.k.a. system architect, etc.)
- Performance increase: Roughly the same as for functional requirements
- Typical tools: Same as for functional requirements
Artifact: Flow/sequence/architecture diagrams
- Augmentation technique/approach: When creating diagrams in particular, AI can provide you with an "easy button" if you use code-to-diagram tools such as mermaidJS or similar (and you really should), as LLMs are very capable of generating the diagram code from your use cases, usage scenarios or other comparable formats (see the example after this list).
- Required domain expert: Business analyst, Functional designer or a tech lead (depending on the domain the diagrams are created for)
- Performance increase: A typical set of diagrams required for an average functional or technical requirements document can be generated and verified within 15–30 minutes, whereas the classic "handmade" approach would typically take 1–3 hours (even given expertise in writing MermaidJS code).
- Typical tools: Specialized AI code generation/pair programming tools such as Cursor, Aider or similar, or any modern AI assistant with strong code generation capabilities (at the time of writing this article the good examples of such assistants are Anthropic Claude 3.7, Google's Gemini 2.5 pro, DeepSeek V3/R1, etc.)
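As an example, asking an LLM to turn a login use case into a sequence diagram might yield MermaidJS code along these lines (the actors and flow are invented for illustration):

```mermaid
sequenceDiagram
    participant U as User
    participant W as Web Client
    participant A as Backend API
    U->>W: Submit login form
    W->>A: POST /auth/login (credentials)
    A-->>W: 200 OK + session token
    W-->>U: Redirect to dashboard
```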
Artifact: "Function-less" UX/UI prototype
When it comes to UX/UI design, the usage of classic AI assistants is not as straightforward as it is in the case of functional and technical solutions. Therefore, that part of the design process still relies heavily on UX/UI design experts and their creativity and vision. However, that is true mostly when it comes to original and inventive design rather than "derivative" design.
The term derivative here does not mean sloppy or bad. Think about all the login screens and contact forms you've seen in your life, or CRUD admin listview/editview pairs, or blog/event detailview pages, or even entire run-of-the-mill product homepages - and you'll find that they are all essentially the same thing with different dressings. Each instance is, to a significant degree, a derivation of others of the same class.
This is not to say that a UX/UI designer's work is meaningless or that there is no place for the personal expert touches that make one design different from another. That is exactly the role of the UX/UI designer in AI-augmented UX/UI design: to provide that personal touch, that crucial 20 in the 80/20 rule - in other words, to make a difference with their creativity rather than spend hours and hours drawing yet another login or contact form for all possible screen breakpoints.
- Augmentation technique/approach: In cases where derivative design is sufficient for the project's needs, a practice that works very well is to choose a CSS/UI library (such as materialUI or tailwindCSS, etc.) and then instruct the AI assistant/agent to directly code the UI that you want using that library and following good UX/UI design practices (somewhat similarly to using an AI assistant to create diagrams via mermaidJS). The result is typically quite decent right off the bat and can then be easily tweaked further by a UX/UI expert who knows how to work with CSS (see the sketch after this list). Although this technique blurs the line between design and implementation somewhat, it is very useful and circumvents the need for a lot of "generic" UX/UI design proposal-review-integration iterations.
- Required expert(s): A frontend specialist and a UX/UI specialist, as this artifact typically requires more iterations and tweaks than a simple product increment (see the next section) due to its visual nature and aesthetic purpose.
- Performance increase: For a typical small business/service site, it takes about a day or two to create and code a fully responsive, "function-less" UI. That is the amount of time required to go from nothing but a written concept to a professional-looking UI accessible on a testing/dev server and ready for client review.
- Typical tools: Specialized AI code generation/pair programming tools such as Cursor, Aider or similar, or any modern AI assistant with strong code generation capabilities (at the time of writing this article the good examples of such assistants are Anthropic Claude 3.7, Google's Gemini 2.5 pro, DeepSeek V3/R1, etc.)
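To make this concrete, here is the kind of "derivative" component an AI assistant might produce when instructed to use tailwindCSS - a minimal sketch where the markup and class choices are illustrative, not prescriptive:

```tsx
// A run-of-the-mill responsive login form - the kind of "derivative" UI an AI
// assistant can generate right off the bat and a UX/UI expert can then tweak.
export function LoginForm() {
  return (
    <form className="mx-auto mt-16 w-full max-w-sm rounded-xl bg-white p-8 shadow-md">
      <h1 className="mb-6 text-2xl font-semibold text-gray-900">Sign in</h1>
      <label className="mb-2 block text-sm font-medium text-gray-700">
        Email
        <input
          type="email"
          required
          className="mt-1 w-full rounded-md border border-gray-300 px-3 py-2"
        />
      </label>
      <label className="mb-4 block text-sm font-medium text-gray-700">
        Password
        <input
          type="password"
          required
          className="mt-1 w-full rounded-md border border-gray-300 px-3 py-2"
        />
      </label>
      <button
        type="submit"
        className="w-full rounded-md bg-blue-600 py-2 font-medium text-white hover:bg-blue-700"
      >
        Sign in
      </button>
    </form>
  );
}
```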
Phase 2: Implementation
The resource savings here are immense, due to the fact that a group of developers is in effect replaced by an AI agent and one (typically full-stack) expert (or more if the domain dictates it - but we're talking 2–3 here, mostly part-time, and not 5–10 FTE).
Here are some of the most important artifacts produced within this phase:
Artifact: Product increment
- Augmentation technique/approach: When building the product increment, all code (both frontend and backend) can be written by an AI assistant or agent based on well-defined and self-contained functional and technical requirements with clear and unambiguous acceptance criteria. The AI assistant can easily be made to follow internal technical and coding standards, use particular libraries and frameworks, use only particular versions of said libraries and frameworks, etc. The result is then reviewed and merged by the project tech lead (or a comparable domain expert assigned to run the AI agent) in exactly the same way as it would be for any code submitted by a human developer. The quality of requirements makes a significant difference here - so make sure that your requirements-writing and scope-partitioning game is up to the task if you want to get the most out of AI-augmented creation of this artifact.
- Performance increase: In my experience, a low-to-mid complexity project (i.e. a backend with admin UI, some business logic and API integrations, and a good-looking, responsive client web UI) requires about 5–10 man-days to fully build from scratch - including a full UI (see the chapter above) and a full test harness. For comparison, in most organizations I've worked with, this would typically be a 1–3 month project for a team of 5+ people (architects, backend devs, frontend devs, UX/UI designers, business analysts, QA specialists, etc.), so we're talking about a 20–50x improvement here.
- Required expert(s): Tech lead, assisted by additional specialists (e.g. security specialist, performance specialist, etc.) where and when needed
- Typical tools: Specialized AI code generation/pair programming tools such as Cursor, Aider or similar, or any modern AI assistant with strong code generation capabilities (at the time of writing this article the good examples of such assistants are Anthropic Claude 3.7, Google's Gemini 2.5 pro, DeepSeek V3/R1, etc.)
Artifact: Unit/Integration test harness
- Augmentation technique/approach: When writing tests, all test code can first be written by an AI agent/LLM and then reviewed and integrated by the project tech lead. The tests are typically either written based on the requirements and the existing functional code created previously, or generated together with the matching functional code (see the product increment above) based on just the requirements (a sketch of such a generated test follows this list).
- Performance increase: The time reduction here is similar to the previously mentioned case. However, the main upside is that IRL tests are often omitted due to the "we don't have time now, we have to ship, we'll write tests later" fallacy - and that excuse stops being viable with AI augmentation. Also, having a good and comprehensive test harness goes a long way in helping the expert running the AI agent review the delivered code quickly and with confidence - which is crucial for the next engineering phase.
- Required expert(s): Automated testing specialist, QA specialist
- Typical tools: Specialized AI code generation/pair programming tools such as Cursor, Aider or similar, or any modern AI assistant with strong code generation capabilities (at the time of writing this article the good examples of such assistants are Anthropic Claude 3.7, Google's Gemini 2.5 pro, DeepSeek V3/R1, etc.)
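For illustration, a generated unit test might look like the following - a minimal sketch using Vitest, where the `calculateOrderTotal` function and its business rule are invented for the example:

```typescript
import { describe, expect, it } from "vitest";
import { calculateOrderTotal } from "./orders"; // hypothetical module under test

describe("calculateOrderTotal", () => {
  it("applies a 10% discount to orders over 100", () => {
    // Hypothetical acceptance criterion: orders above 100 get a 10% discount.
    expect(calculateOrderTotal([{ price: 60, qty: 2 }])).toBeCloseTo(108);
  });

  it("charges full price for orders of 100 or less", () => {
    expect(calculateOrderTotal([{ price: 50, qty: 2 }])).toBe(100);
  });
});
```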
Phase 3: Testing and delivery
We've mentioned some of the artifacts of the testing and delivery phase when we talked about the test harness before, but we'll cover a few more here:
Artifact: End-to-end / UI test suite
- Augmentation technique/approach: The act of testing can be significantly AI-augmented - especially the process of acceptance testing (confirming that the product increment indeed satisfies all acceptance criteria that define its solution space), which can (and should) be fully automated anyway. AI augmentation here typically boils down to creating test scripts, test scenarios and end-to-end tests that are then reviewed, completed and run by the QA/testing expert (see the sketch after this list).
- Performance increase: The time savings here are comparable to those of the implementation phase.
- Required expert(s): QA/testing expert
- Typical tools: Specialized AI code generation/pair programming tools such as Cursor, Aider or similar, or any modern AI assistant with strong code generation capabilities (at the time of writing this article the good examples of such assistants are Anthropic Claude 3.7, Google's Gemini 2.5 pro, DeepSeek V3/R1, etc.)
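As an example, a generated end-to-end test might look like this - a minimal sketch using Playwright, where the URL, labels and credentials are placeholder assumptions:

```typescript
import { expect, test } from "@playwright/test";

test("user can log in and reach the dashboard", async ({ page }) => {
  await page.goto("https://staging.example.com/login"); // placeholder URL
  await page.getByLabel("Email").fill("user@example.com");
  await page.getByLabel("Password").fill("test-password");
  await page.getByRole("button", { name: "Sign in" }).click();
  // Acceptance criterion: a successful login redirects to the dashboard.
  await expect(page).toHaveURL(/.*dashboard/);
});
```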
Artifact: Development and testing/staging environments
- Augmentation technique/approach: Setting up a staging server and deploying the product increment for a demo can easily be AI-augmented by having an AI create the development Docker setup, setup scripts for the staging/test server, deployment scripts/configurations, etc., which are then reviewed, completed and integrated by the DevOps expert (a minimal example follows this list).
- Performance increase: The time saving here is significant, especially for standardized environments (you really should have internal standards in your organization - as you can see, they come in very handy when it comes to AI augmentation). Furthermore, a lot of time is saved due to the fact that most of the "gruntwork" performed by the usually very scarce and overbooked (hence slow) DevOps resources is taken off their hands by AI agents, so those resources can focus on complex DevOps tasks rather than the mundane ones that often take up most of their workday.
- Required expert(s): DevOps expert
- Typical tools: Specialized AI code generation/pair programming tools such as Cursor, Aider or similar, or any modern AI assistant with strong code generation capabilities (at the time of writing this article the good examples of such assistants are Anthropic Claude 3.7, Google's Gemini 2.5 pro, DeepSeek V3/R1, etc.)
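As an illustration, the AI-generated development Docker setup might start from something like this minimal docker-compose sketch - the service names, images and ports are assumptions for the example:

```yaml
# docker-compose.yml - minimal dev environment: backend + database (illustrative)
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: dev-only-password # placeholder - never commit real secrets
    ports:
      - "5432:5432"
  backend:
    build: .
    environment:
      DATABASE_URL: postgres://postgres:dev-only-password@db:5432/postgres
    ports:
      - "8080:8080"
    depends_on:
      - db
```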
Summary
Note that in all of the examples above, two things are constant:
- AI augmentation provides significant (sometimes "easy button"-level) delivery improvements for the vast majority of the software engineering process
- Every single artifact created by an AI agent/assistant invariably has to be reviewed, completed and integrated by a human domain expert - even if that means just "helicoptering" over the AI agent while it does its thing.
What it boils down to is this:
AI augmentation works well and produces quality alongside speed ONLY if it is used to build artifacts that the organization's domain experts would be able to build themselves - given enough time.
Or in other words
Any deliverable MUST BE OWNED and "quality approved" by a human domain expert - even if it was built 100% with AI (otherwise you'll lose control of the quality and end up with AI slop)
But what about hallucinations?
Now that we've talked about how to AI-augment your development process, we need to address the elephant in the room: those dreaded "hallucinations".
Let's start by acknowledging that hallucinations are indeed real, that they DO happen and, furthermore, that they are here to stay, due to the very nature of LLMs themselves - they are at their core stochastic systems, meaning that for the same inputs they will by design provide somewhat different outputs.
That said, however, hallucinations are:
- Much less of a problem than most people think, due to improvements in the models themselves - new models don't hallucinate nearly as much as older ones (GPT-3.5, etc.); they still do, but much less - and (more importantly),
- Comparatively easily spotted and compensated for by proper prompting techniques performed by competent domain experts.
The key here is the second point - the human-expert-in-the-loop approach.
With a strong project framework, internal standards and competent domain experts, hallucinations simply stop being a realistic problem, becoming nothing more than a nuisance.
Now that we have a decent idea of where and how AI can be used to augment and improve the software development/engineering process, let's discuss that "strong project framework" requirement from the snippet above. In particular, let's discuss one practical framework that I've found to work really well in an AI-augmented development scenario.
Applicable engineering framework - "micro scrum"
I call this concept "micro scrum" and it is based on the "engineering based scrum" which I've written about here and here. In essence, micro scrum is, as its name suggests, a resource-light version of the scrum framework where the only required roles throughout the process are the technical product owner and the tech lead - with associated "part-time" roles of UX/UI designer and individual domain specialists (such as security experts, QA and testing specialists, etc.) who join the core team intermittently when a particular project increment needs their expertise.
The process itself is mostly unchanged from what is described here, the main difference being that the typical team is much smaller - in extreme cases it can even be a "scrum of one", but in the typical scenario it is composed of at least two permanent members - a technical product owner and a tech lead - with a couple of non-permanent (but very important) roles, led by the UX/UI designer. The added "freebie" of this setup is that, due to its size (and the implicit requirement for all permanent members to be mature experts), achieving self-organization is much easier and more feasible than for your classic scrum team composed of various individuals with varying levels of expertise and social skills.
The events (especially their expected outcomes/goals) and the framework remain unchanged, except for the following:
- In the case of the minimal team, there is no need for a dedicated scrum master anymore - the responsibilities of that role can be taken over by the technical product owner. That does not mean that you cannot still have a scrum master if you like/need one, just that in the case of the minimal team the role is no longer mandatory.
- Effort estimation can be done WITH the help of an AI assistant/agent acting as a separate expert, but ANY proposed estimate coming from it must be explicitly approved by both the technical product owner and the tech lead.
That is more or less it - no further changes are really needed, as the goal and purpose of each individual event, and of the iterative process itself, remain unchanged from the base process mentioned before. We're still producing software using the best known approach for doing so - it's just that most of the "heavy lifting" or "grunt/grind" team resources are replaced by AI agents/assistants.
Conclusion
The effects of AI augmentation of the classic/legacy software engineering process are (in my opinion at least) both groundbreaking and industry-changing, as the theoretical reduction in the amount of needed "general coder" resources is immense.
As a consequence, a good AI-augmented team can typically deliver 2–5 times the value (or even more, depending on the domain) in the same time frame as a classic one, and do so with a fraction of the resources. This affects the price of creating software solutions by drastically lowering internal development costs and, as a consequence, "democratizes" software solutions by making them available to individuals and organizations who might have had good product ideas but lacked the funds for (let's be frank about it) expensive design and development.
Those are all positives in my view.
When it comes to challenges, here are a few that I didn't have time to address in more detail due to the size constraints of this (already too long) article:
- The AI assistants are biased by the data they were trained on, which can lead to them pushing obsolete practices or library versions in their responses unless guided explicitly - that is where things like Cursor rules come in handy (see the sketch after this list).
- With generative AI, it is easy to get the IT security aspect of your project to the point where it "looks fine" while it is in fact incomplete or insufficient. IT security is one area where you truly need to have strong IT security experts/practices included in the "AI loop", as the potential exploit impact is significantly larger than for other aspects of your system.
- One of the most alluring features of LLMs and AI agents in the context of software development is that they are deceptively easy to try naively and get results that look decent; at the same time, it requires significant expertise to ensure that the quality of those results is not compromised.
- It requires significant expertise to integrate AI augmentation in a scalable and reliable way into any existing processes your organization might have in place, as those processes typically need to be optimized and performing well BEFORE you try to introduce AI into them.
- Like any other skill, your development skills will atrophy if not used - so it requires self-discipline to stay in "coding shape", which is something you need to be in if your role is to review, complete and integrate generative AI's output into non-trivial production systems.
- The experts I'm mentioning in this article do not come "a dime a dozen" and are instead typically hard to source, as they need to possess much more than mere technical ability (anyone who has ever worked in IT resource hiring will know what I'm talking about here).
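As an example of the explicit guidance mentioned above, a project-level Cursor rules file might pin practices and versions along these lines (the specific rules are illustrative assumptions, not a recommended stack):

```text
# .cursorrules (illustrative)
- Use TypeScript with strict mode enabled; do not use the `any` type.
- Target React 18 and Node 20; do not suggest class components or deprecated APIs.
- Use the project's existing data-access layer; never write raw SQL in route handlers.
- Every new endpoint must ship with matching unit tests.
```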
So to sum up this very long article into some take-away points:
- AI augmentation is not a silver bullet or a magic "build me a software solution" button that can be pressed just as well by a seasoned IT expert as by their uncle or a random "man on the street".
- AI augmentation DOES NOT eliminate the need for experts, but it does significantly reduce the need for "generic developers" and the number of individual experts you typically need in any given field. There are parts of the holistic engineering process that require humans (e.g. discovery, reviews, demos, etc.) and are therefore not suitable for AI augmentation.
- AI augmentation DOES NOT replace the engineering framework with a magic "design&build" button but rather vastly improves the delivery of artifacts WITHIN that engineering framework
- When properly integrated, AI augmentation DOES offer game-changing productivity boosts - on the order of ten times or more for most domains, web development especially.
- If you struggle with delivery in your "legacy" engineering process, just "slapping some AI" on it will make it fail more. The PREREQUISITE for reliable and effective AI augmentation of your process of choice is that that particular process is implemented correctly in its "non-augmented" form. Or, put another way: your legacy process should already deliver - and then, and only then, will AI augmentation make it deliver much more.
- Any deliverable MUST BE OWNED and "quality approved" by a human domain expert - even if it was built 100% with AI (otherwise you'll lose control of the quality and end up with AI slop).
That is about it. In this article we've only scratched the surface, and since the generative AI landscape develops at such breakneck speed, it is debatable what and where the surface actually even is.
With all that said, my experience with AI-augmented software development is extremely positive. I see it as a phase transition from handcrafted to "machine-manufactured" software engineering, and if you want to try it in your organization, I strongly encourage you to do so. If you need help with it, let me know in the comments and I will most likely be able to help you out.