Kazuya

AWS re:Invent 2025 - Driving Profitable Growth with Generative AI: From Prompt to Product (ISV316)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of the original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Driving Profitable Growth with Generative AI: From Prompt to Product (ISV316)

In this video, AWS Solutions Architects present six design strategies for building profitable generative AI products, addressing why many AI investments fail to reach production. Using a fictional healthcare software company as a case study, they demonstrate the importance of starting with customer value rather than technology. The strategies include: adopting a value-driven approach with early pricing hypotheses, focusing on tasks over conversations, narrowly scoping tasks for better accuracy, matching task modality (synchronous vs. asynchronous) to requirements, favoring augmentation over full automation with human-in-the-loop workflows, and designing for both human UX and LLM experience (LLMX). Real-world examples from companies like Dovetail, Canva, Kindle Direct Publishing, Tapi, and Vital illustrate successful implementations. The presentation emphasizes profitable growth through strategic model selection, token optimization, outcome-based pricing, and comprehensive monitoring at the task level.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

The Gen AI Investment Gap: Why Most Projects Fail to Reach Production

Good morning everyone. Thanks for making time on a Thursday, especially during lunchtime after a long week at re:Invent. We're really glad you're here. Just a quick show of hands, also proof of life. Has anyone heard about Gen AI this week here at re:Invent? Yeah, so you're definitely not alone. There's been over $200 billion invested in Gen AI startups. But the sad truth is that a lot of those investments aren't actually making their way into production.

So I'm Erik Anderson. I'm a Principal Solutions Architect here at AWS, and I help some of our most strategic customers with their AI strategy, and I'm joined here by my colleagues. Hey everyone, I'm Nicola. I'm a Startup Solutions Architect. I work with healthcare and life science startups out of Australia and New Zealand. And I'm Jeffrey Hammond. I am an ISV Product Strategist. I work with our software company customers to help identify how AWS can help you be more profitable and drive profitable growth.

Thumbnail 80

So over the next 60 minutes, the three of us will go through six proven design strategies that will help make sure that you're not on the wrong side of that statistic. But first, to help us along, we've got a fictional software company in the healthcare space that we'll be referring to throughout the presentation. So picture this: a successful B2B software company whose product was designed to handle everything for healthcare providers, from scheduling to doctor's notes, all those things. Their CEO woke up one morning and said, hey, our competitors have AI in their products, we should do that too. So he went to his technical team and said, I'll give you three months to build something with AI.

So what did our well-meaning but ill-advised team do? They built a chatbot, of course, because who doesn't need something to go out and answer questions. The only problem was the chatbot started getting creative with appointment scheduling, and so it created appointments that didn't even exist. And I don't know about you, I'd be a little upset if I appeared at a doctor's appointment on a Saturday and the office was closed, so it wasn't really the ideal situation. But it actually gets better or worse depending on your perspective.

So they kind of forgot to monitor any of the costs, and it was actually costing them $50 to provide this capability while they weren't charging their customers anything for it. Fortunately, this is just a hypothetical situation and no one actually showed up to the doctor's office on a Saturday, but I think there are lessons this company learned that you can take back to your own companies after re:Invent.

Whether you're a data scientist, a developer, or someone focused on product strategy or the technical details, we really want you all to come out of here with something you can take back, because it really does take all of those roles to be successful. You need the right use cases, the right technical details, and the right pricing strategy, which we'll get into in a lot more detail. And with that, I'll hand it over to Jeffrey to go into a bit more detail.

Thumbnail 200

Survey Insights Reveal Misaligned Priorities and High Abandonment Rates

Yeah, so as you mentioned, Erik, the spend is pretty significant in this space. $192 billion has been poured into AI startups by investors so far in 2025, and we anticipate that by 2029, the software spend is going to grow to over $114 billion. We did a survey earlier this year with Forrester Research, and we went out and asked over 640 software companies about their spend plans with respect to generative AI, and what we found was that over 60% have significantly increased their R&D budget. So the money is there, it's being spent. I think we all know that.

Thumbnail 220

But here's where it starts to worry us a little bit. In that same survey, one of the things that Forrester asked was: what are your priorities? Why are you spending this money? What do you hope to accomplish? Top answers were things like wanting to increase brand awareness, the CEO wanting to be thought of as an AI-native company, and growing revenue, which is good on its face, with reducing costs at the bottom of the list. And down around answers 10 or 11 were things like improving customer support or enhancing the user experience.

That's a little bit of an alarm bell for me because if you think of what we should be doing with these technologies, if we're not improving the way that our customers use our product, the way that we're reducing their toil, then in some ways I think we're putting the cart before the horse. And so when we think about how to be successful when you're going from prompt to product, putting that customer experience front and center, working backward from the customer is actually the thing that should be the highest priority as you start to think about your opportunities.

Thumbnail 290

I think we see the results in some of the latest surveys and information that have come out. According to 451 Research, they did a survey of about 1,000 IT and line of business professionals, and one of the things that they found was 42% of those companies had abandoned the majority of their AI initiatives before they reached production. At the same time, 46% of

Thumbnail 310

those projects have been scrapped between the proof of concept phase and scaling broad adoption. Now, if I were to take that stat and be the glass half full person, that's actually a good thing because you don't want to go all the way to scaling and roll out in full production to discover that you can't drive profitable growth. So it's good that we're weeding out things before we go the whole way through the effort.

To me, the glass half empty here though is that we're putting things into POC which should never get out of POC, and we should be finding more effective ways to separate the high value business cases and use cases that we're going to work on from the marginal value use cases so that we can put all of our focus and all of our spend behind the things that have the highest degree of success and the highest ability to drive profitable growth. So how do we do that?

Thumbnail 380

Thumbnail 400

Core Strengths of Generative AI and Six Product Design Strategies

How does generative AI create business value? Nicola, I'm going to turn that over to you. Thanks Jeffrey. So what are some of those really core strengths of generative AI that you can use to generate business value? Generating content, of course, it's in the name, is probably the most obvious. Think of images for retail websites or hyper-personalized marketing. I know myself, I've used it to craft some of those trickier customer responses, so you're probably already using it this way yourselves, either in a personal or professional setting.

Thumbnail 430

Thumbnail 450

Code generation and execution is another really core strength. There are many applications for this, from our first example of creating and booking appointments to generating SQL for business intelligence use cases. Code generation and execution with secure guardrails is a really, really important strength. Lighting up dark data within your organization is similarly something we're seeing lots of customers doing. Whether you've got information in emails, in sales calls, or across legacy systems within your organization, it's about being able to extract information from data sources that were previously inaccessible or invisible.

Thumbnail 480

Workflow augmentation to actually reduce some of that heavy lift is another critical area where we're seeing generative AI really shine. You can create, you know, a collaborative loop where AI executes on a task and humans provide oversight, validation, and actually input suggestions back into your task in order to improve that over time. And finally, generative AI can be used to reduce the cost to serve through strategic automation where you have really well defined and understood workflows.

So based on our work with customers and these core strengths that we're seeing, we've devised six product design strategies that we're going to detail for you today. So with that I'll pass back to Erik. Thanks Nicola. So as Nicola mentioned, we're going to transition into these six strategies in much more detail, and I'd like to point out that profitable growth was so important to what we're talking about today that it actually made the session title.

Thumbnail 520

Design Strategy Zero: Value-Driven Approach Over Technology-Driven Development

This isn't just about building cool AI features, it's about building things that will actually help you make money as a company. And before we dive into any technology decisions, before we talk about tokens or vector database optimizations, we need to think about this fundamental question: will this AI capability actually drive profitable growth for your organization, and how?

Because here's the thing, and we saw this with our fictional healthcare company, they learned it the hard way. They built something, but it wasn't the right thing. You might have an impressive AI feature, or a not-so-impressive one if it has people showing up at a closed office on a weekend. Either way, if users don't care about what you're building, you're probably not building a sustainable business model.

I know that sounds obvious, but many teams just kind of forget about that and get caught up in all the hype of building something cool, but ultimately you need to make sure that you're building something that allows you to stay in business over the long haul. And a lot of times this is what we call the technology driven approach that we'll talk about more on the next slide. But the solution really starts right here. How can you build something that starts from customer value, and then build that back into every AI product decision?

Thumbnail 590

So what does a technology driven approach look like? It typically starts with this. People follow a relatively typical software design process, typically with a POC. Maybe they come up with the use case first, maybe they bring it up after the fact, then they go through the subsequent design phases to design, build, test, and deploy the application. And at the very end you see over there on the right side, that's when they start thinking about price.

And what ends up happening is you start hearing these from your customers: the ROI isn't there, or it was a great demo but I'm just not willing to pay for it, or maybe I like it but I just can't justify the cost. So what do teams do? Well, maybe they release it as a free feature, hoping that somehow giving it away will ultimately create a sustainable business model. And that's exactly what happened to our healthcare company. They built something, they released it, and ultimately their CEO was happy because they had AI in their product, but it was the wrong thing for their customers. This approach, where we build first and figure out pricing and value at the very end of the process, is exactly backwards.

Thumbnail 650

So let's talk about a little bit of a better way. While many of the core components of a value-driven approach are the same, this is fundamentally different. Instead of building and then figuring out the pricing later, we're going to start here by identifying high value use cases. Even as a techie myself, I know that it's really important to focus not just on the technology, but really what is going to give my customer value. This is going to deliver a meaningful outcome for my customer and make sure that there's something they're actually going to pay for.

This working backwards from customer requirements is something that's near and dear to our hearts here at Amazon. And what you'll notice is that early on in the process here, we're going to develop a pricing hypothesis. And you can ask yourself at this point, are customers willing to pay for this feature? And that's not to say that you need to charge for every single AI feature, and we'll talk more about that in the next slide. But you do really want to start with this early in the process. We're shifting left this value proposition. And then you can go ahead and still design, build, and test and deploy your application, but you're going to be making sure at this point that you're doing this on the right thing.

I'd also like to call out the feedback loop at the end. It's important that this is not only a technology feedback loop, but also one tied to the value we're delivering for our customers. We're making sure that as we go, we're getting information from these customers to see that they're getting the value they expect, and then we can adjust along the way. If our healthcare company had followed this approach, maybe they would have picked something else to develop, something like automated insurance claim processing or clinical note summarization. But having this value-driven mindset is really critical throughout the entire process, and it's going to affect more than just the technology decision. It's also going to affect your product and go-to-market strategies, which is what I'll talk about on the next slide.

Thumbnail 760

Four Go-to-Market Strategies: Embed, Extend, Expand, and Build

So when it comes to your go-to-market strategy, you have four main paths forward that we've identified here. Embed is the first one. If you think about it, this is a defensive play. You're adding AI features to your existing product for your current customers, and typically you're doing so without additional charges. This is often about protecting market share and staying competitive in the marketplace, and realistically this is what we see a lot of our customers do today.

The one competitive differentiator you have here, though, is your data. If you're an ISV that's been operating for several years, you have all this customer data that you can now mine to make your product even more competitive. Think about a CRM provider that previously had been serving financial analysts, but can now reach business analysts by letting them interact through natural language. Or maybe you're an HCM provider, and rather than having people create job descriptions from scratch, you can now give them some Gen AI capabilities, not charge for it, but help with automated creation of those job descriptions. The challenge here, though, is that you're giving these features away, and that's where we get into our next strategy, extend.

Thumbnail 820

So extend is where the new money starts. Effectively, you're adding some additional premium features, but now you're doing so with an ability to charge. So if you think about maybe a collaboration suite, rather than just sending messages, maybe now you have a new AI capability where you can go and summarize details or get updates when you come back from vacation and be able to catch up a little bit more quickly. But in this case, we're now adding a new revenue stream and that's why we're calling it extend. But all these focus on essentially new features within our existing product.

Thumbnail 860

When we start on the new product side, we're going to refer to that as expand. So this is a new market segment, a new product for existing customers. So if you think about my example of creating new job opportunities for a hiring manager, that was really part of the core product. But what if instead I could create a new set of capabilities or a new product to go out and actually screen candidates? So I use AI to go out and do some of the initial processing. I've got a new value proposition. I'm able to now have a new charging model for that without affecting the base model. And because of that, again, I'm able to unlock additional revenue for myself as a company and then give my customers additional value.

Thumbnail 900

Then the last one we have is build. So this is new products for new customers, and this could be entering a totally new space.

A lot of times this has the biggest risk because you as an ISV might be creating something brand new, but it also can have the best reward because now you're able to take some of the learnings that you had in the past, create a brand new offering, and then take that to market with your customers.

Thumbnail 940

Pricing Models for AI Features: From Seat-Based to Outcome-Based

But the interesting thing that stands out to me as we look through all of these strategies is that regardless of which one you choose, there's going to be that interaction between go-to-market pricing and your product strategy. And because of that, we're also seeing several new pricing mechanisms emerge. There are a lot more beyond the three shown here, and if you haven't seen it, I'll give a shameless plug for our white paper, which goes into these in a lot more detail. But these are three of the most common ones that we're seeing in the industry.

Seat-based is likely what you're most familiar with. It's what a lot of ISV customers use today: a fixed cost per user. There are some real positives to that, because we're charging our customers a consistent price and they can plan for it. The problem is that it doesn't reflect the varying value we're delivering. A power user might be getting a whole lot of value out of your AI features, while someone who's against AI or just doesn't know how to use it might be getting little or none, and we're charging them the same amount. Not only are we failing to capture the full value, it also costs us a whole lot more to serve that power user than a very light user, so the economics differ too.

And that's where something like usage or token-based pricing comes in. This helps you align your users' consumption with your actual costs. From a unit economics perspective, you're now much more in line with how you're actually accruing costs and delivering value to users. The downside, obviously, is that there's a lot more variability, and if your customers aren't used to buying this way, that can mean a bit of a transition as you make the change.

The last one I'd like to touch on is outcome or value-based pricing, and a lot of times we'll talk about these differently, but for the sake of conversation today, I'll include them together. But here we're starting to see a lot more adoption in the space, especially from a Generative AI capabilities perspective. Here you get paid based on the business value that you're actually able to provide to your customers. So imagine that I'm a customer service software company and I'm able to solve tasks for you. So a customer puts in a support case. If I'm able to solve it, you pay for it. It's typically a fixed price for that action, so maybe one dollar per case that I'm solving. If I solve it, you pay. If I don't solve it, you're not going to be charged. And so there's a really nice alignment to customer value and the actual pricing model.

Value-based is a little different in that the change there is that it's typically more variable. So rather than just saying one dollar per action, I'm going to take a portion of savings or portion of revenue. The example I like to think about is I live in Texas. We have property tax valuation every year, and based on how much my property taxes go up and down, that changes the amount I pay. And so there's a company that works with me to try and make sure that that value isn't too high. If they save me money, they take a portion of it, but our interests are really well aligned, and I think that's a really nice example of value-based pricing.
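To make that outcome-based flavor concrete, here is a minimal sketch of how such metering might look in code. The names and the one-dollar-per-case price are illustrative assumptions from the example above, not a real billing system:

```python
from dataclasses import dataclass

# Illustrative outcome-based billing: the customer is charged a fixed fee
# only when the AI actually resolves a task (e.g., $1 per solved support
# case). All names and prices here are hypothetical.

PRICE_PER_RESOLVED_CASE = 1.00  # assumed flat fee per successful outcome

@dataclass
class SupportCase:
    case_id: str
    resolved_by_ai: bool  # set by your task-completion check

def monthly_invoice(cases: list[SupportCase]) -> float:
    """Charge only for cases the AI successfully resolved."""
    return sum(PRICE_PER_RESOLVED_CASE for c in cases if c.resolved_by_ai)

cases = [
    SupportCase("case-1", resolved_by_ai=True),
    SupportCase("case-2", resolved_by_ai=False),  # escalated to a human: no charge
    SupportCase("case-3", resolved_by_ai=True),
]
print(monthly_invoice(cases))  # -> 2.0
```

The key property is that unresolved cases generate no charge, which keeps the vendor's incentives aligned with the customer's outcomes.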

Thumbnail 1120

Thumbnail 1150

But we talked about working backwards, so I don't just want to talk about what we see from customers, but what customers themselves are actually saying. There was a new study from IDC that asked customers what they actually prefer when buying AI capabilities. The way the study worked, as you can see, is that respondents gave their top three preferences, and it resulted in a whole lot of interesting data. We're not going to get into all the nuances, but what I do want to show are the four most prevalent ones. The top two pricing models were bundled premium tier subscriptions and consumption-based pricing. The next two favorites were hybrid subscription and usage models, and flex credits.

But you see across that whole stack, there's not really one that stands out. And so despite there being lots of different options, depending on how you're providing those capabilities for customers, depending on the value you're providing, there's not going to be a one-size-fits-all approach. And so you're probably going to need to have some different monetization techniques as you go along and customers kind of make this transition with you.

Thumbnail 1200

Cost Optimization Strategies: Model Selection, Token Usage, and Vector Databases

But regardless of pricing, you also need to keep in mind some of the cost optimization capabilities, because obviously if it costs you too much to provide this to a customer, that's also not going to be sustainable and profitable over the long term. So what can we do from a cost optimization perspective? Again, this is a small subset of the many things that you can do, but a few things that I did want to call out, the biggest option you have likely with Generative AI is your model selection and customization.

So it maybe goes without saying, but don't just default to the latest and greatest model because it is the latest and greatest. It may also be the most expensive. Instead, you should pick the optimal model for your specific use case. Do validation with high quality data, and then customize it to your business-specific requirements. Oftentimes we find that a fine-tuned smaller model or a distilled model can perform just as well at a fraction of the cost, and that's probably going to be your biggest cost lever.
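As a rough illustration of that lever, here is a hypothetical routing sketch that sends narrow, well-validated tasks to a smaller model and reserves the large model for open-ended work. The model IDs and prices are placeholders, not real quotes:

```python
# Hypothetical model routing: simple, well-understood tasks go to a small,
# cheap model; complex or open-ended requests go to the large one.

MODELS = {
    "small": {"id": "example.small-model-v1", "usd_per_1k_tokens": 0.0002},
    "large": {"id": "example.large-model-v1", "usd_per_1k_tokens": 0.0080},
}

def pick_model(task_type: str, estimated_tokens: int) -> str:
    # Narrow, well-validated tasks (e.g., taxonomy extraction) can use the
    # small model; everything else defaults to the large one.
    if task_type in {"classification", "extraction"} and estimated_tokens < 2000:
        return MODELS["small"]["id"]
    return MODELS["large"]["id"]

print(pick_model("extraction", 800))  # -> example.small-model-v1
print(pick_model("generation", 800))  # -> example.large-model-v1
```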

Also keep in mind token usage optimization; as you scale, this can create big savings. Analyze your token patterns and implement token caching, which can help you not only improve performance but also keep your costs in check. I'll talk a little more about picking the appropriate pricing plan when we get to the third design strategy, but I do want to mention vector database optimization as the last piece here. As your data grows, this becomes critical to make sure you're not overspending, so implement efficient chunking strategies as you go, especially as your vector database usage grows. And remember, every dollar you save is either something you keep in your pocket or pass back to your customers, so don't overlook this.
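On the chunking point, here is a minimal sketch of fixed-size chunking with overlap, one common starting strategy; the sizes are arbitrary and should be tuned against your own retrieval quality and embedding spend:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with a small overlap so context
    spanning a boundary is not lost. Tune both numbers empirically."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("some long document " * 100)
print(len(chunks), len(chunks[0]))  # number of chunks, size of the first chunk
```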

Thumbnail 1300

Design Strategy One: Focus on Tasks Over Conversations

But with that, I'll hand it over to Jeffrey. Thanks, Erik. So design strategy zero: critically important, a business-focused approach, and one that understands how you're going to drive profitable growth right from the start. As we get into design, then, the next strategy that we find successful companies use is focusing on tasks over conversations. In a sense, not just going and building that chatbot because it seems like the right thing to do. While conversational interfaces have their place in certain scenarios, task-oriented designs generally create more valuable, trustworthy, and successful generative AI features.

That's because they help customers, users of your product, focus on specific goals, and they also have clear inputs and outputs which increases your chance of designing a successful use case. They also define clear boundaries for your interactions with AI, and they, as part of that, create natural guardrails around the experience that you're looking to create. These guardrails help your product teams address inconsistent or potentially harmful outputs in the process. A task-oriented approach can often turn out to be the most cost-effective because of the constraints that are involved with the task-based design.

Thumbnail 1410

It's also easier to measure success and iterate on your features when you take that task-based approach, and it makes it easier to implement outcome-based billing if you want to implement some of those newer pricing models. If you have a clear definition of what the success of the task is, then it's very natural to be able to charge for it. For example, processing a payment. The payment either goes through or the payment doesn't go through, and you can charge on the agreed success of that. Also, a task-focused approach makes it easier to extend existing customer experiences because they are often task-based interfaces. So if you're using that first strategy where you have an existing product and you're looking to build new features for current customers, extending the user interface with those tasks can often be very effective.

So if you are adopting a task-based approach to the use case you want to implement, there are some things you want to focus on. Facilitate the user's agency and control: you don't want to give them a blank screen ("What should I write? How should I talk to this thing?"). You want examples of things the user can do, and to highlight the insight into the dark data you're about to light up. Personalize or contextualize the task where you can, so your use case can begin to predict what the customer might want to do, the intents they want to fulfill. For example, if I'm booking a flight, can the system remember that I happen to be a frequent flyer on American and that I always fly out of Charlotte, and bring that context to the beginning of the task processing event?

Finally, how are you going to instrument the task to make sure it's observable, so you can understand successful completion rates? Just as an example of some of these things, I worked with an ISV customer in the healthcare space that makes laboratory information management systems. One of the things they did was take an existing task that a customer has, reading lab reports. You know how it goes: figures come up, you look at them, and you interpret the results. They didn't want to change that fundamental task, but they did want to provide guidance from an LLM to indicate the things the test reader should potentially pay additional attention to.

So by providing that additional capability by being subtle in the way that they presented the information, they didn't overwhelm the existing user agency, the existing process that the lab tech would go through, but they did allow them to sharpen their focus on the things that might matter the most to them. The result: they saved an average of 2 to 3 minutes per test read, over hundreds of test reads a day, a significant productivity improvement because they focused on the task to start.

Thumbnail 1540

Thumbnail 1550

So some more examples of this. A great example from a company called Dovetail. We've got a few Dovetail folks in the audience today. You guys put your hands up, yeah, excellent. Dovetail is a customer intelligence platform. It aggregates, analyzes, and shares qualitative data from diverse sources, including customer interviews and feedback, support tickets, and sales calls. This example focuses on a specific task that the Dovetail team helps their customers with, which is theme categorization across a broad variety of data.

Now, they do this via a feature called Channels, which automatically identifies and clusters the themes across massive amounts of data. So a good example of what Nicola said, lighting up dark data. It allows their customers to make more informed decision making and allows those companies to be more responsive to customer needs and emerging trends. And this is the important part: it saves hours of manual analysis. So you've got toil reduction there, and the net is that it reduces the cost to serve insights up from unstructured data. So by organizing around the tasks to be done by your users as opposed to what kind of chatbot are we going to build, you build a much more effective interface for your existing products.

Thumbnail 1630

Design Strategy Two: Narrowly Scoped Tasks Deliver Higher Success Rates

Okay, design strategy number two then, which builds on that idea of a task-based approach: focusing on narrowly scoped tasks over broad ones. If you've spent any time looking at the data that's come out recently, you've probably run across the MIT NANDA report. One of the things it picked up is that a common reason for so many failed AI investments was organizations trying to implement complex tasks that span multiple organizations or complex workflows; those usually proved harder to implement successfully. In the case of the healthcare management software company we talked about, if they had focused on narrowly scoped tasks, they might have had a higher rate of success.

So examples like verifying patient insurance eligibility before scheduling an appointment, that's a narrowly scoped task. Checking for drug interactions when prescribing new medications, that's a narrowly scoped task. Generating patient-friendly summaries of lab results like the LIMS example I just talked about, that's a narrowly scoped task. It's going to be much easier to successfully implement those specific use cases.

Thumbnail 1700

So the advantages of narrowly scoped tasks include clear expectations of what the AI should accomplish, which makes it easier to set accuracy goals. Narrowly scoped tasks help improve response quality and reliability. They also lower implementation risk, because they make it easier to do targeted evaluation of how the LLM is responding. They allow you to do precise testing and more effective prompt engineering around the task itself, which in turn makes it easier to take each individually scoped feature, evaluate it, and refine it based on user feedback. So it's not all or nothing when you go out to the customer and enter that iterative process of improvement with your use case.
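Because a narrowly scoped task has clear inputs and outputs, even a tiny evaluation harness becomes practical. This sketch assumes a hypothetical insurance-eligibility task; `run_task` is a stub standing in for your actual prompt and model call:

```python
# Minimal evaluation harness for a narrowly scoped task: accuracy is
# measured directly against a small labeled set.

def run_task(case_text: str) -> str:
    # Placeholder logic; replace with your LLM invocation.
    return "eligible" if "plan A" in case_text else "not_eligible"

LABELED_EXAMPLES = [
    ("patient on plan A, procedure 99213", "eligible"),
    ("plan expired 2024-01-01, procedure 99213", "not_eligible"),
]

def accuracy(examples) -> float:
    correct = sum(run_task(text) == expected for text, expected in examples)
    return correct / len(examples)

print(f"accuracy: {accuracy(LABELED_EXAMPLES):.0%}")
```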

Thumbnail 1750

Let's look at a couple of really good examples. I like to talk about Canva a lot because they've implemented AI capabilities broadly across their product, but if you're a creative user, you might not even realize it because of the way that they have integrated the task capability into the overall flow of the product. As an example, Magic Write generates specific text content like headlines or captions. The image generation tool that they have creates visual assets that support particular design needs. The chat assistant guides users through specific creative decisions. Each of those individual features tackles one task, and it does it really, really well.

Magic Write doesn't try to be a good general writing assistant. Its features are specifically tuned for design-related copy. So this approach delivers two key benefits for the Canva team. First of all, the individual creatives can easily understand what each feature does and what outputs to expect, makes it much more predictable.

Secondly, Canva can optimize each individual feature based on the use case, and the end result is a higher quality experience for their existing users, which incidentally creates a higher price performance opportunity in the market.

Thumbnail 1830

Let me give you a second example. It's actually one that we talked about earlier this week in one of the keynotes. If you've ever authored content that you want to share with the world, you know that professional book cover design can be a real barrier. Kindle Direct Publishing sees this problem with its customers. It allows authors to self-publish print and digital books that reach millions of users on Amazon. Now I wrote professionally for 16 years and I can tell you that the creative process was something that I practiced every day and I got pretty good at writing, but that didn't make me very good at graphic design, and I'm still not. Indie authors face this problem all the time.

So what did the KDP team do? They turned to the AWS ACE program, which is a program we use to work with customers around use cases, specifically generative AI use cases, and they came together to solve this problem in just four weeks. They went from initial concept to a working prototype, integrating Amazon Bedrock and Adobe Firefly along the way. The specific task in this case was cover design. By solving that distinct problem with narrowly scoped, manuscript-aware cover generation with commercial IP protection, the KDP team has transformed a long-standing pain point for folks like me into a competitive advantage in the publishing process. So that's design strategy two.

Thumbnail 1930

Design Strategy Three: Matching Task Modality to Requirements with Asynchronous Processing

Erik, I'm going to pass it back to you. Thanks, Jeffrey. So now let's talk about the third design strategy: matching task modality to requirements. This is something that's going to be important no matter which of the product design strategies you choose. We find all too often that we just default to making our generative AI experiences synchronous. Users click a button, they start to see the little spinner go, and then they wait for something to happen. It feels familiar because that's how we've often designed our products in the past, but it's not necessarily always the best fit when it comes to generative AI.

And here's the problem. It often requires higher performance characteristics than we would need otherwise, and it also then results in higher inference costs. So now you're forcing your users to wait, and it's actually costing you more money. Asynchronous design patterns are going to be a slightly different way to do this. It's a fundamentally different approach. So instead of making users wait for that, you can actually prioritize the next step in the workflow and eliminate those forced wait times in the process.

So think about it this way. When a user is waiting for some AI-generated content, whether it's a book cover or some other ingestion job in your operation, do they really need to wait for it, or could they move on and do something else while the AI completes in the background? This not only improves the user experience, it also addresses several technical challenges that I'll talk about in a slide or two: things like unpredictable processing times, resource management, and cost optimization.

Thumbnail 2010

So let's look into this in a bit more detail. When should you choose synchronous versus asynchronous processing? Synchronous processing, as you might imagine, covers things that require real-time interaction or validation. If you're guiding a patient through onboarding before they can see the doctor, that has to happen before they can see the doctor. It's great for simple queries where you have a predictable response time, say less than three to five seconds, or for an interactive conversation. If you were actually creating a chatbot, it would be a really bad experience if you needed to go off in the background, let's just say.

For asynchronous processing, these are going to have the opposite sorts of use cases. So if you think about complex document processing, if we were thinking about processing insurance claims for our healthcare example, this would not be a good fit for synchronous, and so async is going to be our friend here. And another case where we might want to use asynchronous is if we have multi-step workflows. So again, this is going to take a little bit longer, it's a little bit more complex. We don't need to sit there and have our user wait for this to come back.

Another example where asynchronous is going to be a good fit is where we need significant compute power. So if it's going to take a lot of resources to be able to complete this task, it's something that we probably want to do in the background and then let the user come back to it at a later point in time. The key question you need to ask yourself is, does the user absolutely have to wait for this to continue the next step in their workflow? If they do, you're probably going to need to go synchronous. If they don't, you definitely want to look at something like asynchronous request for this.

One of the other benefits of using asynchronous requests is that you can often get access to lower priced inference. So if you think about as of re:Invent 2025 today on a Thursday, a lot of our models have half the cost if you're using batch inference as opposed to on-demand. And very few other places allow you to save 50% just by making a different design choice.

We have other AWS services that can help with that. If you think about using SQS or EventBridge to allow you to know what to do when someone triggers something and then come back to it, those are really good options to help you build these asynchronous flows.
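As a sketch of that pattern, the snippet below enqueues a generation request on SQS and hands the user a task ID so they can move on; a background worker would pick the message up, call the model, and store the result. The queue URL and message fields are hypothetical:

```python
import json
import uuid

import boto3

# Asynchronous pattern: accept the request, queue it, return immediately.
sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/genai-tasks"  # placeholder

def submit_generation_task(document_s3_uri: str) -> str:
    """Queue the work and give the caller a task ID to poll or be notified on."""
    task_id = str(uuid.uuid4())
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"task_id": task_id, "input": document_s3_uri}),
    )
    return task_id  # the user moves on; a worker processes in the background
```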

Thumbnail 2130

Let's look at a real-world example from Tapi. Tapi is a New Zealand property management startup that's using asynchronous approaches for automated invoice processing. Here's how their solution works. A business sends an invoice to a specific email address that Tapi provides. Tapi then uses generative AI in the background to asynchronously process the invoice and extract all the information it contains.

This approach has three distinct advantages that help address some of the challenges we see with generative AI. First, it tackles latency and performance unpredictability. Model performance can vary depending on input length, imagine a short invoice versus a long one, and on system load, such as how many invoices are coming in at the same time, and some of these things are going to be outside your control. By processing these invoices asynchronously, Tapi takes that unpredictability and effectively hides it from their users, turning it into a background process. Their users aren't sitting there waiting, and they can continue with the next step in the workflow.

Second, it provides security advantages. With asynchronous processing, you now have time for things like content filtering, toxicity checks, and security checks, and even a human in the loop if you want one, before you send results back to your users. This review-before-release model is a lot harder to implement with synchronous processing.

Third, it enables resource management, and by that I mean how can we best use the resources available to us from our generative AI capabilities. We can do prioritization, we can implement queuing systems. If we have a burst of load, we can maybe spread those out over time if they all don't need to happen at the same time. Those are all things that would be really difficult to do if we were using synchronous processing.

But probably most importantly, this allows Tapi to give their users a better experience and be respectful of their time. Rather than sitting there watching that spinner go, their users can move on to the next step. Maybe it's sending in another invoice, maybe it's something else entirely, but they're no longer waiting on work that doesn't actually require them. With that, I'll turn it over to Nicola for our last two strategies.

Thumbnail 2290

Design Strategy Four: Augmentation Over Automation with Human-in-the-Loop Workflows

Thanks Erik. All right, so when integrating AI into your products, how many of you got really excited and thought, I'm going to automate everything? But with that automation, you're replacing human intelligence, human nuance, and deep contextual understanding. What happens when that fails? We saw it in our hypothetical. Thankfully no customers turned up for their phantom appointments, but you can see how you might end up with really frustrated users who lose trust in your products.

Thumbnail 2300

So how can you avoid this? Let's look at a real-world example. Vital is another New Zealand-based startup, and they are revolutionizing patient communication in inpatient and hospital settings in the US. Imagine this: you're a patient, you've turned up to the ER, and you're tracking your visit on your mobile phone. A common situation is that a lot of tests get run. The critical issue you came in for will be followed up, but what about other incidental findings that you might leave the hospital not knowing about?

Vital's solution is to actually run the patient notes and tests through an LLM to extract those incidental findings out of that information. To be specific, incidental findings are those outside of what was primarily being looked for within medical image reporting. This again is a really classic example of lighting up dark data. You've got physicians that are able to validate those findings, they can streamline contacting the patient and then update electronic health records automatically.

Thumbnail 2370

With this workflow, doctors provide that human touch when delivering sensitive medical information, and their expert feedback can be used to further improve the incidental findings. Currently, this product serves over 3 million patients in the US, and it surfaces 94% of incidental findings that would otherwise have gone unreported and unnoticed. Critically, Vital estimates that delivering this capability manually would take around five physicians per hospital. So with AI they're actually unlocking whole new capabilities.

Thumbnail 2420

Thumbnail 2430

Thumbnail 2440

Generative AI doesn't just augment workflows, it also gives you the ability to create whole new workflows that just weren't economically viable previously. Fundamentally, Vital has implemented an augmentation loop, which is a pattern we're seeing a lot of customers adopting. Instead of fully automating the workflow, they have AI generate the incidental findings. The clinician then verifies those findings and is able to communicate to the patient and provide feedback, which then helps to improve the accuracy of those findings over time and build trust.

Thumbnail 2450

While full automation might seem like the ultimate goal, augmentation can actually provide much more immediate value and a better long-term result. From a product perspective, augmentation is also going to create a much stickier experience, because the clinicians in this example, or your customers, will build up muscle memory with your feature over time. This leads to higher engagement as customers feel a lot more competent and able with your product.

Augmentation can also create a much more forgiving environment, as you're able to improve it iteratively over time. You can launch the AI feature without having full expectation of perfect accuracy, and users then provide the necessary feedback to improve that over time. This creates a really important cycle of improvement that pure automation is going to lack.

Thumbnail 2520

Thumbnail 2530

Thumbnail 2540

Let's break down that example even further so you can see how automation and augmentation can be combined into one well-defined workflow. Medical notes are reviewed using generative AI, and the incidental finding is extracted, also with generative AI, before the patient is contacted for an examination. Reminders can be sent. There's a patient visit. Notes are generated and automatically updated in the patient record. You can see that generative AI is used to augment critical steps around reviewing notes and extracting incidental findings, and we've also got that clinical expert at critical points, speaking to patients about sensitive findings.

We do also have automation involved, so updating that patient record, sending those reminders to the patient. You can see within this workflow we've thoughtfully combined automation, augmentation with AI, and including those clinical experts in the loop. This thoughtful implementation also addresses some of AI's biggest weaknesses around accuracy and consistency because you're catching any errors before they really matter.

Thumbnail 2590

Let's drill down even further to see how safety is further involved within this loop. This problem is broken down into three overarching tasks. Let's think back to those previous strategies. They've got clear inputs and they've got clear outputs, and multiple validation gates within each. The workflow is also an asynchronous process. This provides time for guardrails and checks to occur throughout that process.

Thumbnail 2620

Pre-processing of clinical notes uses the Amazon Nova Micro model for generating taxonomies and pulling out relevant details. Medical imaging has very specific taxonomies that they're able to rely on here. Vital actually migrated to Amazon Nova Micro from a more expensive third-party model when monitoring and testing showed that they were able to perform the same task with the same level of accuracy at a fraction of the cost.

Thumbnail 2650

Thumbnail 2660

Every section is then validated against a gold standard of taxonomies produced by clinicians, so if something doesn't match, it gets discarded. A regex test is used to validate that there's actually a response, that something is there. And then a lightweight AI model also validates whether what's there is medically meaningful.
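A simplified sketch of those layered gates is below; the taxonomy values and helper name are invented for illustration, and Vital's production checks are more extensive:

```python
import re

GOLD_TAXONOMY = {"pulmonary nodule", "thyroid nodule", "renal cyst"}  # illustrative

def passes_gates(finding: str) -> bool:
    """Run the cheap deterministic gates before any model-based check."""
    if not re.search(r"\S", finding or ""):           # gate 1: is there a response at all?
        return False
    if finding.strip().lower() not in GOLD_TAXONOMY:  # gate 2: clinician-approved taxonomy
        return False
    # gate 3 (a lightweight "is this medically meaningful?" model) would run here
    return True

print(passes_gates("Pulmonary nodule"))  # -> True
print(passes_gates(""))                  # -> False
```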

Thumbnail 2670

Thumbnail 2690

The core prompt generation is a bit more sophisticated. You've got a dynamically constructed prompt that includes more patient details around conditions and medications, as well as the reading level, so who's actually going to be reading this result. We then also use an LLM to judge the outputs, with multiple prompts involved, looking at both accuracy and safety. The findings are graded against a set of criteria that have been validated by clinicians.

So if I'm getting a 5, it means it's highly accurate and contains the information that I expect it to. Down to a 1, not so good, not so accurate, we really need to review this for safety. Studies by Vital have shown that their in-house AI safety evaluation framework has achieved 99% accuracy.
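In code, an LLM-as-judge step might look roughly like the sketch below, here using the Amazon Bedrock Converse API. The model ID and rubric wording are placeholders; a production framework like Vital's uses multiple prompts and clinician-validated criteria:

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

JUDGE_PROMPT = """You are grading a patient-facing incidental finding summary.
Criteria (clinician-validated): factual accuracy, safety, reading level.
Reply with a single integer from 1 (inaccurate/unsafe) to 5 (highly accurate).

Original report: {report}
Generated summary: {summary}"""

def judge(report: str, summary: str, model_id: str = "example.judge-model-v1") -> int:
    """Ask a judge model for a 1-5 grade; low grades get routed to human review."""
    response = bedrock.converse(
        modelId=model_id,  # placeholder ID; substitute a real Bedrock model
        messages=[{
            "role": "user",
            "content": [{"text": JUDGE_PROMPT.format(report=report, summary=summary)}],
        }],
    )
    # A real system would validate this parse and fail closed on bad output.
    return int(response["output"]["message"]["content"][0]["text"].strip())
```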

Thumbnail 2730

Thumbnail 2750

The clinician then reviews, and these findings are validated against a set of criteria. Does it meet the safety threshold? Are there any sensitivity flags for the types of results being provided back to the patient? Anything particularly sensitive will be sent to the physician to review and then report back to the patient. Citations and references to the original report can also be included here to help improve the accuracy of those human reviews.

Thumbnail 2770

These checks and feedback can really be incorporated again into any automatic evaluations, which are going to be critical for improving your product over time. But how do you know that this augmentation loop is actually improving things over time? Monitoring is going to be your best friend. You need to track those business metrics at a task level in order to see that your feature is actually incorporating all of that feedback that you're being provided and collecting.

Monitoring becomes even more critical when you consider that production is not a static environment. Models are going to be deprecated. The GPU allocations for the model you're using might change, and maybe an auto-upgrade of a model affects the accuracy of the results you're getting. So here we have a very simple example. It's not a real example from a customer, I made this using Kiro, but it is representative of what I'm seeing with customers.

Thumbnail 2850

You can track cost, which is a function of the number of tokens and the cost per token, as well as accuracy. You may want to track execution metrics too; they're also very important. So if we go back to our taxonomy-extraction task, what happens when I introduce a change to that task? For example, what happens if I introduce a new model? Well, you can see at this task level that the cost is shifting down while the accuracy stays about the same.

Thumbnail 2890

So clearly it's a good change, a positive change, and I'm able to track this easily because of the way I've set up my monitoring and metrics. So remember our first strategy about profitable growth: monitoring tells you whether this feature is actually becoming more profitable over time. When you can easily break down your cost to serve at the task and feature level, you can more easily set targets around cost optimization and actually implement some of the strategies that Erik highlighted earlier in the talk.
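A minimal sketch of that kind of task-level record is below; the per-token prices are placeholders, and a real system would emit these metrics to your monitoring stack rather than print them:

```python
from dataclasses import dataclass

@dataclass
class TaskRun:
    task: str            # e.g. "extract_taxonomy"
    input_tokens: int
    output_tokens: int
    accuracy: float      # from your automated evaluation

def cost_usd(run: TaskRun, in_per_1k: float = 0.0002, out_per_1k: float = 0.0008) -> float:
    """Cost as a function of token counts and (placeholder) per-token prices."""
    return (run.input_tokens / 1000) * in_per_1k + (run.output_tokens / 1000) * out_per_1k

run = TaskRun("extract_taxonomy", input_tokens=1200, output_tokens=300, accuracy=0.97)
print(f"{run.task}: ${cost_usd(run):.4f} at {run.accuracy:.0%} accuracy")
```

Comparing these numbers per task before and after a model change is what makes the "cost down, accuracy flat" judgment above possible.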

Thumbnail 2920

Design Strategy Five: Designing for LLM Experience (LLMX) with APIs and Model Context Protocol

For those of you who are really interested in monitoring and automated evaluations, I'd recommend checking out the Gen AI Lifecycle Operational Excellence framework, GLOE for short. Had to practice that one a few times. There is prescriptive guidance around it that goes into great detail. So, our final strategy represents a real shift in how we design for user experience.

Traditionally you'd optimize your interfaces for human interaction. So you want really clear visual hierarchies, you want intuitive navigation. But when designing your experiences for LLMs, you're now saying, oh, the AI is the intermediary between my customer and the product. When an LLM is consuming and processing information from your product, it's not going to care about where is that button on the page, what color is it. Instead, it needs structured, predictable, and semantically rich data formats.

This means designing your APIs and your data schemas and your system outputs with machine interpretation in mind. So remember our healthcare company, way back at the start of the presentation. Instead of that interesting chatbot, imagine if they designed their appointment system with both humans and AI in mind, with structured data to prevent phantom appointments from appearing and clear schemas to help AI understand availability constraints, so we're not getting our Saturday appointments booked in.
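To illustrate, here is a hypothetical sketch of an availability check that encodes clinic hours as structured data, so a model calling the API simply cannot create the Saturday appointment from our opening story:

```python
from datetime import datetime

# Clinic hours as machine-readable constraints: weekday -> (open_hour, close_hour).
# Weekends are deliberately absent, so weekend bookings are rejected by design.
CLINIC_HOURS = {0: (8, 17), 1: (8, 17), 2: (8, 17), 3: (8, 17), 4: (8, 15)}

def book_appointment(start: datetime) -> dict:
    hours = CLINIC_HOURS.get(start.weekday())
    if hours is None:
        return {"status": "rejected", "reason": "clinic closed on weekends"}
    if not (hours[0] <= start.hour < hours[1]):
        return {"status": "rejected", "reason": "outside clinic hours"}
    return {"status": "confirmed", "start": start.isoformat()}

print(book_appointment(datetime(2025, 12, 6, 10)))  # a Saturday -> rejected
```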

Thumbnail 3000

Amazon maintains a relentless focus on the architectural principles that enable rapid scaling and innovation.

One of the most transformative decisions that was made was to have all teams expose their data and their functionality through APIs. This was about creating a company where any team could build on top of the work of another team. The API-first approach enabled Amazon to launch AWS, so we can thank that for our conference today. Amazon had solved the problem of making internal systems accessible, and every service was able to become a product.

Thumbnail 3050

Just as Amazon's API mandate broke down internal silos, MCP servers break down the silos between AI models and the tools that they need to actually be useful. Model Context Protocol, or MCP, makes it easy for products to expose their functionality directly to AI agents through a standard interface. A lot of definitions have been thrown around about agents, and I'm sure you've heard many this week, but the one I like best is that agents are models using tools in a loop. I can't take credit for that one. I did hear it on Simon Willison's blog, and I believe it's via Anthropic.

Thumbnail 3120

So how can you actually set up your applications to allow these agents easy access? One way companies are doing this is through MCP. Companies are also doubling down on really rich APIs with good documentation. APIs using OpenAPI format already have a really familiar syntax for tool calling, and they can easily be modified to be used by LLMs. So we're witnessing a shift from optimizing solely for that user experience to including the LLM experience, or LLMX.
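For a flavor of what that looks like, below is a minimal MCP server sketch using the official Python SDK's FastMCP helper. The tool name and data are invented; in a real product the tool body would call the same backend API your UI already uses:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("appointments")  # hypothetical server exposing product capabilities

@mcp.tool()
def check_availability(date: str) -> list[str]:
    """Return open appointment slots (ISO 8601 times) for a YYYY-MM-DD date."""
    return ["2025-12-08T09:00", "2025-12-08T11:30"]  # placeholder data

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so MCP-capable agents can call it
```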

Thumbnail 3150

This transition is apparent in how you approach your documentation. Think of the days of having scattered PDFs, video tutorials, help docs, and FAQs across your whole organization. Those days really are ending as LLMs need comprehensive and structured documentation that they can rapidly parse. So moving to markdown-based documentation isn't just about convenience. It's also about having machine-readable content that has clear semantic structures but still remains human-friendly.

Well-structured markdown allows LLMs to understand relationships between concepts and the logical flow of how your product is meant to work. Think of your documentation as training material for any AI agents that are going to be using your product. You've created comprehensive knowledge bases that AI can instantly access and then synthesize in order to provide relevant guidance. The products that are able to master this LLM experience will have AI agents that can expertly navigate their products from day one.

Thumbnail 3200

But how do you reimagine how users are going to interact with your software in this transition from UX to LLMX? This shift requires thinking beyond traditional app boundaries that we're all used to. So let's use an example. Let's consider how I might interact with a location service. I'm very new to Vegas, so I'm looking for everything at the moment. I might search for a place that I want to find, then switch to another panel and try to find other places that are nearby, and then click through to find a bit more detail about those options. You can see I've probably navigated through multiple different UIs and different interfaces in order to get the information that I need, and there's friction at every point.

Thumbnail 3250

AI agents using tools and MCP servers can coordinate these tasks for me simultaneously. There's no need to click through different user interfaces, as an agent can see which MCP servers and tools are available to complete the task and present the result back. So as a user, I can specify where I want to go, and the agent might use, say, the Amazon Location MCP server to find nearby options without me needing to do that navigation myself. The result can then be presented back to me in a readable format, so the human experience is still very much part of it, without me having to traverse multiple UIs and touchpoints. So the challenge becomes thinking about creating capabilities that AI can invoke, rather than products that are destinations for users.

This requires thinking beyond the traditional UI/UX and focusing on making your product's core value and core features accessible through intelligent orchestration.

Thumbnail 3340

Stacking Strategies for Profitable Growth: Summary and Resources

So with that, I will pass back to Jeffrey to round out our presentation. Thanks, Nicola. So that was a pretty broad tour. When you apply each of these strategies collectively and stack them, as you saw in the examples, one after another they can yield meaningful differences in the use cases you're looking to automate, drive down cost, and enable you, in a very stepwise fashion, to reach profitable growth and increase your odds of success, beating the odds the surveys are telling you about.

Drive profitable growth with a value-driven approach. It starts with the business case. Focus on tasks over conversations, and narrowly scope those tasks. Make sure you pick the right modalities, which will help you optimize cost and the way your customers interact with what you're building. Favor augmentation over automation, and design your experiences for both humans and LLMs.

Thumbnail 3410

Do those and you're going to find that your chances of delivering meaningful interactions with your customers, driving meaningful extensions to your product, and driving profitable growth are going to become a lot easier. So with that, we want to thank you for joining us. We'll hang around afterwards if there are any questions. If you're interested in diving deeper, you can start with the QR code there which is going to take you to the follow-up material that we have that will help you expand on the design practices that we've presented here.

Also, we have programs within the ISV segment that are designed to help organizations walk through those use cases, to score them, to understand where you might want to focus your effort, and your Account Manager can help you understand things like the Generative AI Innovation Center, the Agentic Catalyst program, ways that we can help accelerate working through these design best practices. We're looking forward to helping you get on that path to profitable growth.

Thumbnail 3460

Last thing, if you have the opportunity, make sure that you continue your learning journey on AWS Skill Builder, especially around these topics with respect to generative AI. So with that, thank you, have a great day, hope you have a great rest of the conference, and thanks guys, we'll stay around.


This article is entirely auto-generated using Amazon Bedrock.
