<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jon Goodall</title>
    <description>The latest articles on DEV Community by Jon Goodall (@jdgoodall1).</description>
    <link>https://dev.to/jdgoodall1</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1092908%2F8de42bd6-95df-43e6-8614-0610d6ae76c0.png</url>
      <title>DEV Community: Jon Goodall</title>
      <link>https://dev.to/jdgoodall1</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jdgoodall1"/>
    <language>en</language>
    <item>
      <title>AWS Database Savings Plans – Save Up to 35% – FINALLY!</title>
      <dc:creator>Jon Goodall</dc:creator>
      <pubDate>Thu, 04 Dec 2025 11:07:12 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-database-savings-plans-save-up-to-35-finally-5g68</link>
      <guid>https://dev.to/aws-builders/aws-database-savings-plans-save-up-to-35-finally-5g68</guid>
      <description>&lt;p&gt;It’s AWS Re:Invent right now, and one announcement has me and the rest of the AWS community very excited – AWS &lt;a href="https://aws.amazon.com/blogs/aws/introducing-database-savings-plans-for-aws-databases/" rel="noopener noreferrer"&gt;Database Savings Plans.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’ve been asking for this for as long as I can remember, probably because I’m a bit dull, and also a little stingy…&lt;/p&gt;

&lt;p&gt;You’re likely wondering why I’m so excited about this, and in no small part, it’s because it makes my life easier. It also gives you, AWS customers, another AWS cost optimisation option to save money on your AWS bill, which is always a good thing.&lt;/p&gt;

&lt;p&gt;Before we get into the details about AWS Database Savings Plans, let’s do a bit of a history lesson.&lt;/p&gt;

&lt;h2&gt;
  
  
  A History of AWS Savings Plans
&lt;/h2&gt;

&lt;p&gt;All the way back in 2019, AWS released “&lt;a href="https://aws.amazon.com/savingsplans/compute-pricing/" rel="noopener noreferrer"&gt;Compute Savings Plans&lt;/a&gt;”, and I’ve been a fan since day 1. They make saving money on “compute” (namely EC2, Fargate and, later, &lt;a href="https://dev.to/aws-builders/aws-lambda-use-cases-when-you-should-use-it-5e2e"&gt;Lambda&lt;/a&gt;) much easier.&lt;/p&gt;

&lt;p&gt;Before the Compute Savings Plan was released, if you knew that you were going to keep the same server for 1-3 years, you could lock in a commitment using a Reserved Instance (RI). Savings of 20% were common, and savings of 30% or more were possible. But, if you had plans to move to “modern compute”, you were a bit stuck. Sure, you could do it, but you’d be paying for the RI you’d committed to for the duration of the term, even if you weren’t using it. This was a real barrier to modernisation, because nobody likes paying twice.&lt;/p&gt;

&lt;p&gt;This is an oversimplification, as convertible Reserved Instances exist, which let you trade them for other instance types. Smaller Reserved Instances can also “roll up” to cover larger servers. This has a caveat, though – it only applies if there’s no licence fee built into the hourly spend (sorry, Windows users). But in essence, you were stuck managing servers.&lt;/p&gt;

&lt;p&gt;You could move to containers, but you had to keep using EC2 to run the containers on, which was a headache and added engineering time.&lt;/p&gt;

&lt;p&gt;Compute Savings Plans are different, though – you commit to an hourly spend and save money. Literally, that’s it.&lt;/p&gt;

&lt;p&gt;OK, it’s a spend within the three supported services (EC2, Lambda and Fargate), but so long as you’re spending money on one of those things, you’d be getting the discounted pricing. Purchase terms are the same – commit to longer and pay more up front to save more money. However, the 0% upfront 1-year plan is incredibly compelling, so I default to recommending it.&lt;/p&gt;

&lt;p&gt;Compute Savings Plans aren’t perfect, though. My biggest gripe was that they didn’t support database spend. You might argue it’s a storage service, not a compute service, but tell a developer that. The line between “storage” and “compute” is so thin you can see through it at this point. My second biggest issue was the hourly commitment rather than the daily one. With a daily commitment you can account much better for flexible workload trends, but you can’t have everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cool, history done, what’s new?
&lt;/h2&gt;

&lt;p&gt;As of the 2nd of December 2025, announced to great cheers from the audience in Matt Garman’s re:Invent 2025 keynote, AWS Database Savings Plans are a thing! This is “A Very Good Thing”. He really did save the best ’til last, with only 2 seconds left on the ‘shot clock’!&lt;/p&gt;

&lt;p&gt;AWS Database Savings Plans work in a very similar way to Compute Savings Plans – commit to an hourly spend in “supported usage” and save money.&lt;/p&gt;

&lt;p&gt;Purchase terms are similar, but currently you can only commit to a 1-year term (come on AWS, give us 3 years!), and we’re also only offered a no-upfront payment option at launch. I’d love to see increasing discounts for committing for longer and paying up front, as you can with Compute Savings Plans. Despite this, there are still some serious discounts available here, and the best bit is it covers serverless too – with a discount of up to 35%! That’s massive and really cements the idea for me that you should start on serverless options until your per-hour cost outweighs the “scale to zero” benefit. AWS are also pushing ‘&lt;a href="https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/manage-advancepay.html" rel="noopener noreferrer"&gt;Advance Pay&lt;/a&gt;’ as a way to pay up front for your database services, but there’s no discount for doing this, so I’m not sure why you’d bother.&lt;/p&gt;
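
&lt;p&gt;If you want to see how the mechanics shake out, here’s a rough sketch in Python. The commitment, usage and discount numbers are made up for illustration – they’re not AWS pricing, and the exact rate per service and instance type will vary:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative only: the rough shape of how a Savings Plan commitment applies.
def hourly_charge(on_demand_usage, commitment, discount):
    # You pay the hourly commitment every hour, whether you use it or not.
    # The commitment covers on-demand usage up to commitment / (1 - discount);
    # anything above that is billed at normal on-demand rates.
    covered_usage = commitment / (1 - discount)
    overage = max(on_demand_usage - covered_usage, 0)
    return commitment + overage

# e.g. $10/hour of eligible database usage, a $6.50/hour commitment, 35% discount
print(hourly_charge(10.0, 6.5, 0.35))  # 6.5 instead of 10.0 on demand
&lt;/code&gt;&lt;/pre&gt;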

&lt;p&gt;They also have day-1 support in the Savings Plan Purchase Analyzer – I waxed lyrical about the Analyzer on an episode of the &lt;a href="https://www.youtube.com/@logicata" rel="noopener noreferrer"&gt;Logicast AWS News Podcast&lt;/a&gt;, so having it from launch is a really nice touch. You’d better believe that if it didn’t have it, I’d be complaining about it!&lt;/p&gt;

&lt;h2&gt;
  
  
  Sounds Great, What’s the Catch?
&lt;/h2&gt;

&lt;p&gt;The AWS Database Savings Plan isn’t a perfect offering, as it still suffers from my second gripe of Compute Savings Plans – the hourly vs. daily commitment.&lt;/p&gt;

&lt;p&gt;It does also muddy the water a bit, as we now have four different types of Savings Plan:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compute Savings Plan&lt;/li&gt;
&lt;li&gt;EC2 Savings Plan&lt;/li&gt;
&lt;li&gt;Database Savings Plan&lt;/li&gt;
&lt;li&gt;SageMaker Savings Plan&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is very easy to solve though: AWS just needs to release an overall “Savings Plan” that covers Compute &amp;amp; Database, whilst dropping the EC2 Savings Plan offer. I’ve never found EC2 Savings Plans useful, sitting as they do between Reserved Instances and Compute Savings Plans, but maybe some people do. I also don’t use SageMaker enough to have a considered opinion on SageMaker Savings Plans, so they get to stay for now.&lt;/p&gt;

&lt;p&gt;I’m sure it’s not very easy for AWS to do this, for a myriad of internal and technical reasons, but we can but dream.&lt;/p&gt;

&lt;p&gt;Now, onto the “supported services” list. This is confusing. It covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RDS&lt;/li&gt;
&lt;li&gt;Aurora&lt;/li&gt;
&lt;li&gt;DynamoDB&lt;/li&gt;
&lt;li&gt;ElastiCache&lt;/li&gt;
&lt;li&gt;DocumentDB&lt;/li&gt;
&lt;li&gt;Neptune&lt;/li&gt;
&lt;li&gt;Keyspaces&lt;/li&gt;
&lt;li&gt;Timestream&lt;/li&gt;
&lt;li&gt;Database Migration Service (who’s running that for a whole year?).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a massive list of services for day 1 – remember, Compute Savings Plans only covered EC2 &amp;amp; Fargate at launch.&lt;/p&gt;

&lt;p&gt;However, it’s not all spend within those services that counts. Got a Redis cluster? Sorry, only Valkey is supported. Using a t4g RDS instance? No discount for you. In fact, anything that uses ‘servers’ is only eligible to be included in a Database Savings Plan if it’s using the latest instance types (r7g, m7g, m7i, m8g, etc). This is very frustrating, as many people I’ve worked with need a 24/7 non-prod environment, but only need t4g instances, for example.&lt;/p&gt;

&lt;p&gt;The serverless offering somewhat redeems this, as it’s just per-CU (Capacity Unit) hour. This is a much better offering than the current option of “nothing” and goes a long way to solving for “I can’t do serverless, it’s more expensive under consistent load”. This has been a real issue for me personally, as I’m a big advocate of serverless-first, but I couldn’t honestly recommend it for production workloads. Either the warmup time was too long for transactional workloads, or the constant throughput was too expensive without being able to make a committed purchase.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts on AWS Database Savings Plans
&lt;/h2&gt;

&lt;p&gt;I’m willing to forgive the complicated in-scope vs. out-of-scope spend on this one, considering the vast array of services that are covered. Also this is a V1 offering so I’m sure it will evolve to include more services, and more payment options, as per the other savings mechanisms.&lt;/p&gt;

&lt;p&gt;This also doesn’t solve for “do I buy a Reserved Instance or an AWS Database Savings Plan”, but the gap is closing, and I’m looking forward to seeing more things come into scope in the future. AWS have committed to including newly released instance types as they become available, so I’ll just have to upgrade my boxes, I guess.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>database</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Building a Serverless Podcast Workflow: Adventures with AI</title>
      <dc:creator>Jon Goodall</dc:creator>
      <pubDate>Mon, 30 Dec 2024 14:09:50 +0000</pubDate>
      <link>https://dev.to/aws-builders/building-a-serverless-podcast-workflow-adventures-with-ai-l1f</link>
      <guid>https://dev.to/aws-builders/building-a-serverless-podcast-workflow-adventures-with-ai-l1f</guid>
      <description>&lt;p&gt;As you may know, I’m a co-host &amp;amp; standing guest on the &lt;a href="https://www.logicata.com/follow/" rel="noopener noreferrer"&gt;Logicast AWS News Podcast&lt;/a&gt;, where we discuss all things in the news about AWS.&lt;/p&gt;

&lt;p&gt;What you probably don’t know is that the preparation &amp;amp; production of a podcast is rather a lot of work, and that anything to speed up &amp;amp; simplify the process is absolutely necessary – especially for a weekly podcast.&lt;/p&gt;

&lt;p&gt;On the preparation side, being about recent AWS news helps because we don’t have to do as much work on the research side – we just turn up &amp;amp; record. This doesn’t help us on the production side though, which is still a lot of work.&lt;/p&gt;

&lt;p&gt;To give you a flavour, the things we have to do every week are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pick the articles (and ideally read them)&lt;/li&gt;
&lt;li&gt;Share with the guest and answer any questions they might have&lt;/li&gt;
&lt;li&gt;Record the episode&lt;/li&gt;
&lt;li&gt;Download the files&lt;/li&gt;
&lt;li&gt;Convert the files into the correct formats&lt;/li&gt;
&lt;li&gt;Create a trailer&lt;/li&gt;
&lt;li&gt;Create a summary (the “show notes”, in podcasting parlance)&lt;/li&gt;
&lt;li&gt;Upload to the publishing platform&lt;/li&gt;
&lt;li&gt;Social promotion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On top of this, because “content is king”, we want to be able to re-use the episode as much as possible. Our current wishlist is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create short “clips” for social posting to “drip feed” the content and drive subscribers&lt;/li&gt;
&lt;li&gt;Create a long-form blog post from the recording, that isn’t just a transcript&lt;/li&gt;
&lt;li&gt;Add full subtitles to each video.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As with any problem, there were a few options to solve both the required tasks, and start on the wishlist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Outsource it.&lt;/strong&gt;&lt;br&gt;
This looks like a combination of things, from hiring a production &amp;amp; marketing person (much love to Alicja for the work she does), to using 3rd-party tools to help with some of the creation (I’m not linking the tool, because they’re not paying us, but we use an AI service for clip/trailer creation).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 2: Automate All The Things&lt;/strong&gt;&lt;br&gt;
Obviously I want to do this, because I’m an engineer, and in my head my time is free. I’m sure Logicata disagrees with me here though…&lt;br&gt;
However, throw in the fact that we “needed” a reason to talk about AI, and we thought we’d better have a go at doing “something”.&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter the Workflow
&lt;/h2&gt;

&lt;p&gt;Now, I’m a Serverless AWS Community Builder, so obviously I went straight for Lambda &amp;amp; Step Functions here. I started playing around with options, and for once, doing some research. I know! Didn’t see that coming either.&lt;/p&gt;

&lt;p&gt;Up in this rarified research-fueled air, I found this AWS blog: &lt;a href="https://aws.amazon.com/blogs/machine-learning/create-summaries-of-recordings-using-generative-ai-with-amazon-bedrock-and-amazon-transcribe/" rel="noopener noreferrer"&gt;https://aws.amazon.com/blogs/machine-learning/create-summaries-of-recordings-using-generative-ai-with-amazon-bedrock-and-amazon-transcribe/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This was a really good foundation for what we needed/wanted to build; it even had a sample project at the time, which let me short-circuit hours of dev time.&lt;/p&gt;

&lt;p&gt;After a bit of tweaking, I came up with this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faj0h1afym9s4nlmuh3r2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faj0h1afym9s4nlmuh3r2.png" alt="Serverless AI Workflow V1" width="376" height="772"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yes, that’s a big scary image, so let’s break it down.&lt;/p&gt;

&lt;p&gt;The process is:&lt;/p&gt;

&lt;h5&gt;
  
  
  Step 1: Kick-off with File Upload
&lt;/h5&gt;

&lt;p&gt;We start by uploading an m4a file to an S3 bucket, and use the bucket notification to trigger the workflow.&lt;br&gt;
I have to download the files from the recording platform, which isn’t a problem, but is a bit annoying.&lt;/p&gt;
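
&lt;p&gt;For a flavour of how small the trigger is, here’s a minimal sketch of the kind of Lambda the bucket notification invokes to start the state machine. The environment variable name and input shape are placeholders, not necessarily what we run:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json
import os
import urllib.parse

import boto3

sfn = boto3.client("stepfunctions")

def handler(event, context):
    # Triggered by the S3 bucket notification when the m4a lands.
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

    # Kick off the Step Functions workflow with the uploaded object as its input.
    sfn.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],
        input=json.dumps({"bucket": bucket, "key": key}),
    )
&lt;/code&gt;&lt;/pre&gt;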

&lt;h5&gt;
  
  
  Step 2: Media Conversion
&lt;/h5&gt;

&lt;p&gt;AWS Elemental MediaConvert transforms the m4a file into an mp3 format, ready for Spotify and other platforms.&lt;br&gt;
We have to do this because the recording platform delivers an m4a, but most audio platforms prefer an mp3.&lt;/p&gt;

&lt;p&gt;This is a fire-and-forget approach, so I’m manually checking for the job completion and downloading the file afterwards. Again, not a problem but somewhat annoying.&lt;/p&gt;

&lt;h5&gt;
  
  
  Step 3: Transcription
&lt;/h5&gt;

&lt;p&gt;Amazon Transcribe converts audio into a text-based JSON document. This is actually the most expensive part of the process, which I didn’t expect at the outset.&lt;/p&gt;
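
&lt;p&gt;Kicking off the transcription is a single asynchronous API call – something along these lines, with the language, format and output bucket as assumptions for illustration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import boto3

transcribe = boto3.client("transcribe")

def start_transcription(bucket, key, job_name):
    # Asynchronous job; the JSON transcript is written back to S3 when it finishes.
    transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": f"s3://{bucket}/{key}"},
        MediaFormat="mp3",
        LanguageCode="en-GB",
        OutputBucketName=bucket,
    )
&lt;/code&gt;&lt;/pre&gt;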

&lt;h5&gt;
  
  
  Step 4: Run the prompts
&lt;/h5&gt;

&lt;p&gt;Amazon Bedrock reads the transcription and generates summaries and titles using prompts stored in DynamoDB.&lt;br&gt;
Since building this, prompt manager became a thing, because as everyone knows the best way to get AWS to create a new feature is to build it yourself first.&lt;/p&gt;

&lt;p&gt;This is all in one Lambda, using a loop in Python. I regret this enormously but it was the quickest option.&lt;/p&gt;
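
&lt;p&gt;That v1 Lambda looked roughly like the sketch below – prompts pulled from DynamoDB, then run serially against the model. The table name and model ID are placeholders (v1 was actually on an older Claude, which used a different request body), and the request follows the Anthropic Messages format Bedrock expects for the Claude 3 family:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json

import boto3

dynamodb = boto3.resource("dynamodb")
bedrock = boto3.client("bedrock-runtime")

def handler(event, context):
    # Fetch every prompt, then run them serially against the model in one go.
    prompts = dynamodb.Table("podcast-prompts").scan()["Items"]
    transcript = event["transcript"]

    results = []
    for prompt in prompts:
        body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [
                {"role": "user", "content": prompt["text"] + "\n\n" + transcript}
            ],
        }
        response = bedrock.invoke_model(
            modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder
            body=json.dumps(body),
        )
        output = json.loads(response["body"].read())
        results.append(output["content"][0]["text"])
    return results
&lt;/code&gt;&lt;/pre&gt;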

&lt;h5&gt;
  
  
  Step 5: Outputs
&lt;/h5&gt;

&lt;p&gt;The final outputs are sent to an SNS topic for easy access.&lt;br&gt;
We have a Slack channel email subscribed to the topic, so the messages aren’t lost in inboxes.&lt;br&gt;
We went with SNS &amp;amp; email both because the baseline I used was already doing it, and I couldn’t be bothered to work out the schema for AWS Chatbot. I should probably do this though.&lt;/p&gt;

&lt;p&gt;Obviously this isn’t our full wishlist, or even the complete set of required tasks. However, with careful prompting, it does make the required tasks a lot faster to do. The summary is a good prompt to create the show notes &amp;amp; the LLM creates the title – sometimes we use it, sometimes not.&lt;/p&gt;

&lt;h2&gt;
  
  
  There must be some problems though?
&lt;/h2&gt;

&lt;p&gt;You would be correct there. The issue comes back to my time – it’s only free in my head. Turns out, building this sort of thing takes rather a lot of time &amp;amp; effort, so it has a number of “rough edges”.&lt;/p&gt;

&lt;p&gt;Chiefly:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. It’s really fragile.&lt;/strong&gt;&lt;br&gt;
Seriously, one dodgy prompt, or an episode that runs a touch long, bang. All falls over, nothing comes out the end. Lately we’ve been hitting rate limits too, presumably because we’re on an ancient version of Claude.&lt;br&gt;
This is mostly because I’m hacking it together, and not spending a proper amount of time on it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Transcription is expensive, and the workflow must restart on error.&lt;/strong&gt;&lt;br&gt;
Again, because it’s fragile, the re-runs have to start at the beginning. What’s worse, because most of the failures are prompt-based and errors in the model invocation are only checked after-the-fact in a downstream task, I can’t take the offending prompt out and re-drive from the failure, thus forcing a full re-run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. It’s kinda slow&lt;/strong&gt;&lt;br&gt;
Nothing doing here, it’s just slow. No parallelisation of the prompts (due to the aforementioned bad Python loop), and a single lambda taking every output response and dumping it onto SNS at the same time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Improvements.
&lt;/h2&gt;

&lt;p&gt;OK, we’ve run this for a few months (eek), and I’ve even delivered a whole talk on it (see that &lt;a href="https://youtu.be/IUSKn8YZn68?si=69rANdfwM27XWEzq&amp;amp;t=2386" rel="noopener noreferrer"&gt;here&lt;/a&gt;), so I should probably do something about fixing these rough edges. This was the list:&lt;/p&gt;

&lt;h3&gt;
  
  
  Improvement 1:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Update the model:&lt;/strong&gt;&lt;br&gt;
Annoyingly the interface between Claude versions has changed, so some faffing around is needed here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Improvement 2:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Use Prompt Manager:&lt;/strong&gt;&lt;br&gt;
What it says on the tin. No more dodgy DynamoDB table for the prompts, use the service properly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Improvement 3:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Fix the bad loop:&lt;/strong&gt;&lt;br&gt;
Take the loop through the prompts out of a single Lambda, and run them all as individual Lambdas, called using a Map state.&lt;br&gt;
This also solves for speed, as the prompts are the second slowest part of the workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Improvement 4:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Less fragility:&lt;/strong&gt;&lt;br&gt;
Through the judicious use of “ignoring errors”, we want to be able to run all the prompts and get outputs, even if one (or most) of them fail.&lt;/p&gt;

&lt;h3&gt;
  
  
  Improvement 5:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File conversion result in Slack&lt;/strong&gt;&lt;br&gt;
Still using SNS -&amp;gt; email, but now we’re checking for the conversion job, creating an S3 pre-signed URL and sending that to the SNS topic as soon as it’s available. The pre-signed URL lasts for a couple of hours.&lt;/p&gt;
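
&lt;p&gt;The pre-signed URL bit is only a few lines of boto3 – something like this, with the bucket, key, topic ARN and the two-hour expiry all standing in as example values:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import boto3

s3 = boto3.client("s3")
sns = boto3.client("sns")

def publish_download_link(bucket, key, topic_arn):
    # Generate a time-limited download link for the converted mp3...
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=7200,  # roughly "a couple of hours"
    )
    # ...and drop it onto the SNS topic that feeds the Slack channel email.
    sns.publish(TopicArn=topic_arn, Subject="Episode mp3 ready", Message=url)
&lt;/code&gt;&lt;/pre&gt;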

&lt;h2&gt;
  
  
  Yes yes, Show us a picture:
&lt;/h2&gt;

&lt;p&gt;Fine, something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsk3lyd0p9w51nf573g7a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsk3lyd0p9w51nf573g7a.png" alt="Image description" width="800" height="1031"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Did it work?
&lt;/h2&gt;

&lt;p&gt;Well, no. Not quite.&lt;br&gt;
With the improvement list as a starting place, I ended up here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzc1ek8k8htyhlkq7muu1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzc1ek8k8htyhlkq7muu1.png" alt="Serverless AI Workflow V2" width="800" height="727"&gt;&lt;/a&gt;&lt;br&gt;
Bigger and scarier than before I know, but it can’t be helped – we’re just doing more stuff now.&lt;br&gt;
Let me walk you through it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Steps 1-3: Kick Off:
&lt;/h3&gt;

&lt;p&gt;All still the same – kick off with the upload of the m4a file, trigger the transcoding &amp;amp; transcription.&lt;br&gt;
The only slight tweak is that the transcoding trigger now returns the job ID, so I can use it later.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Parallel State:
&lt;/h3&gt;

&lt;p&gt;Now we actually use the parallel container I built earlier and split into two branches – one for the transcription &amp;amp; LLM invocations, and the other for the transcoding.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5 (Transcoding Branch):
&lt;/h3&gt;

&lt;p&gt;Nothing massively clever here, just a loop in the step function based on an if/else/continue premise to check the status of the transcoding job.&lt;br&gt;
If it’s not done, loop around again and wait some more; if it’s completed, generate a pre-signed URL and send it to SNS; if it failed, send an error to SNS but don’t halt the step function.&lt;br&gt;
This last bit is important – as far as possible we’re not halting the step function for errors in the process, especially on the lower-value task.&lt;/p&gt;
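
&lt;p&gt;The status check itself is one MediaConvert call from a small Lambda, with the Choice state branching on what comes back. This is a sketch – endpoint discovery is done the simple way, and the returned field names are just whatever the next state expects:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import boto3

# MediaConvert uses an account-specific endpoint; look it up once at cold start.
generic = boto3.client("mediaconvert")
endpoint = generic.describe_endpoints(MaxResults=1)["Endpoints"][0]["Url"]
mediaconvert = boto3.client("mediaconvert", endpoint_url=endpoint)

def handler(event, context):
    # Status is one of SUBMITTED, PROGRESSING, COMPLETE, CANCELED or ERROR;
    # the Choice state downstream loops, publishes the URL, or reports the error.
    job = mediaconvert.get_job(Id=event["jobId"])["Job"]
    return {"jobId": event["jobId"], "status": job["Status"]}
&lt;/code&gt;&lt;/pre&gt;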

&lt;h3&gt;
  
  
  Step 5 (LLM Branch):
&lt;/h3&gt;

&lt;p&gt;Now we grab the prompts in their own Lambda, but they’re still from DynamoDB, because I couldn’t fathom prompt manager in the few evenings I had to spend on this.&lt;br&gt;
Same goes for the direct SDK integration between DDB &amp;amp; Step Functions really.&lt;/p&gt;

&lt;p&gt;I’m sure some of the Serverless DevAdvocates, AWS Heroes &amp;amp; Community Builders I know would dislike me for this, but I didn’t see the benefit of it here.&lt;br&gt;
Using Lambda Powertools in Python, grabbing the list is 4 lines of code. Plus I’d already written said code in v1, so I kept it.&lt;/p&gt;
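
&lt;p&gt;For the curious, those few lines look something like this – assuming a table laid out the way the Powertools parameters provider expects (partition key “id”, sort key “sk”, value attribute “value”); our real table name and layout may differ:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from aws_lambda_powertools.utilities import parameters

# Assumes the table is laid out the way the provider expects:
# partition key "id", sort key "sk", value attribute "value".
prompt_store = parameters.DynamoDBProvider(table_name="podcast-prompts")

def get_prompts():
    # One call returns every prompt stored under the "prompts" partition key.
    return prompt_store.get_multiple("prompts")
&lt;/code&gt;&lt;/pre&gt;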

&lt;h3&gt;
  
  
  Step 6 (LLM Branch):
&lt;/h3&gt;

&lt;p&gt;Much the same as before, but with another loop – not a map state.&lt;/p&gt;

&lt;p&gt;It turns out the rate limiting wasn’t because Claude v2 is ancient. It’s because all of Bedrock has really low rate limits.&lt;br&gt;
This means that we’re not solving for speed, but we are solving for rate limits, with an unlimited number of prompts, so that’s something.&lt;/p&gt;

&lt;p&gt;On each execution of the “Invoke Bedrock Model” Lambda we drop the prompt we’ve just run from the list, as a quick-and-dirty ‘for loop’. With some time this could be cleaned up a bit, but for now it works.&lt;/p&gt;
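
&lt;p&gt;In sketch form, each iteration runs exactly one prompt and hands the shrinking list back to the state machine. The model ID and field names below are placeholders rather than the exact contract we use:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json

import boto3

bedrock = boto3.client("bedrock-runtime")

def handler(event, context):
    # Run exactly one prompt per invocation; the state machine loops on the rest.
    prompts = event["prompts"]
    prompt = prompts.pop(0)

    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt + "\n\n" + event["transcript"]}],
    }
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder
        body=json.dumps(body),
    )
    result = json.loads(response["body"].read())["content"][0]["text"]

    # Whatever is left goes back around the iterator; an empty list ends the branch.
    return {"prompts": prompts, "transcript": event["transcript"], "result": result}
&lt;/code&gt;&lt;/pre&gt;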

&lt;p&gt;Also, we’re using Claude 3.5 Sonnet V1, and have designs on both V2 (or V3.5 Opus when that eventually comes out), and Amazon Nova Pro, as the outputs in the console looked encouraging.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 7 (LLM Branch):
&lt;/h3&gt;

&lt;p&gt;You’ll notice that a couple of states have been removed, namely the direct SDK integration with SNS for sending the results, and the “end error” state.&lt;br&gt;
This reduces the re-run cost by allowing me to re-drive the state machine from the point of error in the case of hitting a rate limit – which was 90% of our errors in v1.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 8 (LLM Branch):
&lt;/h3&gt;

&lt;p&gt;Back around to the iterator we go, but this time with an arbitrary 2 minute sleep.&lt;/p&gt;

&lt;p&gt;This gets us around the 1 invocation-per-minute rate we’re working with, but I could do something a bit smarter here with error code checking &amp;amp; exponential backoff.&lt;/p&gt;
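
&lt;p&gt;For what “smarter” could look like, a hedged sketch: catch the throttling error and back off exponentially instead of sleeping a flat 2 minutes. The error code is the one Bedrock returns for rate limiting; the retry counts and delays are arbitrary:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import time

from botocore.exceptions import ClientError

def invoke_with_backoff(invoke, max_attempts=5):
    # "invoke" is any zero-argument callable wrapping the invoke_model call.
    for attempt in range(max_attempts):
        try:
            return invoke()
        except ClientError as error:
            if error.response["Error"]["Code"] != "ThrottlingException":
                raise
            # Back off 2, 4, 8, 16... seconds before trying again.
            time.sleep(2 ** (attempt + 1))
    raise RuntimeError("still throttled after retries")
&lt;/code&gt;&lt;/pre&gt;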

&lt;p&gt;The iterator is well-trodden at this point – just check if the prompts list still has prompts in it, and go around again. If it’s now empty, finish the branch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 9 (Both Branches):
&lt;/h3&gt;

&lt;p&gt;End.&lt;br&gt;
Both branches are now done, so we close out.&lt;/p&gt;

&lt;h2&gt;
  
  
  So, how is this better?
&lt;/h2&gt;

&lt;p&gt;Well for one I don’t have to sit and wait for the transcoding to finish. The pre-signed URL is dropped straight into Slack for me to grab, so that’s nice.&lt;/p&gt;

&lt;p&gt;Also, we can run an unlimited number of prompts, and shouldn’t get rate-limited anywhere near as often – if we do, re-drive from failure covers the restart without having to re-do the expensive transcoding &amp;amp; transcribing.&lt;/p&gt;

&lt;p&gt;The updated model performs loads better, and because the interface for all Claude v3/3.5 models is the same, I have a route to make each prompt run under a different model – which I thought was the idea behind Bedrock to start with, but seems to be harder than I thought it would be.&lt;/p&gt;

&lt;p&gt;Also we have monitoring, sort of.&lt;br&gt;
I put a small CloudWatch alarm on the failures of the step function (well, I had Q Developer write it actually – can’t avoid using AI in this project), which also sends to the same SNS topic. That way I can just upload the file to S3 and get on with other things, without having to babysit the workflow.&lt;/p&gt;
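
&lt;p&gt;The alarm itself is tiny – roughly this, with the ARNs as placeholders:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm whenever any execution of the workflow fails, notifying the same SNS topic.
cloudwatch.put_metric_alarm(
    AlarmName="podcast-workflow-failed",
    Namespace="AWS/States",
    MetricName="ExecutionsFailed",
    Dimensions=[{"Name": "StateMachineArn", "Value": "arn:aws:states:REGION:ACCOUNT:stateMachine:podcast"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:REGION:ACCOUNT:podcast-notifications"],
    TreatMissingData="notBreaching",
)
&lt;/code&gt;&lt;/pre&gt;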

&lt;p&gt;And of course I have an update on the project I can write a talk for, so I’d best start shopping that around local meetup groups, I guess.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s Next?
&lt;/h2&gt;

&lt;h4&gt;
  
  
  Expand the Iterator
&lt;/h4&gt;

&lt;p&gt;I still want to be able to run a different model for each prompt, because models aren’t one-size-fits-all, and I’d like an easy way to test lots of different models on the same prompt.&lt;/p&gt;

&lt;h4&gt;
  
  
  Use Prompt Manager
&lt;/h4&gt;

&lt;p&gt;Still not using this, and I really should be.&lt;/p&gt;

&lt;h4&gt;
  
  
  Resiliency
&lt;/h4&gt;

&lt;p&gt;We’re in a better place than we were, but it’s still not as good as I’d like it to be. Ideally we’ll handle rate limit exceptions via a retry and exponential backoff, plus have proper alerting rather than a single alert for the whole Step Function.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finally, What did we learn?
&lt;/h2&gt;

&lt;p&gt;Rather a lot, as it happens.&lt;/p&gt;

&lt;h4&gt;
  
  
  Time &amp;amp; Effort Needed
&lt;/h4&gt;

&lt;p&gt;Phase 1 showed the sheer amount of effort needed to get these things going, even with a big jumping-off point from AWS. This is compounded by the fact that this is a marketing/hobby project, so doesn’t get a lot of time spent on it. Phase 2 just compounded that lesson – it took the best part of a day to make the change, split across several evenings, and it’s not that different from phase 1.&lt;/p&gt;

&lt;h4&gt;
  
  
  Pace of Change
&lt;/h4&gt;

&lt;p&gt;The pace of change within LLMs is really high – between v1 &amp;amp; v2 there were 6 different models released just for Claude, so keeping up with the current models is a challenge all by itself. Once you start thinking about other model providers (looking at you Amazon Nova), it’s a whole different challenge.&lt;/p&gt;

&lt;h4&gt;
  
  
  LLMs are Non-Deterministic
&lt;/h4&gt;

&lt;p&gt;So we knew that already from the documentation, but in practice it can be really frustrating to not have a consistent output between executions, and you need to be aware of it when developing against them.&lt;/p&gt;

&lt;h4&gt;
  
  
  Skillset
&lt;/h4&gt;

&lt;p&gt;By day I’m an SRE/Platform Engineer/Generalist Cloud Engineer, not a developer and certainly not an AI/LLM expert, so this was a challenge – both dusting off my Serverless developer skills and learning how to interface with Bedrock. Fortunately AWS have done a really good job of making it an easy service to consume, and I highly recommend you start with the chat interface in the console to test your prompts.&lt;/p&gt;

&lt;p&gt;The other recommendation I’d have for you is to dive in – after v1 I gravitated much more towards AI/LLM talks and workshops at the various AWS conferences I’ve attended this year (London Summit, London Partner Summit, Re:Invent), which I got much more out of for having a baseline level of knowledge, thanks to this project.&lt;/p&gt;

&lt;h4&gt;
  
  
  LLMs Have Rate Limits
&lt;/h4&gt;

&lt;p&gt;Well, yes, you might say. However I didn’t appreciate just how low they are in Bedrock.&lt;br&gt;
When you think about it, it makes sense, and our usage puts a very high number of tokens through the model in a short space of time. But you do need to be aware of them, and handle them appropriately in your own implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  So, What are My Tips?
&lt;/h2&gt;

&lt;p&gt;Hopefully you can learn from my mistakes here, but if you want to short-circuit this whole “learning by doing” thing, I’d recommend:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Go to a couple of workshops before getting going.&lt;/strong&gt;&lt;br&gt;
They don’t have to be in-person, and could be watching something on YouTube after-the-fact, but for a good portion of phase 1 I struggled with just understanding the new terminology I needed.&lt;br&gt;
&lt;strong&gt;2. Test your prompts&lt;/strong&gt;&lt;br&gt;
I said this above but it bears repeating – use the console to test your prompts and see what sort of output you’re likely to get. It’s much cheaper to do this than run a whole transcription &amp;amp; transcoding workflow for the sake of changing a single prompt.&lt;br&gt;
&lt;strong&gt;3. Try to do model evaluation&lt;/strong&gt;&lt;br&gt;
I didn’t do this, because the project I based this on had already done it, and settled on Claude 2. I regret not going through the process to get a better understanding of why Claude 2 was the correct choice at the time though. You’ll also learn a lot about the various models in the process, which might be useful for another project.&lt;br&gt;
&lt;strong&gt;4. Request model access up front&lt;/strong&gt;&lt;br&gt;
The “non-AWS” models aren’t instantly approved when you request them, so save yourself some time and request them early.&lt;br&gt;
&lt;strong&gt;5. Check the rate limits&lt;/strong&gt;&lt;br&gt;
These are different between models &amp;amp; regions, so you can’t assume that the same thing will work if you port it to another region.&lt;br&gt;
&lt;strong&gt;6. Be aware of time&lt;/strong&gt;&lt;br&gt;
If you’re new to LLM development, this is a learning curve that you’ll need to climb, so be patient with yourself. Doubly so if you’re not a developer by day.&lt;br&gt;
&lt;strong&gt;7. Learn by doing&lt;/strong&gt;&lt;br&gt;
Hopefully by reading this you can go further and faster than I did, but there’s no substitute for building things when trying to learn.&lt;/p&gt;

&lt;p&gt;To wrap this up I think I’ll quote Amazon CTO Dr. Werner Vogels:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Now, Go Build&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>serverless</category>
      <category>ai</category>
      <category>podcast</category>
    </item>
    <item>
      <title>AWS Lambda Use Cases: When You Should Use It?</title>
      <dc:creator>Jon Goodall</dc:creator>
      <pubDate>Wed, 31 May 2023 14:13:41 +0000</pubDate>
      <link>https://dev.to/aws-builders/aws-lambda-use-cases-when-you-should-use-it-5e2e</link>
      <guid>https://dev.to/aws-builders/aws-lambda-use-cases-when-you-should-use-it-5e2e</guid>
      <description>&lt;p&gt;Lambda, and Serverless in general, is rather “in” right now in the world of cloud computing. If you listened to all the marketing coming out from the big names about it (and yes, I’m guilty of this too); you’d expect that you can run your whole service on it. For next-to-nothing, with no downtime, and your deployments would be as smooth as silk.&lt;/p&gt;

&lt;p&gt;So, how much of the marketing spiel should you listen to – how do you know when to use Lambda? Well, I’m going to try and come up with a reasonable list of use cases for AWS Lambda, so that’s a good place to start.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr05x6kgmsnqxmcrnl48.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsr05x6kgmsnqxmcrnl48.png" alt="AWS Lambda Logo" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Lambda?
&lt;/h2&gt;

&lt;p&gt;Before we get into “what’s it for”, it’s worth defining “what it is”, so let’s do that.&lt;/p&gt;

&lt;p&gt;AWS Lambda is AWS’s take on “Function as a Service” (FaaS). It allows developers to run code without provisioning or managing servers. With AWS Lambda, developers can upload their code, and the service will take care of the rest – including scaling, patching, and availability.&lt;/p&gt;

&lt;p&gt;The idea behind AWS Lambda is to make it easier for developers to build scalable, event-driven applications that run on the cloud. The service is highly available and fault-tolerant, which means it can handle large amounts of traffic without crashing or experiencing downtime. One of the key benefits of using AWS Lambda is that it is fully managed. This means that developers don’t have to worry about managing hardware or operating systems. They can focus on building applications, while AWS Lambda takes care of the underlying infrastructure.&lt;/p&gt;

&lt;p&gt;AWS Lambda supports a variety of programming languages, including Java, Python, Node.js, C#, and Go. This makes it easy for developers to write code in the language they are most comfortable with, without having to learn a new language or platform.&lt;/p&gt;
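
&lt;p&gt;If you’ve never seen one, the unit you actually deploy is just a handler function. A minimal (and deliberately boring) Python example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def handler(event, context):
    # The unit of deployment: a function that receives an event and returns a result.
    # What "event" looks like depends entirely on the service that invokes it.
    print("received:", event)
    return {"ok": True}
&lt;/code&gt;&lt;/pre&gt;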

&lt;p&gt;Another advantage of AWS Lambda is that it provides automatic scaling. This means that the service adjusts the number of functions serving requests. If there is a sudden increase in traffic, AWS Lambda will scale out the number of functions to handle the load. Conversely, if there is a decrease in traffic, the service will scale in the functions to reduce costs.&lt;/p&gt;

&lt;p&gt;AWS Lambda is also cost-effective. Developers only pay for the compute time that their code actually uses. This means that if an application isn’t in use, there are no costs associated with running it. Additionally, since AWS Lambda scales based on demand, developers can avoid over-provisioning and paying for unused resources.&lt;/p&gt;

&lt;p&gt;That’s all fine and good, but what exactly do you use it for?&lt;/p&gt;

&lt;h2&gt;
  
  
  AWS Lambda Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Use Cases for AWS Lambda #1: Glue
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F431fre5ap0rnw6torcmm.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F431fre5ap0rnw6torcmm.jpeg" alt="Arts and Crafts with Glue" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now I’m not talking about AWS Glue but rather using Lambda to “glue” (or “stitch” if you prefer) other AWS services together.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why would you use Lambda for this?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A few reasons.&lt;br&gt;
Lambda can bridge two services that don’t talk to each other. For instance, when an API Gateway call is made to retrieve a file from an S3 bucket, Lambda can facilitate the interaction between the two.&lt;/p&gt;

&lt;p&gt;In some cases, the services do talk to each other, but filtering the results can be challenging. For instance, API Gateway and DynamoDB. Yes, API Gateway can talk to tables, but it’s not easy to work out how or to combine queries from several tables into a single result.&lt;/p&gt;

&lt;p&gt;Event-driven architectures. Say you wanted to process an image after uploading it to S3: you could send a notification to a queue and have an EC2 instance handle the message processing, or do this with Lambda, because it’s only charging you when it’s running. In the same vein, Lambda can act as a replacement for cron-triggered scripts, again saving money.&lt;/p&gt;
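
&lt;p&gt;As a flavour of that “glue”, here’s a minimal sketch of a Lambda reacting to an S3 upload notification – the actual processing step is a placeholder for whatever your workload needs:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import urllib.parse

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Triggered by an S3 upload notification; fetch the object and process it.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        data = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        print(f"Processing {key} from {bucket}: {len(data)} bytes")  # placeholder work
&lt;/code&gt;&lt;/pre&gt;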

&lt;p&gt;“Gluing” things together accounts for a lot of the Lambda work I’ve seen and done, as Lambdas are quick to build &amp;amp; deploy and cheap to run. In most of the deployments I’ve seen, the CloudWatch bill for monitoring the Lambdas was higher than the bill for the Lambdas themselves.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Cases for AWS Lambda #2: APIs
&lt;/h3&gt;

&lt;p&gt;So, you can’t use Lambdas as APIs by themselves, but put them behind either API Gateway or an ALB, and you can.&lt;br&gt;
Most APIs are “call-and-response”, in that a client calls an endpoint for “something”. This could be data, kicking off background processing, or anything else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why would you use Lambda for this?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once again, it’s about cost and resource utilization.&lt;br&gt;
Lambdas don’t have to be permanently provisioned and can react in very short spaces of time. So they complete the processing and respond to the user for a much lower cost than a server or container can.&lt;/p&gt;
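
&lt;p&gt;With the proxy integration, the handler just needs to return an HTTP-shaped response. A minimal sketch (the query string handling is illustrative):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json

def handler(event, context):
    # API Gateway (proxy integration) and ALB both pass the HTTP request in "event"
    # and expect an HTTP-shaped response back.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
&lt;/code&gt;&lt;/pre&gt;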

&lt;h3&gt;
  
  
  Use Cases for AWS Lambda #3: Websites:
&lt;/h3&gt;

&lt;p&gt;This one is a bit out there but go with me on it.&lt;/p&gt;

&lt;p&gt;Most “modern” websites consist of dynamically constructed pages. The pages are rendered and served in real-time, per request to the user.&lt;br&gt;
Most webpages don’t do complex processing on the same thread that is serving the page to the user, as this improves the user experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why would you use Lambda for this?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For exactly the same reasons as using it for the backend or an API.&lt;br&gt;
You don’t have to have a server, which might only get occasional use, and instead, only pay when people are using the service. You also have a lot less to worry about when it comes to scaling to meet demand, as the lambda service does this for you.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Cases for AWS Lambda #4: Data Processing &amp;amp; ETL
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mne68geccos5cfslzue.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mne68geccos5cfslzue.jpeg" alt="Data Processing" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is similar to the “glue” use case but different enough that I thought it deserved its own section.&lt;br&gt;
ETL (Extract, Transform, Load) is the process of taking data from one data source, changing its format or adding content, and loading it into another data storage platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why would you use Lambda for this?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A couple of reasons, depending on your requirements:&lt;/p&gt;

&lt;p&gt;Lambda can be triggered directly from other AWS services, meaning that when data is added to one of the sources, the processing starts quickly in response.&lt;br&gt;
For instance, you could subscribe your Lambda to the event stream from a DynamoDB table, which allows the Lambda to start working within 1 second of the data being added.&lt;br&gt;
Lambda also scales in response to demand, so if you have a period where a large volume of data is being added to your sources, it will be able to keep up and keep feeding your data warehouse in near-real-time.&lt;br&gt;
Lambda can also be called by AWS Step Functions, which allows for complex processing from multiple data sources, whilst being able to break the logic down into very small component parts. This can make the development easier to break up between team members and improve the testability of the system.&lt;/p&gt;
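
&lt;p&gt;For example, an ETL Lambda hanging off a DynamoDB stream might look roughly like this – Firehose is just one possible “load” target here, and the delivery stream name is a placeholder:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json

import boto3

firehose = boto3.client("firehose")

def handler(event, context):
    # Extract: new items arriving on the DynamoDB stream.
    records = []
    for record in event["Records"]:
        if record["eventName"] == "INSERT":
            image = record["dynamodb"]["NewImage"]
            # Transform: flatten the DynamoDB attribute-value format.
            flat = {key: list(value.values())[0] for key, value in image.items()}
            records.append({"Data": (json.dumps(flat) + "\n").encode("utf-8")})
    # Load: ship the batch onward to the warehouse feed.
    if records:
        firehose.put_record_batch(DeliveryStreamName="warehouse-feed", Records=records)
&lt;/code&gt;&lt;/pre&gt;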

&lt;h3&gt;
  
  
  Use Cases for AWS Lambda #5: Containerized Workloads
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figcv49y41c5qu1k2q8qe.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Figcv49y41c5qu1k2q8qe.jpeg" alt="Image description" width="800" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Again, out of left field on this, but bear with me.&lt;br&gt;
Since container (Docker) images were added as a Lambda packaging format, you can use Lambda to run anything you’d run in Docker.&lt;br&gt;
The caveat is that it must complete within 15 minutes and use less than 10GB of RAM. I know I touched on this at the start of the article, but it’s definitely worth going into more detail on this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why would you use Lambda for this?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We’re back on the same benefits again – cost and complexity. I’ve done the cost thing a few times now, so I’ll skip that and go to the complexity piece instead.&lt;br&gt;
Running Docker-based workloads in a highly-available manner is difficult and requires some level of orchestration (e.g. Docker Swarm, ECS or Kubernetes).&lt;br&gt;
Managing the orchestration tools is a job in and of itself. Yes, AWS can take some of that away in their managed services, but your engineers still need to understand how to manage the tools.&lt;br&gt;
With Lambda, that goes away as it scales to meet demand. Additionally, deployment is as trivial as uploading a Docker image (though you really should be using a CI/CD setup).&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Cases for AWS Lambda #6: ChatBots &amp;amp; Voice Assistants
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97shciomgqa6toigc8uf.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97shciomgqa6toigc8uf.jpeg" alt="ChatBot" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ChatBots are on almost every website these days, so I don’t think I need to explain them. If you have a customer service setup of some sort, you either already have a ChatBot on your website, or are thinking/have thought about it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why would you use Lambda for this?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because Lambda can interface with services like Lex and Polly via their APIs/SDKs, you can use it to get data from your own APIs or other areas of your infrastructure and send it back to the user via the bot.&lt;br&gt;
I can’t promise that your users will actually like the bot, but that’s more to do with the information it’s sending back than the technology.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;The hype is more or less correct, and you can use Lambda to run almost anything for a low cost. The biggest drawback is that you have to reframe your thought process. It’s a bit of a mental jump to think about serving web pages out of the same service that you’re using to shuffle data around in your backend, but it can be done.&lt;/p&gt;

</description>
      <category>lambda</category>
      <category>serverless</category>
      <category>containers</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
