<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jye Cusch</title>
    <description>The latest articles on DEV Community by Jye Cusch (@jyecusch).</description>
    <link>https://dev.to/jyecusch</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F772332%2Fa50f338a-a5f0-4c3d-a30f-5ea0b1edc7fa.jpeg</url>
      <title>DEV Community: Jye Cusch</title>
      <link>https://dev.to/jyecusch</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jyecusch"/>
    <language>en</language>
    <item>
      <title>Contributors to AWS Lambda container cold starts</title>
      <dc:creator>Jye Cusch</dc:creator>
      <pubDate>Wed, 14 Sep 2022 08:25:53 +0000</pubDate>
      <link>https://dev.to/jyecusch/contributors-to-aws-lambda-container-cold-starts-2081</link>
      <guid>https://dev.to/jyecusch/contributors-to-aws-lambda-container-cold-starts-2081</guid>
      <description>&lt;p&gt;&lt;a href="https://nitric.io"&gt;Nitric&lt;/a&gt; allows developers to build serverless functions that run on various compute services from multiple cloud vendors. One of the most commonly used is AWS Lambda. One of AWS Lambda's strengths - dynamic scaling - can also be a challenging drawback due to a phenomenon known as Cold Starts.&lt;/p&gt;

&lt;h2&gt;What are cold starts?&lt;/h2&gt;

&lt;p&gt;When sending the first request to a Lambda Function, that function will be &lt;em&gt;cold&lt;/em&gt;, meaning there are no active instances available to handle the request. So, instead of immediately processing the request, the Lambda service must start an instance before handling the request. This startup time, where the request is idle waiting to be processed, is generally known as a cold start.&lt;/p&gt;

&lt;p&gt;At least one cold start is guaranteed after deploying a new function, since instances only start in response to requests. However, there are less predictable times when cold starts occur. For example, if no requests arrive for more than 5 minutes, there is a high likelihood that the previous instances have terminated, so a new instance will be started for the next request. Similarly, during periods of increased load, when the number of incoming requests exceeds the capacity of the current instances, new instances will be started to handle the volume, leading to additional cold starts.&lt;/p&gt;
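
&lt;p&gt;A simple way to observe this behavior is a package-level flag that is only true on an instance's first invocation, since package state survives between warm invocations but not across instances. This is a framework-agnostic sketch; the &lt;code&gt;coldStart&lt;/code&gt; name and handler shape are ours, not a Lambda API:&lt;/p&gt;

```go
package main

import "fmt"

// coldStart is true only until this process (i.e. this Lambda
// instance) handles its first request. Package-level state persists
// across warm invocations of the same instance, but not across instances.
var coldStart = true

// handle marks the instance warm and reports whether this
// invocation paid the startup cost.
func handle(name string) string {
	wasCold := coldStart
	coldStart = false
	return fmt.Sprintf("hello %s (cold=%v)", name, wasCold)
}

func main() {
	fmt.Println(handle("first"))  // cold=true: this invocation waited for startup
	fmt.Println(handle("second")) // cold=false: the instance was already warm
}
```

Logging a flag like this from a real handler makes it easy to count how often production traffic actually lands on a cold instance.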

&lt;h2&gt;So, how bad are cold starts?&lt;/h2&gt;

&lt;p&gt;Well, it isn't easy to say. Several factors impact the cold start performance of functions on Lambda. We've experienced cold starts from &lt;em&gt;40 milliseconds&lt;/em&gt; to more than &lt;em&gt;25 seconds&lt;/em&gt; for various functions and containers. This variability means cold starts can be a non-issue, invisible to your end-users, or a contributor to timeouts and other degraded user experience issues for the first user to make a request to a cold function.&lt;/p&gt;

&lt;p&gt;Understanding the causes of slow cold starts, and some of the available solutions, can alleviate these issues.&lt;/p&gt;

&lt;h2&gt;What impacts cold start times?&lt;/h2&gt;

&lt;p&gt;Our testing and research identified four primary contributors to cold start times:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Operations performed during initialization&lt;/li&gt;
&lt;li&gt;Instance size (memory and CPU)&lt;/li&gt;
&lt;li&gt;The size and quantity of files read during initialization&lt;/li&gt;
&lt;li&gt;Internal cache behavior within the AWS Lambda service&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;One quick note about &lt;em&gt;containers&lt;/em&gt; on AWS Lambda&lt;/h4&gt;

&lt;p&gt;To maximize cross-cloud compatibility, the &lt;a href="https://github.com/nitrictech/nitric"&gt;Nitric Framework&lt;/a&gt; builds functions as containers, then deploys to services like AWS Lambda. So, we're particularly interested in the cold start performance of &lt;em&gt;containers&lt;/em&gt; running on Lambda.&lt;/p&gt;

&lt;p&gt;Using containers on Lambda is quite different from &lt;code&gt;zip&lt;/code&gt; file deployments. Containers provide more control but also more pitfalls. Consequently, the details below may or may not be relevant if you're building regular Lambda functions; however, most are applicable in &lt;em&gt;both&lt;/em&gt; scenarios.&lt;/p&gt;

&lt;h3&gt;Operations during init&lt;/h3&gt;

&lt;p&gt;This factor is probably the easiest to reason about, test, and optimize. Any operations your functions perform before handling incoming requests will impact their cold start times.&lt;/p&gt;

&lt;p&gt;Containers deployed to Lambda should minimize any code or processes that execute &lt;em&gt;before&lt;/em&gt; calling out to the Lambda runtime and starting to handle requests.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Functions deployed as &lt;code&gt;zip&lt;/code&gt; files have far less control over the initialization process, meaning initialization will primarily be influenced by the &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html"&gt;runtime&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here is a &lt;em&gt;Hello World&lt;/em&gt; example in Go to highlight where slowdowns could occur:&lt;/p&gt;

&lt;h5&gt;&lt;code&gt;main.go&lt;/code&gt;&lt;/h5&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/aws/aws-lambda-go/events"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/aws/aws-lambda-go/lambda"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;APIGatewayProxyResponse&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;StatusCode&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;      &lt;span class="m"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;            &lt;span class="s"&gt;"Hello World"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;IsBase64Encoded&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c"&gt;// Any code that runs here, before lambda.Start(), impacts cold starts&lt;/span&gt;
    &lt;span class="n"&gt;lambda&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Instance size and fractional vCPUs&lt;/h3&gt;

&lt;p&gt;Something unusual about serverless computing services, like AWS Lambda, is the idea of fractional vCPUs, meaning the allocation of &lt;em&gt;less&lt;/em&gt; than 1 vCPU to a function. Lambda function vCPUs scale proportionally with their memory allocation (higher memory increases the vCPU allocation and core count). AWS Lambda currently allocates the &lt;em&gt;equivalent&lt;/em&gt; of 1 vCPU when configuring a function with 1,769 MB of memory.&lt;/p&gt;

&lt;p&gt;Luc van Donkersgoed produced an excellent article about &lt;a href="https://www.sentiatechblog.com/aws-re-invent-2020-day-3-optimizing-lambda-cost-with-multi-threading?utm_source=reddit&amp;amp;utm_medium=social&amp;amp;utm_campaign=day3_lambda"&gt;Optimizing Lambda Cost with Multi-Threading&lt;/a&gt;, where testing showed that Lambda functions always have access to 2 or more vCPU cores. So, in the case of a 1,769 MB memory allocation providing the equivalent of 1 vCPU, the 1 vCPU limit is imposed across 2 vCPU cores with a form of CPU throttling.&lt;/p&gt;

&lt;p&gt;This table shows the vCPU and CPU Ceiling results found in the article above:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;th&gt;vCPUs&lt;/th&gt;
&lt;th&gt;CPU Ceiling&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;832 MB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1769 MB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3008 MB&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;1.67&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3009 MB&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;1.70&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5307 MB&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2.39&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5308 MB&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2.67&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7076 MB&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;2.84&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7077 MB&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;3.86&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8845 MB&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;4.23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8846 MB&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;4.48&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10240 MB&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;4.72&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The result is that instance size can impact both the cold start and processing time of a function and that the number of processes or threads in your application plays a part in the outcome.&lt;/p&gt;

&lt;h3&gt;Image size, package size, and I/O&lt;/h3&gt;

&lt;p&gt;In our testing, we've found that while the overall size of a function or container &lt;em&gt;can&lt;/em&gt; impact cold starts, what matters more is how much of that data is read during the initialization step and the total number of files accessed. For example, reading a single large file or importing from a bundle yields better cold start times than reading many files or dependencies individually.&lt;/p&gt;

&lt;p&gt;The amount of data read matters as well. For example, importing a specific file from a library tends to be far quicker than importing the entire library.&lt;/p&gt;

&lt;p&gt;This appears to be caused by lazy loading of image layer data, particularly during container initialization, and latency introduced by the read operations. In general, you want to access as few files and as little data as possible during the initialization of your functions. For example, we've seen improvement when using &lt;a href="https://github.com/vercel/ncc"&gt;ncc&lt;/a&gt; to bundle Node.js applications.&lt;/p&gt;

&lt;h3&gt;Cold, warm and hot&lt;/h3&gt;

&lt;p&gt;Next, we need to talk about one of the most significant impacts on cold start performance in AWS Lambda: image layer caching. The first invocation of a newly deployed Lambda will be &lt;em&gt;significantly&lt;/em&gt; slower than subsequent cold starts, sometimes tens of seconds slower. The reason for this is that Lambda caches container images zonally as well as on individual workers.&lt;/p&gt;

&lt;p&gt;During the first cold start, all of these caches will be cold, but subsequent cold starts typically hit one of the faster caches, resulting in much better cold start performance. Here is an example of some results from a simple Node.js application running in a container on Lambda:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;First request in a region (cold caches, no running containers)&lt;/td&gt;
&lt;td&gt;9.56s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subsequent cold start (warm caches, no running containers)&lt;/td&gt;
&lt;td&gt;2.67s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All subsequent requests (running containers)&lt;/td&gt;
&lt;td&gt;&amp;lt; 0.10s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Luckily, there are easy options to mitigate the impacts of cold caches on cold starts. AWS recommends &lt;a href="https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/"&gt;provisioned concurrency&lt;/a&gt;, which keeps instances of your functions running, ready to respond to requests. This option has the added benefit of eliminating cold starts until you hit the provisioned concurrency threshold.&lt;/p&gt;

&lt;p&gt;Another quick option is a schedule that makes periodic requests to your functions to keep the function and caches warm. Using the Nitric Framework, this only takes a few lines of code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;schedule&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@nitric/sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Execute the function every 5 minutes to keep it warm.&lt;/span&gt;
&lt;span class="nx"&gt;schedule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;keep-warm&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;every&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;5 minutes&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Your existing function code should live here...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Do cold starts matter?&lt;/h2&gt;

&lt;p&gt;Now that we've looked at the causes of increased cold start times and a few options to mitigate them, one final question remains: how much effort should you spend reducing the impact of cold starts on your application?&lt;/p&gt;

&lt;p&gt;It's easy to think that cold starts are a big problem during development. It's disheartening to make your first request to an API and need to wait a few seconds for a response. It's also easy to think it'll be the same for your users. So, to improve the user experience, you go deep down the rabbit hole of cold start optimization.&lt;/p&gt;

&lt;p&gt;Luckily, the reality is quite different. Applications with sustained load rarely cold start, and when they do, it's trivial to deal with the latency in a user-facing application through good UX design. As Allen Helton points out in his blog post &lt;a href="https://www.readysetcloud.io/blog/allen.helton/lets-stop-talking-about-serverless-cold-starts/"&gt;Let's Stop Talking About Serverless Cold Starts&lt;/a&gt;, most teams don't see cold starts as an issue in production.&lt;/p&gt;

&lt;p&gt;We'd love to hear your experiences with cold starts. Are they a problem for your users? What techniques have you used to mitigate the impact? Come chat with us on &lt;a href="https://discord.gg/Webemece5C"&gt;Discord&lt;/a&gt; or &lt;a href="https://twitter.com/nitric_io"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>docker</category>
      <category>serverless</category>
    </item>
  </channel>
</rss>
