Gareth McCumskey

Posted on Aug 20, 2019

The difficulty with monitoring AWS Lambda functions (and how to solve it)

#serverless #aws #microservices #monitoring

If you have spent any time building out a microservices application, you have probably quickly run across the problem of monitoring your services, whether they are configured on container-based infrastructure or Serverless. Having all these individually scoped moving parts makes it that much harder to collate and then analyse log files.

Solutions to this problem are pretty broad. One of the more general patterns making its way into the purely microservices realm is the service mesh, usually running as a sidecar alongside each service. Among the other features it provides, a service mesh gives every service a consistent method for publishing log data. These logs can then be gathered and collated in a single place, and useful metrics extracted from them.

However, in the Serverless world a service mesh falls short, since we rely on a large collection of managed services that give us no way to attach an additional tool for this logging.

So what do we do now? Just give up and assume we will be forced to analyse our CloudWatch logs manually every time an issue arises?

Well, thankfully, no. Recently I started using a tool provided by the Serverless Framework team to include monitoring of my Lambda functions and more. This is so compelling to me not just because I happen to be a part of the team (although it helps), but also because the implementation is so frictionless. Because the tool comes from the developers of the framework, the monitoring capability can be incorporated at a very basic level into an existing Serverless service. There is no need to include any additional library in your functions to instrument them. No need to manually add additional IAM permissions (unless you choose to do so to make use of the other features of the software). It kinda just works with minimal setup.

So if you are interested to find out more about the Serverless Framework Dashboard and what it offers besides monitoring, Austin Collins, CEO and founder of Serverless Inc, has put together a great 3 minute video to bundle it all together at https://www.youtube.com/watch?v=-Nf0ui3qP2E, but we are focussing primarily on the monitoring side of things.

How do we get set up for monitoring?

Well, the first step is we need a Serverless Framework Dashboard account. Go to https://dashboard.serverless.com to get that set up. What you will get once done is an org and an app as you can see in this image:

Now open up your Serverless service's serverless.yml in your favourite text editor and add the app and org properties to it. I usually do this above the service property:

app: enterprise-demo
org: garethmccumskey
service: demo-email-form
provider:
...

With that, we are almost there. We need our local machine to be able to authenticate to our Serverless Dashboard account when we deploy. To do that, just run sls login. It will open a window in your default browser to authenticate. Once you see the message

Serverless: You sucessfully logged in to Serverless

on the CLI, we can now deploy.

Run sls deploy just like you would usually do. This is necessary because it is at this stage that the Serverless Framework can now automatically instrument your functions, and subscribe to the CloudWatch logs in your account for the functions in your service.
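
As a rough sketch, the login and deploy steps described above could be wrapped in a small shell helper. The function name deploy_with_dashboard and the default stage and region values are assumptions made for illustration; only sls login and sls deploy come from the steps above.

```shell
# Sketch: authenticate, then deploy so the framework can instrument the functions.
# deploy_with_dashboard is a hypothetical helper name; the stage and region
# defaults are placeholder assumptions, not values from this article.
deploy_with_dashboard() {
  local stage="${1:-dev}"
  local region="${2:-us-east-1}"

  # Opens the default browser to authenticate against the Serverless Dashboard.
  sls login || return 1

  # A normal deploy; instrumentation and log subscriptions are set up at this step.
  sls deploy --stage "$stage" --region "$region"
}
```

Calling deploy_with_dashboard prod eu-west-1 would log in and then deploy the prod stage to eu-west-1. Logging in on every deploy is not required; it is kept here only so the sketch mirrors the order of the steps.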

Now, just a few caveats to point out:

If you have tried out any other monitoring tool that also subscribes to your CloudWatch logs, you may get an error about a CloudWatch subscription limit being reached. The solution is to either remove that subscription or just send AWS a nicely worded message via the support tool in the console and ask them if they would be so kind as to increase your CloudWatch subscription limits. We've heard they are pretty accommodating with this request.
If you usually deploy via a headless CI/CD system and therefore can't use sls login, then you can grab yourself some access keys instead and set things up as per the docs. You're welcome :)
Ummm, yup I think that's it. Onward!
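
The first two caveats can be sketched as shell helpers. This is only an illustration under a few assumptions: the AWS CLI is installed and configured for the right account, the log group follows the /aws/lambda/<function name> convention that Lambda uses, and the access key is supplied through the SERVERLESS_ACCESS_KEY environment variable (check the Serverless docs for the exact setup). Both function names are hypothetical.

```shell
# List the subscription filters already attached to a function's log group,
# to spot another monitoring tool that is using up the subscription limit.
# Assumes the AWS CLI is installed and configured.
list_log_subscriptions() {
  local function_name="$1"
  aws logs describe-subscription-filters \
    --log-group-name "/aws/lambda/${function_name}" \
    --query 'subscriptionFilters[].filterName' \
    --output text
}

# Deploy from a headless CI job without `sls login`.
# Assumes the access key from the dashboard is stored as a CI secret and
# exposed to the job as SERVERLESS_ACCESS_KEY.
ci_deploy() {
  local stage="${1:-dev}"
  if [ -z "${SERVERLESS_ACCESS_KEY:-}" ]; then
    echo "SERVERLESS_ACCESS_KEY is not set; refusing to deploy" >&2
    return 1
  fi
  sls deploy --stage "$stage"
}
```

In a CI job this might be invoked as ci_deploy prod, with the key injected by the CI system's secret store rather than written into the repository.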

Open up your service's monitoring in the dashboard by clicking its name and then the stack instance defined by the stage and region it was deployed to. You should see something like this:

If you have any traffic going through that service, you should see the graphs responding live to invocations and errors as they happen in real time!

Go ahead! Click around! Take a look at all that this new vista has to offer. But take special note of that alerts section you see on the screen.

Once you've calmed down a little from all the excitement, there's one more surprise in store: notifications. Who wants to sit and stare at graphs all day? You've got stuff to do! So instead, head back to that original view where you could see all your services and select the notifications tab. You should see something like this:

Well, what are you waiting for? Click that link. It's asking you to! What you should find is the ability to send yourself (or your team) a notification about any of those alerts I mentioned, via email, Slack, SNS or even a webhook if you so choose.

Now you have no excuse when someone asks you if the current average duration of your Lambda functions is above normal. If you didn't get the alert then things are fine. What about errors? Then turn on the new error type identified alert notification. Want the whole team to get messages from production but only the devs to get them from the dev stage? You can do that too. Just create one notification limited to the prod stage and the other limited to the dev stage.

And there we go. With that small amount of effort we instrumented an entire Serverless service and were able to get operational metrics about our current invocation rates, durations, errors, memory usage and more. And I forgot to mention: you get all of this for free for up to 1 000 000 invocations per month as part of the free tier, so you can kick the tyres extensively.

Personally, I use the Serverless Framework Dashboard across all my own personal projects. It's gotten to the point where I cannot build Serverless projects without having this turned on by default because it makes it so much easier to get the alerts and data I need about my service while I am developing it.

And before I leave you, there is one last thing to mention. A feature that will be released really soon that excites me incredibly. I'll just drop it here as a screenshot :)

Top comments (3)

Ivo Pereira • Oct 5 '19

Great starting point here Gareth! I have been looking for such kind of a solution, however just found out it is not available for Golang runtime. Is there any preview for supporting it as well?

Gareth McCumskey • Oct 11 '19

Hey there Ivo. We have just added Python support and I know support for more runtimes is on the backlog; I just honestly have no idea of the exact time frame in which it will come out. However, you should still be able to get some use out of the dashboard. Some of the advanced features, such as error stack traces, won't be available for Golang, but other features will, including deployment profiles as well as seeing invocation counts, etc.

Gareth McCumskey • Oct 18 '19

We will be adding all the runtimes eventually. Python support just came out a week ago and we will continue adding support for additional runtimes. I cannot provide any ETA though, I'm afraid.
