If your Lambda function has to access a database (or any other service that requires credentials), where and how do you store that configuration?
...
This is a fantastic blog post! Learned a lot and really appreciate you explaining this for all of us! Great work @dvddpl
glad to hear that. thanks
I prefer doing .gitignore .env + .env.example for ease of use and possibility to pass it to lambda even without a file.
SSM is great and all, but it's a whole lot of setup for a key-value store, and when you need to use it in any serious manner, you need devops to add it to the stack as well.
And if this post's target group is people who hardcode credentials in the codebase, i can already tell you, they won't go with the SSM hassle, for sure. ;))
true! :-)
my main problem with the .env file is that even though you don't commit the file to the repo (and if you are fine with having the credentials in plaintext in the AWS Console) somehow you have to keep those credentials somewhere. where do you keep them? how do you share them with your coworkers?
i don't think SSM / Secrets Manager require a lot of setup. probably we are still using a naive approach, but we have a couple of scripts in the package.json to generate credentials and create the secrets in SSM. then we rely on serverless to handle permissions and other stuff.
When I'm developing a function i keep it locally (or i keep it in lambda configuration, you can pass env.* - docs.aws.amazon.com/lambda/latest/... ).
When it gets plugged into the stack (the serious approach), it's prepared somehow on the fly by scripts provided by our devops inside a docker container before deploying the function.
Maybe your code can pass a security review that way; mine can't. Corporate dev rules laugh at this approach.
which of the above exactly?
I use a similar setup for my containers running in Fargate. I added a piece to my docker run script that grabs the SSM parameters and saves them as env vars when the container starts up. Thanks for pointing out a nice way to handle this using Lambdas
But then they are exposed to all apps with the Fargate execution space. Security now has to move away from the OS container and towards the app itself.
What do you mean by "exposed to all apps with the Fargate execution space?" Each application has its own image, with its own run script (bash). The run script makes the request to AWS SSM and sets the environment variables before it starts the application. The secrets are only available in the container OS. The app can only read them.
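The startup step described above (read the SSM parameters, expose them as environment variables, then start the app) could be sketched roughly like this. The thread describes a bash run script; this is a Python illustration only. The function name `params_to_env`, the `/myapp/prod` path and the sample values are all hypothetical. The input mimics the "Parameters" list (entries with "Name" and "Value") that the SSM GetParametersByPath call returns, so no AWS call is made here.

```python
from typing import Dict, Iterable, Mapping


def params_to_env(parameters: Iterable[Mapping[str, str]], prefix: str) -> Dict[str, str]:
    """Map SSM-style parameters under `prefix` to environment variable names."""
    env: Dict[str, str] = {}
    base = prefix.rstrip("/") + "/"
    for param in parameters:
        name = param["Name"]
        if not name.startswith(base):
            continue  # ignore anything outside the requested path
        # e.g. "/myapp/prod/db/password" becomes "DB_PASSWORD"
        key = name[len(base):].replace("/", "_").replace("-", "_").upper()
        env[key] = param["Value"]
    return env


# Hypothetical sample, shaped like the "Parameters" list of a GetParametersByPath response.
sample = [
    {"Name": "/myapp/prod/db/password", "Value": "s3cret"},
    {"Name": "/myapp/prod/api-key", "Value": "abc123"},
    {"Name": "/other/prod/token", "Value": "ignored"},
]
env = params_to_env(sample, "/myapp/prod")
```

In a real startup script the resulting dict would be merged into the process environment (for example with `os.environ.update(env)`) just before the application is launched, so the values live only in that container's process environment.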
AWS gives you few options to sort this out. I think your best shot is SSM. If you can't use SSM for some reason you can use S3 as well, and apply a policy similar to the one you use to access SSM.
Great walkthrough, I liked it and thanks for preaching the no-env-vars for secrets!
Exactly what I've done with the serverless framework I've built (aegis). I used secrets manager, though I'm interested in the parameter store too (didn't know about it or maybe it didn't exist before?).
Curious what you do for caching.
I wish they provided something directly within Lambda itself.
Thanks for sharing!
the caching is nothing fancy. just a simple map where i store the retrieved key and an expiration time (like 5 minutes), and every time the lambda is invoked i check if the key i have is expired - if so, i refresh it by reloading it from SSM. Of course it works only within the same container - but it could save a lot of time and money anyway.
our case right now is simple, but the caching could definitely be implemented better, with multiple keys with different expiration times - and probably i would need to think about the case when you update the secret and you still have containers running - trying to use the old key from the cache...
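A minimal sketch of that kind of cache, not the author's actual code: a map from key to (value, expiry), a 5 minute default, and an optional per-key expiration. The class name `TtlCache`, the injected `fetch` function (which in practice would call SSM) and the fake clock used in the demo are assumptions made so the example runs without AWS.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory cache whose entries expire after a per-key time-to-live."""

    def __init__(self, fetch: Callable[[str], str], default_ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # e.g. a function that reads from SSM
        self._default_ttl = default_ttl
        self._clock = clock            # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str, ttl: Optional[float] = None) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]            # still fresh: no remote call
        value = self._fetch(key)
        lifetime = self._default_ttl if ttl is None else ttl
        self._entries[key] = (value, now + lifetime)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key, e.g. after the database rejects a stale credential."""
        self._entries.pop(key, None)


# Demo with a fake fetcher and a fake clock (both hypothetical).
calls = []
fake_now = [0.0]

def fake_fetch(name: str) -> str:
    calls.append(name)
    return f"value-{len(calls)}"

cache = TtlCache(fake_fetch, default_ttl=300.0, clock=lambda: fake_now[0])
first = cache.get("/myapp/db/password")    # fetched
second = cache.get("/myapp/db/password")   # served from the cache
fake_now[0] = 301.0
third = cache.get("/myapp/db/password")    # expired, fetched again
```

Creating the cache at module level, outside the handler, is what lets it survive across warm invocations of the same container. The `invalidate` method is one possible answer to the stale-secret case: if a login fails with the cached value, drop it and fetch again.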
Hey Davide, thank you for the article. I started my application and planned to use SSM as my secret store, and I had the same fetch-and-cache solution in mind, but the cache I had thought of does not fit the getParametersByPath API because of its async nature. Can you please share a gist with caching implemented on Lambda? That would help me a lot: without knowing how to cache around this async API I got stuck, and my application implementation is blocked.
Vault is a great option if you've already got the infrastructure.
Vault is my preferred solution for anything key related. Extendable to everything still relying on keys, not just lambdas. The Vault plug-in for Jenkins is a life saver.
since many comments mentioned vault I googled for comparisons and found this interesting article: epsagon.com/blog/aws-lambda-and-se... which also touches the aws limits on ParameterStore.
IMO, vault should only be used in enterprises. Preferably a dedicated team just to handle vault
I've tried to use ssm parameter store like this but ran into an unpublished and unchangeable usage limit on it. If your lambda will see a lot of traffic, be wary.
currently our lambda has no such traffic issue and i doubt it will scale too much in the future. but i could not find such a limitation in the docs. what was it about? Were you using some way of caching the retrieved parameter across lambda invocations? that, i would say, should decrease the requests to ssm.
This is what support told me late last year:
"We currently do have limits on Parameter Store API but due to the dynamic nature of limits, the values have not been made public yet and so I would not be able to provide you with an exact number at this moment. I agree it is frustrating not knowing what the imposed limits are for the service. The service team is aware of the situation and they are currently working with our documentation team to publish the limits. Unfortunately this work is still in progress and we are unable to provide an ETA when this will be completed."
Oh.. Wow. Not good. But how was that limit reached? How many invocations? All on cold starts? No caching? What was your workaround / alternative solution?
Check this aws.amazon.com/about-aws/whats-new...
The SSM API in general has a low throttle limit. When we first implemented SSM, I hit the throttle limit while deploying a CloudFormation template that invoked 4 nested templates in parallel, each attempting to deploy 15 parameters. Eventually got it working by explicitly setting dependencies in the template so CloudFormation was forced to deploy the parameters in serial.
I can also vouch for the need to cache the retrieved values in the Lambda container to avoid hitting throttle limits at retrieval time.
we recently hit the cloudformation limit too and had to start using nested templates.. it was a "nice" surprise when we could not deploy because of the 200 resources limit error. i will probably write something about that too :-)
What is the 'some sort of caching' you implemented? Lambda functions are stateless, so in-memory caching at runtime is not very useful, because Lambda is constantly creating and destroying containers for concurrency. Is there a way of caching in stateless functions?
Do you share credentials between applications that are in the same environments? Or, do they each have their own per stage? Same goes for developers, do they each have their own set in SecretsManager?
in our case the credentials were not specific to users but rather for the lambda itself to operate against a DB instance. I would personally handle the developer credentials differently.
Unfortunately the project grew over time and we did not start with a monorepo, so yes, we ended up with the credentials for each env shared by 3 different applications. that's why it was handy to use Secrets Manager. 3 secrets for 3 stages and no need to worry about how many apps will then use them. :-)
Auth via IAM... aws.amazon.com/premiumsupport/know...
Interesting. Will definitely read more about it. TX!
Have you been successful using this? One of my coworkers tried setting it up for MySQL RDS and had a lot of trouble.
Worked fine for me. Also with Aurora (not serverless), though.
The only thing that I think might be an issue for some people is the new connections/second throttling. Once you go above 250/second then things start to get unstable.
no. no problems so far. the process was quite straightforward (at least with aurora serverless).
Am I missing something here? The AWS best practice is to put your db in a private subnet of your VPC so it can't be reached even if the credentials are stolen, right? So when your article says 'handle configuration safely', what are you trying to imply? This is not to say that being sloppy is fine, but I didn't quite understand the use case. Are you worried that people sometimes use the same credentials somewhere else too, so a leak becomes a big compromise? In our company we pretty much use the environment variables section in the Lambda console. We never had an issue, and if somebody somehow gets the credentials I still have the VPC coming to the rescue.
security is typically "an onion", so it's a matter of how much hardening you want. if you store the credentials as lambda environment variables, then anyone who can view the lambda details can gain access to those credentials - is that fine for the environment that you work in? if you have a few people with a high amount of trust, then that might be fine.
there's also the notion of bad/rogue actors to consider - just because you're in a private vpc doesn't really mean that no one can connect to it. if someone gains access to an ec2 instance that can reach the database server - how much more work do they have to do to get at your data? if the credentials are in environment variables, probably not a lot.
what if you start VPC peering - do you fully trust all of the other VPCs you're peered with to not be compromised?
what if one of your devs had their laptop stolen and their access keys are on there? could an attacker launch an ec2 instance, connect to it, then access your DB?
some people go to the lengths of putting fake credentials in environment variables; then they up the logging on the RDS instances to record login failures. stream that to lambda, and if you find a failed login attempt, you can then search your instances/lambda for where those credentials (login name) are in the environment variables and hibernate that compromised resource. sort of like a honeypot. to me, that sounds like a lot of wasted time for most scenarios, but for highly sensitive or critical systems, it might be worth it.
thank you very much for your detailed comment. of course as almost everything in programming it depends on the specific use case and requirements.
I like the example of security as an onion; it reminds me of the swiss cheese paradigm (every layer of security might - and will - have holes. we must be sure these holes don't align).
The last part of your comment is also somewhat similar to what was described in the AWS Security Workshop i attended at the Serverless Days i blogged about here.
our use case was simply that we have different REST APIs and ETLs accessing the same db, or different dbs for different stages, accessed by lambdas from different environments. therefore we found it quite messy dealing with lots of env.whatever-stage files, duplicated in multiple repos. Secrets Manager solved our issues.
the lambda being hacked and credentials being stolen might be paranoid, dunno. i read it / saw it in the video and struck me, therefore i mentioned it as well. :-)
I'd go for KMS. :)
at first i thought that too. but then i found Secrets Manager (with the automatic rotation) very handy. Docs state that Secrets Manager integrates with AWS Key Management Service (AWS KMS), but honestly i didn't really see where the difference in using kms directly would lie.
The difference on the surface is in pricing:
KMS: $1/key/month, $0.03/10,000 requests
Secrets Manager: $0.40/secret/month, $0.05/10,000 requests
But the practical difference is Secrets Manager integration into services like RDS, Redshift, and DocumentDB, where rotating the secret will automatically update the corresponding passwords in the database.
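To make the pricing difference concrete, here is a small worked calculation using the prices quoted above (they may have changed since). The workload is purely hypothetical: one KMS key protecting three secrets versus three Secrets Manager secrets, each setup serving 1,000,000 requests per month.

```python
def monthly_cost(items: int, requests: int, per_item: float, per_10k_requests: float) -> float:
    """Monthly cost in USD: a fixed fee per key/secret plus a fee per 10,000 requests."""
    return round(items * per_item + requests / 10_000 * per_10k_requests, 2)


# Prices as quoted in the comment above; item and request counts are made up.
kms = monthly_cost(items=1, requests=1_000_000, per_item=1.00, per_10k_requests=0.03)
secrets_manager = monthly_cost(items=3, requests=1_000_000, per_item=0.40, per_10k_requests=0.05)

print(f"KMS: ${kms:.2f}/month, Secrets Manager: ${secrets_manager:.2f}/month")
```

Under these assumptions the request fee dominates in both cases, which is one more reason to cache retrieved values inside the function rather than fetching them on every invocation.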
yep. slightly more expensive, but i find the integration and rotation very very useful.
I'm not opposed to SSM, but I like using CredStash for secrets. It's easy to use from the cli and in all sorts of code
How to use secret files, not strings? Like .key, .cert and .pem files.
What about the performance part? I am using this for Lambda, and looking at X-Ray it takes around 1.3 sec to get the db credentials, which seems to be a bottleneck. Is it the same in your case as well?