What are the goals here?
Define a set of advice for working with Python in the context of AWS.
In what context is this guide written?
Best practices with Python…
- As a scripting tool — Boto3 calls to AWS APIs.
- As a language for Lambda in “Serverless” consumable — Lambda function code, with 3rd party dependencies.
- As a language for Lambda in infra pipeline — Housekeeping, CFN Custom Resources. Compilers, Runners, Deployment.
Required Reading
Documentation
Python space
[https://www.python.org/dev/peps/pep-0008/]
[https://www.python.org/doc/sunset-python-2/]
AWS Space
[https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html]
[https://docs.aws.amazon.com/lambda/latest/dg/lambda-python.html]
[https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html]
Best Practices for Python Versions
- Python 2 has been sunset: the sunset date, January 1st, 2020, has now passed.
- Be mindful of external dependencies between Python 2 and 3. Conclusion: Use Python 3 unless you can’t.
Best Practices for Tooling
a. SublimeText3
- https://realpython.com/setting-up-sublime-text-3-for-full-stack-python-development/
- yamllint
- cfn-lint
- https://github.com/mgaitan/sublime-rst-completion — See “Magic Tables” — so good!
b. VSCode
Best Practices for CLI (i.e. Bash)
Use the “if main” statement
- Put your script’s logic into a “main” method.
- Then use the “if main” convention to call that method if invoked via the CLI (i.e. python3 myscript.py)
if __name__ == '__main__':
    main(__get_args())
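A minimal sketch of the whole pattern, assuming argparse for the argument parsing. Only the names main and __get_args come from the snippet above; the --name option and the greeting are made up for illustration.

```python
import argparse


def __get_args(argv=None):
    """Parse CLI arguments; argv=None means "use sys.argv[1:]"."""
    parser = argparse.ArgumentParser(description='Example script.')
    # --name is a hypothetical option, purely for illustration.
    parser.add_argument('--name', default='world')
    return parser.parse_args(argv)


def main(args):
    """All script logic lives here, so it can be imported and tested."""
    message = 'Hello, {}!'.format(args.name)
    print(message)
    return message


if __name__ == '__main__':
    main(__get_args())
```

Because nothing runs at import time, another script (or a unit test) can import this file and call main() directly with its own arguments.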
Best Practices for AWS APIs
a. Use Boto3
- Has its own built-in retry logic (handled by botocore)
- Configurable via botocore.config.Config (timeouts, number of retry attempts)
b. Cater for failure scenarios
Configuring Boto3
import boto3
from botocore.config import Config

__sts_client = boto3.client(
    'sts',
    config=Config(connect_timeout=15, read_timeout=15, retries=dict(max_attempts=10))
)
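For "cater for failure scenarios", here is one possible shape, not the only one. It assumes you want to tell "the service returned an error" apart from "the request never completed". The function names and the set of throttling codes are my own choices for this sketch; check the API reference of the service you call for its real error codes.

```python
# Which error codes count as throttling is an assumption of this sketch.
THROTTLING_CODES = {'Throttling', 'ThrottlingException', 'RequestLimitExceeded'}


def classify_error(error_response):
    """Return 'throttled' or 'failed' for a botocore-style error response dict."""
    code = error_response.get('Error', {}).get('Code', 'Unknown')
    return 'throttled' if code in THROTTLING_CODES else 'failed'


def get_caller_arn():
    """Return the caller's ARN, or None if the call failed."""
    # Imported here so classify_error can be used without boto3 installed.
    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    client = boto3.client('sts')
    try:
        return client.get_caller_identity()['Arn']
    except ClientError as exc:
        # The service answered, but with an error (e.g. access denied).
        print('STS call {}: {}'.format(classify_error(exc.response), exc))
    except BotoCoreError as exc:
        # The request never completed (no credentials, timeout, DNS, ...).
        print('Could not reach STS: {}'.format(exc))
    return None
```

Returning None keeps the sketch short; in a real script you may prefer to re-raise or exit with a non-zero code so the failure is not silent.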
Best Practices for AWS Lambda
You can write Python code that works either from the CLI or via AWS invocation
a. Can put a “cli.py” alongside main.py, and invoke that way
That simulates an AWS Step Function (which usually invokes the Lambda)
b. Can put an “if main” statement for running via CLI
Be mindful of limits
a. Account level
- 250 ENIs per Account (soft limit, talk to your AWS TAM)
- 1,000 concurrent executions
b. Use provisioned concurrency to avoid cold-start workarounds
- Needs Hyperplane
Best Practices for Serverless
Packaging code
- Create a self-contained zip for each lambda, AFTER installing pip modules
Code:
  S3Key:
    Fn::Pipeline::FileS3Key:
      Path: consumer.zip
- Be mindful of AWS Lambda limitations on max sizes [https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html]
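The packaging step could be scripted roughly like this. It is a sketch under assumptions: each lambda folder has a requirements.txt, dependencies go into a "lib" subfolder, and the zip is written outside the folder being zipped. The function names are mine, not part of any pipeline tooling.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def install_dependencies(source_dir):
    """pip-install requirements.txt into <source_dir>/lib."""
    subprocess.check_call([
        sys.executable, '-m', 'pip', 'install',
        '-r', str(Path(source_dir) / 'requirements.txt'),
        '--target', str(Path(source_dir) / 'lib'),
    ])


def build_zip(source_dir, zip_path):
    """Zip every file under source_dir, with paths relative to source_dir."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob('*')):
            if path.is_file():
                archive.write(str(path), str(path.relative_to(source_dir)))
    return zip_path
```

Call install_dependencies() first and build_zip() second, so the installed modules end up inside the archive.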
Using common-pip-* modules
- A way to share modules between lambdas in the SAME repo
- Can reference 3rd party modules
- Can reference common-pip modules
Example:
dnspython
git+https://git-codecommit.ap-southeast-1.amazonaws.com/v1/repos/common-pip-log@master#egg=common-pip-log
And the usage in your main.py code:
- Set the first import path to be the "lib" subfolder of the lambda
from os.path import dirname, abspath
from os import environ
import sys

# Put the lambda's bundled "lib" folder first on the import path.
sys.path.insert(0, "{}/lib".format(dirname(abspath(__file__))))

import log  # Must come after the sys.path change above.

log.info('env', dict(environ))  # environ is not serializable by itself, cast to dictionary.
Best Practices for Code and Design
The Zen of Python
“Write with future developers in mind” — they have to clean up your messes.
Note that “future developer” might very well be you, 6 months later after you’ve forgotten everything you did.
Here’s Python’s:
TODO — Go through each one and give reference examples.
C:/Users/ak>python
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>>
Write your code as modules
- Be mindful of what gets executed when your Python file is imported
- This has a direct impact on Sphinx, which imports your code to generate documentation for it
- This allows you to share code, perhaps even creating a “common” module that other programs can import.
- Be careful — that makes you the de-facto maintainer of that module!
- It needs examples
- It needs unit tests
- It needs a CICD process
- It needs a todo list of enhancements (I use README.md to start with)
Reduce the scope of your module interface
- Use “non-public” (aka “private”) naming convention for internal attributes and methods NOT intended for use outside the module
- This should be your default position, then you slowly refactor stuff to public, as needed over time
[https://www.python.org/dev/peps/pep-0008/#method-names-and-instance-variables]
- Use a single leading underscore for “non-public” method names
E.g.
_get_file_contents
Dealing with strings as booleans
- You might be passed a string property that is supposed to be a boolean. The value might be boolean-ish — true/True/TRUE/1/Yes/On/etc.
Use something like this:
def _s_to_bool(value):
    """Implicit default of false."""
    # str() copes with non-string input; "value" avoids shadowing the built-in input().
    return str(value).strip().lower() in ('1', 'true', 'yes', 'on')
When writing comments, focus on the “why” rather than the “what”
Nothing is more frustrating than code that doesn’t explain why something has been done. You need context!
Example, see “ScanIndexForward” below:
response_iterator = dynamodb_paginator.paginate(
    TableName='core-automation-master-api-db-items',
    IndexName='parent-created_at-index',
    ExpressionAttributeNames=expression_attribute_names,
    ExpressionAttributeValues=expression_attribute_values,
    KeyConditionExpression='parent_prn = :v1',
    ProjectionExpression="#p,#n,#s,#c,#u",
    ScanIndexForward=False  # Process newer builds first (descending order) - important for logic!
)
If that comment didn’t make it clear to future developers that there’s a reason for ScanIndexForward=False, a bug may be created in future.
Consider the strategy pattern for running code in different contexts
- E.g. use a strategy in your log module so that logs are plain text when you run locally, but JSON in the cloud, for CloudWatch
- Another example — in AWS Lambda context, you get credentials from AWS Secrets Manager, or Parameter Store. Locally, you rely on environment variables instead.
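A rough sketch of the logging case, using only the standard library. Treating the presence of AWS_LAMBDA_FUNCTION_NAME (set by the Lambda runtime) as "we are in the cloud" is an assumption of this sketch, and the class and function names are hypothetical.

```python
import json
import logging
import os


class JsonFormatter(logging.Formatter):
    """Cloud strategy: one JSON object per log line."""

    def format(self, record):
        return json.dumps({'level': record.levelname, 'message': record.getMessage()})


def choose_formatter(env=None):
    """Pick a formatting strategy based on where the code is running."""
    env = os.environ if env is None else env
    # Assumption: this variable being set means we are running inside Lambda.
    if 'AWS_LAMBDA_FUNCTION_NAME' in env:
        return JsonFormatter()
    return logging.Formatter('%(levelname)s %(message)s')


def get_logger(name='app'):
    """Return a logger wired up with the chosen strategy."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(choose_formatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        # Avoid duplicate output through handlers attached to the root logger.
        logger.propagate = False
    return logger
```

The calling code just does get_logger().info('...') and never needs to know which strategy was picked. The credentials example works the same way: one function chooses between a Secrets Manager lookup and an environment variable lookup.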
Best Practices for Data Modelling and Access
- “Upsert” is a good feature at the low level
- For Dynamodb — for scripts I generally don’t bother with ORM/etc, I just write Boto3 API calls
- For anything bigger than a script, consider a modelling layer, for example Marshmallow + PynamoDB.
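For the plain-Boto3 route, an upsert maps naturally onto DynamoDB's UpdateItem, which creates the item if it does not exist yet. The helper below is a sketch: build_upsert_kwargs is a hypothetical name, and it only builds the arguments so you can inspect them before passing them to the client.

```python
def build_upsert_kwargs(table_name, key, attributes):
    """Build keyword arguments for the DynamoDB client's update_item call.

    key and attributes map names to DynamoDB-typed values, e.g. {'S': 'abc'}.
    """
    if not attributes:
        raise ValueError('at least one attribute is required')
    names = {}
    values = {}
    assignments = []
    for index, (attr, value) in enumerate(sorted(attributes.items())):
        # Placeholders avoid clashes with DynamoDB reserved words.
        names['#a{}'.format(index)] = attr
        values[':v{}'.format(index)] = value
        assignments.append('#a{0} = :v{0}'.format(index))
    return {
        'TableName': table_name,
        'Key': key,
        'UpdateExpression': 'SET ' + ', '.join(assignments),
        'ExpressionAttributeNames': names,
        'ExpressionAttributeValues': values,
    }
```

Usage would be along the lines of boto3.client('dynamodb').update_item(**build_upsert_kwargs(...)), with your own table name and key schema.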
Python libraries to help
- https://marshmallow.readthedocs.io/en/stable/
- https://pynamodb.readthedocs.io/en/latest/
- https://www.sqlalchemy.org/
Best Practices for Testing
- Use pytest for unit-testing
Good feature set:
- Auto-discovery of tests
- Fixtures
- Plugins
- Coverage reporting etc
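To give a feel for it, here is a small sketch of a pytest test file. word_count is a made-up function under test; tmp_path is one of pytest's built-in fixtures and gives each test a fresh temporary directory.

```python
from pathlib import Path


def word_count(path):
    """Function under test: number of whitespace-separated words in a file."""
    return len(Path(path).read_text().split())


# pytest auto-discovers functions named test_* in files named test_*.py.
# Naming a fixture as a parameter is all it takes to have it injected.
def test_word_count(tmp_path):
    sample = tmp_path / 'sample.txt'
    sample.write_text('one two\nthree\n')
    assert word_count(sample) == 3


def test_word_count_empty_file(tmp_path):
    sample = tmp_path / 'empty.txt'
    sample.write_text('')
    assert word_count(sample) == 0
```

Save it as something like test_word_count.py and run pytest in that folder; no test classes or runner boilerplate are needed.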
Use Selenium Bindings for Python in CodeBuild
Best Practices for Dependency Management
Use pipenv to explicitly manage and validate dependencies
- Helps to keep your dependencies consistent via lockfile (i.e. repeat builds of same code on different days)
- Lockfile also has checksum feature to ensure the correct package is downloaded in future (i.e. can detect future compromises)
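Conceptually, the checksum feature boils down to something like the sketch below. This is an illustration of the idea only, not how pipenv or pip implement it; the function name is hypothetical. The lockfile stores hashes in the form "sha256:<hex digest>".

```python
import hashlib


def matches_lockfile_hash(data, expected):
    """Check bytes against a 'sha256:<hex digest>' style entry."""
    algorithm, _, digest = expected.partition(':')
    return hashlib.new(algorithm, data).hexdigest() == digest
```

If a package on the index is replaced later, its bytes no longer produce the recorded digest and the install fails instead of silently pulling in the new file.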
Best Practices for security
- Use the “safe” methods for YAML/JSON/XML parsers. Example:
import sys
import yaml

client = sys.argv[1]
with open('../../{}-config/hosted-zones.yaml'.format(client)) as f:
    client_vars = yaml.safe_load(f)  # safe_load accepts the open file directly
Best Practices for User Documentation
a. Use Sphinx
Use an editor plugin to help with formatting, especially tables
b. Focus on a few key areas:
Goals / context
High level design
Use cases
Working examples for people to pull apart and re-use