A checklist example to check before releasing a web service
Application
Test coverage
- [ ] Is the test coverage 60% or higher?
- In general, 60% is “acceptable”
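
The 60% gate above can be checked automatically in CI rather than by eye. The following is a minimal sketch, assuming your test runner can report covered and total line counts; the function names and the way the counts are obtained are hypothetical, not part of any particular coverage tool.

```python
def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Return line coverage as a percentage between 0 and 100."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 100.0 * covered_lines / total_lines


def meets_threshold(covered_lines: int, total_lines: int,
                    threshold: float = 60.0) -> bool:
    """True when coverage is at or above the threshold (60% by default)."""
    return coverage_percent(covered_lines, total_lines) >= threshold
```

A CI step could call `meets_threshold` and fail the build when it returns `False`.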
Log
- [ ] Is there a log in CloudWatch Logs?
External monitoring
- [ ] Is the availability of the top page monitored by StatusCake?
- Polling at least every 15 minutes
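
To make the polling idea concrete, here is a rough sketch of what an external monitor does on each check. It is not how StatusCake is implemented; the function names are hypothetical, and treating any 2xx or 3xx status as healthy is an assumption.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List

POLL_INTERVAL_SECONDS = 15 * 60  # the checklist's "at least every 15 minutes"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return 0


def is_healthy(status: int) -> bool:
    """Assumption: any 2xx or 3xx status counts as 'up'."""
    return 200 <= status < 400


def poll(url: str, checks: int,
         fetch: Callable[[str], int] = fetch_status,
         sleep: Callable[[float], None] = time.sleep,
         interval: float = POLL_INTERVAL_SECONDS) -> List[bool]:
    """Check the URL `checks` times, waiting `interval` seconds between checks."""
    results = []
    for i in range(checks):
        results.append(is_healthy(fetch(url)))
        if i < checks - 1:
            sleep(interval)
    return results
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without real network calls or waiting.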
Vulnerability
- [ ] Has a vulnerability assessment been run for externally released services?
- Scans by Amazon Inspector, IBM AppScan, etc. have been performed, and any vulnerabilities found have been fixed by the day of release
Load test
- [ ] Has a load test been performed?
Infrastructure
Log (ex. CloudWatch)
- [ ] Have you set how many days to retain logs?
Load Balancer (ex. ALB)
- [ ] Is the response time monitored in New Relic?
- [ ] Are load balancer 5xx errors monitored in New Relic?
App/Batch Server (ex. EC2)
- [ ] Is the New Relic APM agent installed in your project?
- [ ] Is CPU/memory/disk monitored in New Relic?
Container Orchestrator (ex. ECS)
- [ ] Does the memory hard limit set in the ECS task definition stay within the memory available on the ECS instance?
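
The memory check above comes down to simple arithmetic. This is a minimal sketch, assuming you have already read the per-container hard limits (in MiB) from the task definition and know the instance's memory. The function name and the `reserved_mib` parameter are hypothetical; the latter stands for whatever memory the OS and the ECS agent need.

```python
from typing import Iterable


def task_fits_instance(container_hard_limits_mib: Iterable[int],
                       instance_memory_mib: int,
                       reserved_mib: int = 0) -> bool:
    """True when the summed hard limits fit in the instance's usable memory."""
    total_hard_limit = sum(container_hard_limits_mib)
    usable_memory = instance_memory_mib - reserved_mib
    return total_hard_limit <= usable_memory
```

For example, two containers with hard limits of 1024 MiB each do not fit on a 2048 MiB instance once 256 MiB is set aside for the host.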
Database (ex. RDS, ElastiCache)
- [ ] Is RDS CPU/memory/disk monitored in New Relic?
- [ ] Is ElastiCache CPU/memory monitored in New Relic?
- [ ] Are you sure the database is not publicly accessible?
Network (ex. VPC, SG, WAF, S3)
- [ ] Is the VPC's IP range different from the ranges of your other AWS VPCs?
- [ ] Is there a clear separation between the Public and Private segments?
- [ ] If it is an internal service, is the security group's (SG) ingress limited to your organization?
- [ ] Is a WAF configured for services that are released to the public?
- [ ] Are all S3 buckets isolated from the outside world?
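
The VPC IP range item can be checked with Python's standard `ipaddress` module. This sketch tests for overlap, which is stricter than "not the same"; that choice is an assumption, made because overlapping ranges tend to cause trouble if the VPCs are ever connected. The function name and the CIDR values in the usage note are illustrative only.

```python
import ipaddress
from itertools import combinations
from typing import Iterable, List, Tuple


def overlapping_cidrs(cidrs: Iterable[str]) -> List[Tuple[str, str]]:
    """Return every pair of CIDR blocks that share at least one address."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [
        (str(a), str(b))
        for a, b in combinations(networks, 2)
        if a.overlaps(b)
    ]
```

Calling `overlapping_cidrs(["10.0.0.0/16", "10.0.128.0/20", "10.1.0.0/16"])` reports the first two blocks as a conflicting pair, while an empty list means the ranges are all distinct.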
DevOps
- [ ] Are you able to run build/test/deploy with CircleCI?
- [ ] Are you able to manage your cloud resources with Terraform?
- [ ] Is there an on-call system like Opsgenie?
- Be able to receive the following error notifications by phone or other means and be ready to write a playbook/post-mortem after recovery
- External monitoring
- 5xx of load balancers
- Disk space remaining in EC2
- Disk space remaining in RDS
- Amount of memory remaining in ElastiCache
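
As an illustration of the "disk space remaining" alerts listed above, here is a small sketch of the decision an agent on the server might make before paging someone. The 10% default threshold and both function names are hypothetical; in practice the on-call tool or monitoring service would hold this rule.

```python
import shutil


def disk_free_percent(path: str = "/") -> float:
    """Return the free space on the filesystem containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def should_page(free_percent: float, threshold_percent: float = 10.0) -> bool:
    """True when free space has dropped below the alert threshold."""
    return free_percent < threshold_percent
```

A scheduled job could call `should_page(disk_free_percent())` and forward a notification to the on-call system when it returns `True`.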
Top comments (1)
Great list Kent!
Kindly permit me to add security.
Security
- [ ] Is security built into the pipeline?
- [ ] Where do you keep your images? Make sure they are not public.
- [ ] Install git-secrets so you won't commit secrets to the repo.