Madalin Ilie

Posted on Jul 6, 2021 • Edited on Jul 7, 2021 • Originally published at ludovicianul.github.io

An incomplete list of practices to improve security of your (micro)services

Software security is hard and complex. Many people think of it as something separate from the typical development process. It's usually seen as the responsibility of
some security people who only care about security and don't understand that we need to deliver business value fast in an already complicated microservices-event-driven-api-first-ha-cloud ecosystem.
I could add a lot more dashes to microservices-event-driven-api-first-ha-cloud.
And I think this is the main reason it might seem overwhelming to think early about security and all the possible ways something or someone can break your system.
It's yet another complex thing in an already complex environment. It's not just all the technical complexities of modern architectures, it's also all the additional stuff: need to go to market fast, hard deadlines, team chemistry issues, underperformers, too many processes, meetings, etc.
And it's a complex thing that will not break your system on day 1. It might take months or years until someone finds a vulnerability. Why focus on this from day 1?
Well, you might be right. The chances of something happening on day 1 are low.
And it's very tempting to focus on something with immediate value (actual functional features), rather than mitigating some future possibilities.
The thing is, when a security issue happens, it can bring your entire system down. And this will be very bad for you and your users.

I see it as similar to airport security. We do all these checks, we scan people, we forbid them to take things onboard and so on, although 99.99..9% of people don't plan to hijack the plane.
It's for the 0.00..1% of the cases that we have all these measures in place. Because the consequences are big.

So how do you avoid both over-engineering security and being paranoid about everything, while still focusing on the business value?
You make it a mindset, rather than a separate concern. I'm not saying that everyone needs to become a security expert and know everything.
I'm saying that people should develop secure software just like they develop software. They do it in a way that will minimize the probability of introducing vulnerabilities.

The best way to instill this mindset is through a set of standards and practices that will create habits.
Going back to airport security, you don't leave all the decisions to each individual security officer.
"This person looks nice. Let them have the scissors and a knife in their hand luggage". "You, sir, look very dehydrated, you can take your big bottle of liquid with you on the plane!"
You create a set of rules and procedures (i.e., standards) that will apply equally to everyone. And you also create a set of guidelines (i.e., practices) on how to handle specific situations: if you see something suspicious in hand luggage, you inspect it separately.

In the next sections I'll detail standards and practices that cover the entire SDLC.
They are not meant to be self-sufficient for all sections (i.e., you might add a lot more to cover that section from a general good practices perspective). But they will make you question things and think about cases that may not be that obvious.

Where is Security focused

I'll do a simple split of Security concerns into two main areas:

infrastructure security: anything related to how the application is being deployed and operating in production
application security: anything related to how the application is being implemented, with the specifics of the business context

There are plenty of resources on how to tackle both:

PCI Security Requirements: focused on financial software, but can be used as best practice for any other system
Safe Code: general practices on dev, testing, arch, devops
SEI Top 10 Secure Coding Practices: secure coding practices focused on specific programming languages, but also some general ones
OWASP Cheatsheet Series: cheatsheets for most of the software development areas
plus many others

They are lengthy, comprehensive and include a lot of details and practices on how to tackle security in SDLC.
It would be great if every developer went through all of these periodically in order to keep their knowledge fresh. But in practice, this doesn't happen very often.
Below I'll try to summarize the most important things to consider, agnostic of the business domain. It's not a full list, nor a silver bullet.
But it will establish a solid foundation which will minimize the likelihood of security issues.

Tackling Security

Infrastructure Security is more predictable to address, mainly due to the use of products or cloud services.
They already have the security features built in and implemented well, i.e., if you use a Web Application Firewall, you trust the product to do its job; you won't actually implement its logic.
I'm not saying it's easier, but you have more control.

Application Security is less predictable. You mainly rely on people's skills to implement stuff securely. You need to make sure they don't do stupid things like storing clear text passwords in source files.

Below is a list of the most important practices which I think will help you build a security mindset. It's intended for the regular developer. When I say regular, I just mean people actually implementing, rather than all the others focused on designing, planning or managing.
They are all focused on Application Security for building (REST) APIs. At first glance they might not all seem directly related to security. But in the end, they will minimize the probability of introducing security issues.

The majority of examples will use Java.

Standards

As mentioned above, the usage of standards is the main mechanism to build a mindset. All projects should have a set of standards. Not everyone is a fan of standards; some feel they limit people's choices and creativity.
But I think it's an easy way to get consistency, especially when having many teams working on the same platform. It makes onboarding easier for new joiners and limits the possibility of introducing bugs or inconsistencies, or of arguing over stupid things (spaces vs tabs ;)).
It gives you more time for meaningful discussions and debates.
Standards do not have to be very detailed, at least not in all areas. The majority of the standards should state principles and choices you've made based on existing sets of good practices.

Documentation

Key things to consider:

document your code interfaces and API contracts
define your documentation strategy:
- what is your overall documentation strategy?
- what do you put in the README.md file of the project?
- do you need to update a wider documentation?
- what tool do you use for diagrams?
- do you use lightweight architecture decision records?
- do you store the documentation along with the project in Git? or maybe use a separate tool?
- if you store it within the project, what is the recommended folder structure?

General (micro)services design guidelines

Key things to consider:

use a blueprint/template/archetype as a starting point for all your (micro)services
have the blueprint already bundled with all the common libraries, plugins, etc. and aligned to the standards
each (micro)service must start with one command
(micro)services will process data only through APIs/events; there is no back-door
(micro)services are self-contained
all (micro)services are 12 factor apps and even more

Code formatting/styling

Just choose one and apply it consistently. Auto-format before commit if possible.

Naming conventions

Just choose one and apply it consistently.

API standards

Key things to consider:

follow REST naming practices (nouns, plurals, the usual stuff) - pick one, the internet is full of guidelines, but be consistent
be consistent with the naming; this applies for everything, not only the endpoints: payload object naming, properties etc. camelCase, snake_case, kebab-case/hyphen-case etc. Again, just choose something, but be consistent
make POST, PUT, PATCH return bodies with meaningful responses
use meaningful HTTP status codes, rather than 400 for everything that goes wrong
all endpoints must return meaningful error cases
use an error catalogue (more details in the Error Handling section)
consider something like OpenAPI and consider also doing contract-first development i.e., write the OpenAPI contract initially, socialize it with your (internal) consumers; this also enables better parallel development
document your OpenAPI contracts with meaningful descriptions and examples
all (internal) APIs must use CorrelationId/TraceId headers
all API inputs must be very restrictive by default
all APIs (internal or external) must be authenticated and ideally also with authorization in place
all APIs must re-use the same common data structures; either generic ones like Address, Person, Country, etc, but also define business specific ones
all APIs (internal or external) are exposed over HTTPS only
for the relevant APIs consider returning security headers within the response like: Cache-Control: no-store, Content-Security-Policy: frame-ancestors 'none', Content-Type, Strict-Transport-Security, X-Content-Type-Options: nosniff, X-Frame-Options: DENY
internal APIs do not communicate to each others via the internet (unless this is something deliberate or required by the architecture)
do not expose management endpoints over the internet; if this is something required, use authentication
make sure all APIs are enforcing strict validation for the received requests: do not allow undocumented JSON fields, reject malformed JSONs, etc
make proper use of data types; don't have everything as a String
use enumerated values whenever possible
add length restrictions for strings and min/max for numbers
add patterns restricting input for each string
for some properties it's easier to find patterns as they have clear definitions; a country code will always follow [A-Z]+; for others, it's a bit more difficult; a lastName property needs to be quite loose, considering all names in all languages; the recommendation is at least to prevent strange characters like the Unicode control chars, separators or symbols; a recommended pattern is the following: ^[^\p{C}\p{Z}\p{So}]*[^\p{C}\p{So}]+[^\p{C}\p{Z}\p{So}]*$; this doesn't mean that you are now protected from any type of injection; you still need to have a good understanding of where the data goes and how it is processed, but at least you won't get an emoji breaking your system
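
To make the "restrictive by default" idea concrete, here is a minimal sketch in plain Java. The class name, method names and the 100-character limit are my own assumptions for illustration; the loose pattern is the one from the list above, written with the capitalised \p{So} category name. In a real project you would more likely put these constraints in the OpenAPI contract or in bean validation annotations.

```java
import java.util.regex.Pattern;

/** Hypothetical helper showing restrictive string validation. */
final class InputPatterns {
    // Country codes: upper-case letters only, as suggested above.
    static final Pattern COUNTRY_CODE = Pattern.compile("^[A-Z]+$");

    // Loose pattern for free-text names: no control chars or symbols anywhere,
    // separators (spaces) tolerated only by the middle group.
    static final Pattern LOOSE_TEXT = Pattern.compile(
        "^[^\\p{C}\\p{Z}\\p{So}]*[^\\p{C}\\p{So}]+[^\\p{C}\\p{Z}\\p{So}]*$");

    // Assumed upper bound; pick one that fits your domain.
    static final int MAX_NAME_LENGTH = 100;

    static boolean isValidCountryCode(String value) {
        return value != null && COUNTRY_CODE.matcher(value).matches();
    }

    static boolean isValidLastName(String value) {
        return value != null
            && value.length() <= MAX_NAME_LENGTH
            && LOOSE_TEXT.matcher(value).matches();
    }
}
```

A name like "O'Brien" or "Anne Marie" passes, while a value containing an emoji or a control character is rejected before it reaches any business logic.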

Logging standards

Key things to consider:

logging format: comma separated key=value pairs? json objects? choose something which is friendly to your tooling
always include the CorrelationId/TraceId in each log line; this will make it easier for tools to create dashboards
include information in logs that will make it easier to understand what's happening: for which entity? business area? is it success? failure?
some good practices
use an abstraction over the actual logging implementation; for example in Java: slf4j with logback as implementation
treat logging as a cross-cutting-concern; leverage Aspects; log within methods only exceptionally; this will limit people from logging sensitive stuff
don't treat logging as "let's log everything, dump full requests/responses and see if we need it afterwards"; be deliberate in what you log, even when logging with debug or lower levels
more on Logging Data
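
As a rough illustration of a consistent format, here is a small sketch that builds comma separated key=value log lines which always start with the CorrelationId. It uses only the JDK; the class and field names are hypothetical, and in practice you would wire the same idea into slf4j (for example through its MDC) rather than format strings by hand.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical formatter for key=value log lines. */
final class LogLine {
    static String format(String correlationId, String event, Map<String, String> fields) {
        StringJoiner line = new StringJoiner(", ");
        // The correlation id always comes first so tools can index on it.
        line.add("correlationId=" + clean(correlationId));
        line.add("event=" + clean(event));
        fields.forEach((key, value) -> line.add(clean(key) + "=" + clean(value)));
        return line.toString();
    }

    // Replace characters that would break the key=value structure.
    private static String clean(String value) {
        return value == null ? "" : value.replaceAll("[\\r\\n,=]", "_");
    }
}
```

Passing a LinkedHashMap keeps the fields in a predictable order, which makes the lines easier to scan and to parse.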

Data standards

Key things to consider:

use existing ISO standards for widely known objects: Currencies, Dates, Amounts just to name a few
define business specific objects to be re-used
apply these standards for API objects, database entities and events

Processing Data

Key things to consider:

sanitize data before processing it; this is a good sanitization regex ^[^\p{C}\p{Z}\p{So}]*[^\p{C}\p{So}]+[^\p{C}\p{Z}\p{So}]*$; it won't prevent all problems, but it will keep out weird chars that can cause your system to crash
make sure that you don't pass input data straight into internal operations with elevated access, like database queries, command line execution etc.; use parameterized queries for the DB, and be very specific about what you accept and what you pass forward
favor whitelisting instead of blacklisting when you need to make decisions or when you plan to restrict processing for specific input
overall favor defensive programming practices
make sure you use efficient XML parsers that are not vulnerable to XXE or similar attacks; ideally do not accept XML as input unless forced by the context
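
If you do have to accept XML, a hardened parser setup might look like the sketch below. It uses the JDK's built-in DocumentBuilderFactory and disables DOCTYPE declarations entirely, which is the most direct way I know of to rule out XXE. The class name and the decision to wrap failures in an IllegalArgumentException are my own assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

/** Hypothetical wrapper around a hardened DOM parser. */
final class SafeXml {
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Any document declaring a DOCTYPE is rejected, so no external
            // entity can ever be defined or resolved.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Generic message on purpose: parser details stay out of responses.
            throw new IllegalArgumentException("XML input rejected", e);
        }
    }
}
```

The trade-off is that legitimate documents with a DOCTYPE are rejected too; for API payloads that is usually acceptable.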

Logging Data

Key things to consider:

don't log sensitive data; if you still need it for some reason, mask/obfuscate the data; what sensitive means depends on your business and regulations
create/use a library that masks by default the most sensitive data within your platform; for example if you're processing payments, card numbers must be masked by default; you shouldn't leave this decision to each individual
consider extending the library each time new sensitive data is added; you must also balance performance when adding too much data
the logging library must also allow specific configuration so that each individual service can mask additional data without extending the library
the logging library must provide on-demand sanitization (i.e., by calling specific methods); this will make sure the same sanitization techniques are applied for all cases
the logging library must sanitize data before logging it (for example by removing all the characters matching \p{Z}\p{C}\p{So})
the logging library must also remove CR and LF characters in order to prevent CRLF injection
have a clear log archiving strategy
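
Here is a minimal sketch of what the core of such a logging library could do: collapse CR/LF so a value cannot forge extra log lines, and mask anything that looks like a card number by default. The class name and the card-number heuristic (13 to 19 digits, optionally separated by spaces or dashes) are assumptions for illustration; it does not cover the wider \p{Z}\p{C}\p{So} clean-up mentioned above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sanitizer applied to every message before it is logged. */
final class LogSanitizer {
    // Heuristic: 13-19 digits, optionally separated by single spaces or dashes.
    private static final Pattern CARD_NUMBER =
        Pattern.compile("\\b(?:\\d[ -]?){12,18}\\d\\b");
    private static final Pattern LINE_BREAKS = Pattern.compile("[\\r\\n]+");

    static String sanitize(String message) {
        if (message == null) {
            return "";
        }
        // Collapse CR/LF first to prevent CRLF (log forging) injection.
        String singleLine = LINE_BREAKS.matcher(message).replaceAll(" ");
        Matcher matcher = CARD_NUMBER.matcher(singleLine);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            matcher.appendReplacement(out, Matcher.quoteReplacement(maskCard(matcher.group())));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Keeps only the last four digits visible.
    static String maskCard(String card) {
        String digits = card.replaceAll("\\D", "");
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < digits.length() - 4; i++) {
            masked.append('*');
        }
        return masked.append(digits.substring(digits.length() - 4)).toString();
    }
}
```

Because masking happens inside the library, individual developers don't have to remember to do it.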

Storing Data

Key things to consider:

data must not be stored "just in case you need it"; you must only store data that is relevant in the current context or the foreseeable future
storing data introduces compliance obligations; make sure you are aware of those
some data cannot be stored in clear (one example is credit card numbers); use hardware or software HSM for encryption
don't store secrets (passwords, encryption keys, ssh keys, private keys) in version control or in plain text files; use dedicated products or services for this like Vaults, HSMs
use salt and/or pepper when encrypting or hashing sensitive data; this makes precomputed (rainbow table) attacks impractical and brute-force attacks much harder
consider building (or using) a centralized service that will tokenize sensitive data
you should tokenize any data that is under some sort of regulation: card data, PII data, etc.; use tokens instead of the actual data in all (micro)services and detokenize only when needed; this will minimize the compliance footprint and will also give better control around the data
enhance the security of the tokenization solution; do not allow external access to its APIs
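
For the salting point, a minimal sketch using the JDK's PBKDF2 implementation could look like this. The class name, the 16-byte salt and the iteration count are assumptions; check current recommendations before picking real values, and store the salt next to the hash.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

/** Hypothetical helper for salted, slow hashing of secrets. */
final class SaltedHash {
    private static final int ITERATIONS = 100_000; // assumed work factor
    private static final int KEY_BITS = 256;

    // A fresh random salt per stored value.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    static byte[] hash(char[] secret, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(secret, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec)
                .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword(); // wipe the copy held by the key spec
        }
    }

    // Constant-time comparison to avoid leaking how many bytes matched.
    static boolean verify(char[] candidate, byte[] salt, byte[] expectedHash) {
        return MessageDigest.isEqual(hash(candidate, salt), expectedHash);
    }
}
```

Two users with the same password end up with different hashes because their salts differ, which is exactly what breaks precomputed tables.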

Events/messaging standards

Key things to consider:

create an event catalogue so that everyone is aware of the purpose of each event
use event schemas for validation
avoid using generic events where you dump everything; you might leak sensitive information without wanting it
consider exchanging Tokens instead of the actual data for sensitive information

Configuration handling

Key things to consider:

avoid hardcoding configuration in source files
consider using centralized configuration management
segregate configuration by environment
do not store secrets (passwords, api keys, ssh keys, private keys, etc) in source files or in version control; use proper Secrets Vault systems
do not leave default credentials for any deployable unit (either cloud service, off the shelf products, or your own (micro)services)
do not put test-only code or configuration in production
don't build test only backdoors inside your (micro)service
use version control to track configuration changes
have mechanisms in place for configuration integrity checking

Error handling

Key things to consider:

consider treating exceptions and errors as a cross-cutting concern; leverage Aspects, use something like ControllerAdvices or similar
consider embedding the logic for the most common exceptions/errors (validation issues, resource not found, malformed messages) into a shared library; this will make the interaction between (micro)services predictable and with less friction
use an error catalogue
use error codes (e.g. MICRO-4221 - bad request due to structural validation, MICRO-4222 - bad request due to business validation)
do not leak internal state in responses; avoid passing e.getMessage(); each error returned must be deliberately created from the root cause, but without leaking internal data
use a catch-all mechanism in order to avoid leaking internal state for unexpected exceptions; you can just catch Exception in the global error handler and return a 500
return the same object for all errors to enable a consistent experience
document all error cases in your API documentation with the appropriate HTTP Status code; if you use OpenAPI, document all possible HTTP status codes, even if they return the same OpenAPI object
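
Putting a few of these points together, a framework-free sketch of an error catalogue with a catch-all could look like the following. The MICRO-4221 and MICRO-4222 codes come from the list above; the other codes, the HTTP status mapping, the class names and the choice of which exception maps to which entry are all assumptions for illustration. In a Spring application the mapper would typically live in a ControllerAdvice.

```java
import java.util.NoSuchElementException;

/** Error catalogue: one entry per documented error. */
enum ErrorCode {
    STRUCTURAL_VALIDATION("MICRO-4221", 400, "Request failed structural validation"),
    BUSINESS_VALIDATION("MICRO-4222", 400, "Request failed business validation"),
    NOT_FOUND("MICRO-4041", 404, "Resource not found"),   // hypothetical code
    UNEXPECTED("MICRO-5001", 500, "Unexpected error");    // hypothetical code

    final String code;
    final int httpStatus;
    final String publicMessage;

    ErrorCode(String code, int httpStatus, String publicMessage) {
        this.code = code;
        this.httpStatus = httpStatus;
        this.publicMessage = publicMessage;
    }
}

/** The single error object returned for every failure. */
final class ApiError {
    final String code;
    final int httpStatus;
    final String message;
    final String correlationId;

    ApiError(ErrorCode entry, String correlationId) {
        this.code = entry.code;
        this.httpStatus = entry.httpStatus;
        this.message = entry.publicMessage;
        this.correlationId = correlationId;
    }
}

/** Global handler: maps known exceptions, falls back to a generic 500. */
final class ErrorMapper {
    static ApiError toApiError(Throwable error, String correlationId) {
        ErrorCode entry;
        if (error instanceof IllegalArgumentException) {
            entry = ErrorCode.STRUCTURAL_VALIDATION;
        } else if (error instanceof NoSuchElementException) {
            entry = ErrorCode.NOT_FOUND;
        } else {
            entry = ErrorCode.UNEXPECTED; // catch-all for anything unexpected
        }
        // The response text comes from the catalogue, never from e.getMessage().
        return new ApiError(entry, correlationId);
    }
}
```

The exception itself should still be logged server-side with the same CorrelationId, so support can find the root cause without the client ever seeing it.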

Branching strategy and commits

Key things to consider:

use a simple branching strategy; trunk-based, github-flow, etc.; just pick one
use meaningful names for your repos and branches
use descriptive commits; it will make it easier to trace changes in the future
use small commits to better isolate changes
use smart commits i.e., provide a link to the task from the task management system
consider using pre-commit hooks to validate the commits
do not include sensitive information in commit messages
pay attention when enabling remote access to your repos; especially when repos are hosted in cloud

Code review

Key things to consider:

do code reviews (be kind, assertive, specific, all the good stuff)
leave the boring stuff to the tools and focus on the functional aspects and alignment to standards and practices
if you find the same issue repeated over and over, add it to the standards
consider using checklists, at least initially, until people make a habit of focusing on the same things

Tooling and 3rd party libraries

Key things to consider:

have a process in place for introducing new tooling; do a trade-off analysis and present it in a wider group to get acceptance/agreement and make sure you address wider cases
when selecting open source software pay attention to the license(s)
create a list with licenses that can be used without asking, licenses that need to be discussed and licenses which are not allowed to be used
don't take the first (or latest) shiny tool/library/product you find; consider things like: is it stable?, is it maintained? does it have a track record?
consider using tools such as OWASP Dependency Check, License Plugin or even more complex tools such as Black Duck
create a list with the agreed tooling/libraries where people can choose from
update your dependencies frequently

Code Analysis

Key things to consider:

use one or multiple tools to analyze your code
you must have (at least) one tool focused on the general coding practices and (at least) one focused on security practices
some good tools for general code analysis (Java): Sonarqube, PMD, SpotBugs
some good tools for security code analysis: Veracode, Checkmarx, Sonarqube
you don't need to agree with all the practices that are part of the standard rule sets of these tools (although usually they are aligned with industry recommendations); you can create a subset of rules tailored to your context

Testing

Key things to consider:

automate testing at all levels: unit, integration, component, API, end-to-end, etc.
focus on negative and boundary testing, not only on happy scenarios; CATS is a good option for API testing
don't ignore failing tests, even those failing intermittently; they might hide serious underlying issues
tests must be resilient and self-sufficient
tests must use a similar and predictable approach
tests must not depend on complicated external setup; they must either be self-sufficient by mocking dependencies, using in-memory setups or testcontainers or just depend on the (micro)service being deployed; any other steps will just complicate the setup and introduce complexity
consider adding some security testing inside the pipeline
consider mutation testing

CI/CD

Key things to consider:

include Quality Gates for the most important stuff; they must act as checkpoints and fail the build if they are not met
Quality Gates must be in line with these standards and automate the process of checking that each (micro)service is aligned
a sample CI/CD pipeline might look like this:
- compile and build
- check formatting
- run tests and check coverage
- run mutation testing
- run code analysis
- run secure code analysis
- check 3rd party libraries for vulnerabilities
- check 3rd party library licenses
- deploy
- run API tests
- run other types of testing
this might seem too much (or lengthy), but for a microservice this is quite fast
script your pipeline
don't couple the pipeline to the (micro)services
use a template pipeline for all (micro)services

Authentication and Authorisation

Key things to consider:

don't roll your own authentication and authorisation; use standard products and services
authenticate all your APIs, internal and external; just pick something proven
use separate authentication and authorisation mechanism for external and internal calls i.e., use one set of credentials/mechanism to authenticate external calls and a separate one for internal calls
credentials are always encrypted both in-flight and at-rest
use HTTPS for all APIs, internal or external
do not accept authentication credentials via HTTP GET; use only HTTP headers or HTTP POST/PUT
do not log credentials, not even with debug logging on; have your logging library also act as a catch-all for credentials
make sure your authorisation and authentication mechanism allows granular control and management i.e., you can restrict number of calls per operation, revoke access, issue additional credentials, etc.
consider using a centralized Identity Provider and common libraries
use enhanced security controls for highly sensitive APIs/services (mutual TLS for APIs, MFA for access to services)
use nonces to prevent replay attacks
always design and build with the least privilege principle in mind
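
To illustrate the nonce point, here is a small in-memory sketch of a replay guard. It assumes every request carries a nonce and a timestamp that are both covered by the request signature, and that a single instance handles the traffic; with several instances the registry would have to live in a shared store. The class name and the window parameter are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical replay guard: each nonce is accepted at most once per time window. */
final class NonceRegistry {
    private final ConcurrentHashMap<String, Instant> seen = new ConcurrentHashMap<>();
    private final Duration window;

    NonceRegistry(Duration window) {
        this.window = window;
    }

    /** Returns true only the first time a nonce is presented with a fresh timestamp. */
    boolean accept(String nonce, Instant requestTime, Instant now) {
        if (nonce == null || nonce.isEmpty()) {
            return false;
        }
        // Requests outside the window are rejected outright, which is what
        // makes it safe to forget old nonces below.
        if (Duration.between(requestTime, now).abs().compareTo(window) > 0) {
            return false;
        }
        // Evict nonces whose timestamp has fallen out of the window.
        seen.values().removeIf(t -> Duration.between(t, now).compareTo(window) > 0);
        // putIfAbsent is atomic, so two concurrent replays cannot both win.
        return seen.putIfAbsent(nonce, requestTime) == null;
    }
}
```

Passing the current time in as a parameter keeps the class easy to test; production code would simply call Instant.now().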

General Security Practices

Key things to consider:

don't ever roll your own encryption; you cannot reinvent the wheel in this space
use industry recommended algorithms: AES 256; RSA 2048+, SHA-2 512.
use TLS 1.3+ for transport security
use salt and/or pepper when encrypting or hashing sensitive data; this makes precomputed (rainbow table) attacks impractical and brute-force attacks much harder
check your programming language practices for dealing with sensitive information; for example in Java you must use byte[] rather than String to handle passwords, card numbers, social security numbers, etc.; you must minimize the time the data stays in memory and clear the objects after use
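
As an example of relying on the platform instead of rolling your own encryption, the sketch below uses the JDK's AES-256 in GCM mode and wipes the plaintext byte[] as soon as it has been encrypted. The class name and the "IV followed by ciphertext" output layout are my own assumptions; in a real system the key would come from a vault or HSM rather than being generated in process.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical helper: AES-256-GCM with plaintext wiped after use. */
final class SensitiveData {
    private static final int IV_BYTES = 12;
    private static final int TAG_BITS = 128;

    static SecretKey newAes256Key() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES is not available", e);
        }
    }

    /** Returns IV followed by ciphertext and tag; zeroes the caller's plaintext buffer. */
    static byte[] encryptAndWipe(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv); // a fresh IV for every message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] encrypted = cipher.doFinal(plaintext);
            byte[] out = new byte[iv.length + encrypted.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(encrypted, 0, out, iv.length, encrypted.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        } finally {
            Arrays.fill(plaintext, (byte) 0); // possible with byte[], not with String
        }
    }

    static byte[] decrypt(SecretKey key, byte[] ivAndCiphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, ivAndCiphertext, 0, IV_BYTES));
            return cipher.doFinal(ivAndCiphertext, IV_BYTES, ivAndCiphertext.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed", e);
        }
    }
}
```

The wipe in the finally block is the reason for preferring byte[]: a String is immutable and stays in memory until the garbage collector decides otherwise.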

Quality attributes

As we've seen above, SDLC standards and practices are not always directly related to security. Same applies for quality attributes.
Shortcomings in current design and approach can cause your application to go down, even if it is not caused by a true security problem.

Key things to consider for Performance:

use pooling for connections to expensive resources like DB, APIs, etc.
use thread pools
use caching
use proper collections when manipulating data
use parallel programming if applicable
make sure you understand how your ORM generates queries
avoid loading big resources in memory, use data streams
baseline your performance per (micro)service instance so that you know when to scale
do regular load and performance testing

Key things to consider for Resilience:

use circuit breakers, retries, timeouts, rate-limiting
have clear fallback strategies when dependent APIs are not available
some great resources on the topic: Resilient Systems Part 1 and Resilient System Part 2
make all APIs Idempotent
don't store state within one (micro)service instance; use a distributed cache for that
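
A bounded retry with exponential backoff is the simplest of these patterns to sketch in plain Java. The class and parameter names below are hypothetical; in practice a library such as Resilience4j gives you this plus circuit breakers and rate limiting, and retries are only safe when the call being retried is idempotent.

```java
import java.util.function.Supplier;

/** Hypothetical helper: bounded retries with exponential backoff. */
final class Retry {
    static <T> T withBackoff(Supplier<T> call, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // out of attempts: let the caller's fallback take over
                }
                sleep(delay);
                delay *= 2; // back off so a struggling dependency is not hammered
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", e);
        }
    }
}
```

The hard limit on attempts matters as much as the retry itself: unbounded retries turn a slow dependency into an outage of your own service.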

Key things to consider for Availability and Scalability:

don't make your (micro)services design limit horizontal scaling
plan for failure, have automated mechanisms in place for auto-scaling based on load
consider sharding, read-only replicas
use multi-region deployments

Key things to consider for Observability and Monitoring:

all (micro)services must expose health endpoints covering both application and the underlying container
the health endpoint must return information about all its dependencies: db, encryption service, APIs it connects to, event bus, etc.
leverage the standardized logging to create meaningful operational dashboards
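
The dependency-aware health endpoint can be sketched as a small aggregator. The class name, the UP/DOWN values and the dependency names in the usage note are assumptions; frameworks like Spring Boot Actuator provide this out of the box.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical aggregator behind a health endpoint. */
final class HealthReport {
    static Map<String, String> check(Map<String, BooleanSupplier> dependencies) {
        Map<String, String> report = new LinkedHashMap<>();
        boolean allUp = true;
        for (Map.Entry<String, BooleanSupplier> dependency : dependencies.entrySet()) {
            boolean up;
            try {
                up = dependency.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                // A failing probe counts as DOWN; its details are not exposed.
                up = false;
            }
            report.put(dependency.getKey(), up ? "UP" : "DOWN");
            allUp &= up;
        }
        // Overall status is UP only when every dependency is UP.
        report.put("status", allUp ? "UP" : "DOWN");
        return report;
    }
}
```

You would register one probe per dependency (for example "db", "eventBus", "encryptionService") and serialize the returned map as the endpoint's response.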

Automate

Automate everything. Automation makes it predictable and consistent.
The CI/CD pipeline should be the place where you automate all checks that will assess your (micro)service from a quality perspective.
Tools like Semgrep can bring automation with less effort for standards not obviously suited for automation.

Conclusion

This isn't a final list, it's more like a brain dump. It's a starting point for building a security mindset. Once you apply all these, you are ready to dive deeper.
Applying all these practices won't only give you security benefits, but also more structure and alignment.
This is particularly important in systems that are developing very fast, either brand new or legacy.
You don't need to go with all these from day 1. It might seem overwhelming, especially if you are not used to following common standards and think they will limit your options.
But maybe you can try it for a while and see what happens!
