In this part, we share real-world experience with automated security testing as we practise it at Cossack Labs. As a security solutions company, we treat these discoveries and “healed wounds” as highly valuable 🔐
At Cossack Labs, working to build security products ourselves, we carry out automated security testing wherever we can, using both ready-made third-party testing suites and our own heavily customised solutions.
For instance, we rely on automated testing of Themis, a multi-platform cryptographic services library that is the foundation of many of our products. During development, every PR is checked with CircleCI, Bitrise, and GitHub Actions, which run a set of tests on the whole code base across 13 languages.
📎In this case, every PR gets checked by:
⚙️ clang-tidy as a linter and static analysis tool to diagnose and fix common mistakes and bugs,
⚙️ ASan, MemSan, and UBSan as dynamic analysis tools for detection of memory corruption, memory leaks, use of uninitialised memory, and undefined behaviour—all known to be a source of the most disastrous bugs,
⚙️ Valgrind as a dynamic analysis tool forming another defence line against memory leaks and memory management problems,
⚙️ american fuzzy lop as a fuzzer to exercise uncommon code paths and trigger weird behaviour where you least expect it,
⚙️ several cryptography-specific tests (like NIST statistical test suite and cross-compatibility tests) to ensure that no errors in cryptographic dependencies and random number generation might creep into this exact build.
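To give a flavour of what a crypto-specific check looks like, here is a minimal Python sketch in the spirit of the frequency (monobit) test from the NIST SP 800-22 statistical test suite, applied to the OS random number generator. This is an illustration of the idea, not Themis's actual test code.

```python
# Minimal frequency (monobit) test in the spirit of NIST SP 800-22.
# Illustrative sketch only, not Themis's actual test code.
import math
import os

def monobit_p_value(data: bytes) -> float:
    """P-value for the hypothesis that the bits of `data` are random."""
    ones = sum(bin(byte).count("1") for byte in data)
    n = len(data) * 8
    # Normalised deviation of the +1/-1 bit sum from zero.
    s_obs = abs(2 * ones - n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

if __name__ == "__main__":
    p = monobit_p_value(os.urandom(1 << 16))
    # NIST uses a 0.01 significance level: p >= 0.01 passes.
    print("PASS" if p >= 0.01 else "FAIL")
```

A single statistical test like this cannot prove an RNG is good, but it reliably catches gross failures, such as a stuck or misconfigured entropy source, before they reach a build.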
We use tools like ASan and Valgrind for testing the very core of the Themis library, which is written in C. But since there are also a number of wrappers written in other popular languages, we use standard language-specific tooling for them.
For example, Rust code is inspected by Clippy, Go code is vetted by a number of linters, and of course each language wrapper has an extensive test suite, often larger than the code it verifies. Testing mobile wrappers for iOS and Android has its own difficulties: the Android emulator needs 5–10 minutes just to start up, and iOS testing requires macOS.
You can dig into the crypto-specific tests and the approach to running all of this in CircleCI in the Themis GitHub repository.
Having worked for many years on products that underwent extensive security testing, we see how often people are reluctant to run additional test infrastructure for the sake of security.
We’re still strong believers that even though ensuring code security always brings usability trade-offs, it’s worth it.
Security tests are not your regular unit tests or functional tests: they take longer to run, and sometimes they need considerable time to accumulate data. Depending on the type and criticality of development, you might be tempted to run them in parallel with, and non-blocking to, the main test pipeline. But this is laziness at its worst: once you are (finally) getting serious about security, security tests must be blocking.
For example, when we added a new passphrase API to Themis, this slowed down the full unit-test suite by over a minute because passphrase-based key derivation functions are purposefully slow for security reasons. We even added a test to ensure that the API is slow enough.
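The Themis test itself is not shown here, but the idea of asserting that a KDF is slow enough can be sketched in a few lines. This hedged example uses Python's standard-library PBKDF2 as a stand-in; the iteration count and the time floor are illustrative values, not Themis parameters.

```python
# Sketch of a "slow enough" test for passphrase-based key derivation.
# Uses stdlib PBKDF2 as a stand-in for the Themis KDF; the numbers below
# are illustrative, not Themis's actual parameters.
import hashlib
import time

def derive_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, iterations)

def test_kdf_is_slow_enough(min_seconds: float = 0.02) -> None:
    start = time.perf_counter()
    derive_key("correct horse battery staple", b"fixed-salt-for-test")
    elapsed = time.perf_counter() - start
    # Assert a *floor* on derivation time: a KDF that suddenly becomes
    # fast has probably lost its security margin.
    assert elapsed >= min_seconds, f"KDF too fast: {elapsed:.3f}s"

test_kdf_is_slow_enough()
print("ok")
```

Note the inverted assertion: ordinary performance tests guard against code getting slower, while this one guards against it getting faster.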
A slower testing process sometimes leads to the detection of dependency failures and vulnerabilities not directly related to your code. For instance, we had to deal with a non-transparent change in the compiler's behaviour towards external dependencies (C imports in Go changed significantly from Go 1.3 to Go 1.6), which could have manifested as a serious security issue if testing and benchmarking had not included volume tests on input. Had we not tested this beforehand, it would have been a ticking time bomb, even though Go is known to be an extremely safe language when it comes to memory, and such issues should never have emerged.
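A volume test of this kind is simple to write: run the same round-trip over exponentially growing inputs, so that size-dependent failures in a dependency surface in CI. The sketch below uses a zlib compress/decompress round-trip as a stand-in for a call into a native crypto library; it is an illustration of the pattern, not our actual test.

```python
# Volume-test pattern: exercise a round-trip over exponentially growing
# inputs to catch size-dependent failures. The zlib round-trip stands in
# for an encrypt/decrypt call into a native library (illustrative only).
import os
import zlib

def roundtrip(data: bytes) -> bytes:
    return zlib.decompress(zlib.compress(data))

def test_roundtrip_volumes(max_power: int = 24) -> None:
    # Sizes from 1 byte up to 16 MiB, stepping by factors of 16.
    for power in range(0, max_power + 1, 4):
        data = os.urandom(1 << power)
        assert roundtrip(data) == data, f"round-trip failed at {1 << power} bytes"

test_roundtrip_volumes()
print("ok")
```

Exponential steps keep the test fast while still crossing the internal buffer-size and integer-width boundaries where such bugs tend to hide.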
An important aspect of automated testing is the engine which performs the tests. As we mentioned, we use several CI services for our mature products to ensure wide coverage of platforms.
A testing matrix of various OS versions and diverse toolchains can uncover issues that surface only in a particular environment. This is especially important on mobile platforms like Android and iOS, where you have less control over the user environment.
While public CI services are great for public repositories with stable products, they may not be the most efficient tool for sophisticated products under active research and development (i.e. a constant hot mess), where test suites evolve with the product and test scenarios change overnight.
Want to learn more about maintaining and testing cryptographic libraries? Check out the video or slides from Anastasiia Voitova's talk “Maintaining cryptographic library for 12 languages”.
For that, internally we use the BuildBot continuous integration framework, which gives us very high flexibility: blocking and non-blocking scenarios, containers, types of deployments, which artifacts get gathered; everything is extremely configurable (and rather laborious). While regular tests can be written to fit the system, testing security properties with third-party instrumentation sometimes requires complex integration scenarios, and a flexible CI framework helps leave no stone unturned.
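BuildBot masters are configured in Python, which is where that flexibility comes from. The fragment below is a hedged sketch of what wiring blocking security steps into a builder can look like; the repository URL, worker name, branch, and make targets are all hypothetical, not our actual configuration.

```python
# Illustrative fragment of a BuildBot master.cfg. All names (repo URL,
# worker, branch, make targets) are hypothetical placeholders.
from buildbot.plugins import schedulers, steps, util

factory = util.BuildFactory()
factory.addStep(steps.Git(repourl="https://github.com/example/project.git",
                          mode="incremental"))
# Blocking steps: haltOnFailure makes any failure fail the whole build,
# so security tests cannot be quietly skipped.
factory.addStep(steps.ShellCommand(name="unit tests",
                                   command=["make", "test"],
                                   haltOnFailure=True))
factory.addStep(steps.ShellCommand(name="sanitizer tests",
                                   command=["make", "test-asan"],
                                   haltOnFailure=True))

c = BuildmasterConfig = {}
c["builders"] = [
    util.BuilderConfig(name="security-tests",
                       workernames=["worker-1"],
                       factory=factory),
]
c["schedulers"] = [
    schedulers.SingleBranchScheduler(
        name="on-commit",
        change_filter=util.ChangeFilter(branch="main"),
        builderNames=["security-tests"]),
]
```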
As with code reviews in traditional software development, having a human eye on code changes is a must: some behaviours simply cannot be detected automatically, and a third-party review of a major release is a crucial security practice. But automated security testing can spot both regressions and obvious flaws, and judging by the incidents and security failures that ended in disasters, it is still something many developers have yet to start doing.
For the larger part of our careers, security issues have been the first thing that comes to mind when we develop and test software. This is rather backwards compared to the wider developer community, which is focused on shipping software that is fast (always), consistent (sometimes), and reliable (rarely).
With news of yet another breach, the maxim “everything will be broken” rings as true as ever, with little chance of change for the better in the foreseeable future, given how carelessly security is treated almost everywhere. And beyond usability trade-offs and plain sloppiness, it is always a question of knowing how and what to test.
Well, now you have a few reference points, and we hope they prove useful. Should you need further support with these or other data security engineering and sensitive-data risk management issues, look through our products and posts, and feel free to drop us a line for a consultation.