So, we know what an SBOM is and why generating one is becoming more and more important. It is time to look at some tools that can help us create and validate these reports.
I do not have very deep experience with SBOM tools. However, from what I have heard and learned so far, this area is still not fully covered. Standards and approaches exist, but if we want an operational pipeline with all its elements, we need to tinker with the existing tools a little more than we would like.
The first tool we discuss here is ScanCode.io.
The tool
What is ScanCode?
Simply put, it is a scanner and code analysis tool that lets us scan a codebase for origins and licenses. In general, ScanCode collects information about components and their licenses as part of the Software Composition Analysis (SCA) process.
Installation
ScanCode offers a few installation options; here we cover the container version. After we clone the repository
$ git clone https://github.com/nexB/scancode.io.git
we should create the .env file using the provided Makefile
$ make envfile
This creates the file together with a generated secret key. In my case it looks like this
$ cat .env
SECRET_KEY="Rxg+cZJQDOdinwXAPc/D2d2QyEODpl5xz4NJp5f/aSSDmf106a"
We are ready to build and run the tool.
In fact, we have a docker-compose template at our disposal. This template contains a few elements
- db (postgres)
- redis
- web (app)
- worker
- nginx
Nginx is our entry point. Behind it, the app works with the workers, where the actual scans are executed. At the back we have Redis and PostgreSQL to keep the data.
Build and run
For anyone who knows Docker, this part is simple and clear. First, we need to build the container images
$ docker-compose build
In fact, only the app image is built in this step. Still, the build takes a lot of time, and the resulting image is quite large
$ docker images
REPOSITORY          TAG       IMAGE ID       CREATED          SIZE
scancodeio_web      latest    a6a6380ef6f8   34 seconds ago   2.31GB
scancodeio_worker   latest    a6a6380ef6f8   34 seconds ago   2.31GB
When the build is finished, we are ready to run our stack
$ docker-compose up -d
And here we have a small issue (well, maybe "issue" is too big a word). Compose exposes ports 80 and 443 for the Nginx service, but the Nginx server is configured for port 80 only.
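One way to see which of the two published ports actually answers is to send a request to each of them. The sketch below is only an illustration using the Python standard library: the helper name http_responds is hypothetical, and the URLs in the usage comment assume the stack runs on localhost as in this article. It sends a real HTTP request rather than just opening a TCP connection, because Docker may accept a connection on a published port even when nothing listens behind it inside the container.

```python
import http.client
import urllib.error
import urllib.request

# Hypothetical helper: ignore any system proxy settings, since we probe localhost.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_responds(url: str, timeout: float = 3.0) -> bool:
    """Return True if a server answers the request at all (any HTTP status)."""
    try:
        with _OPENER.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, reset, timed out, TLS failure, or malformed reply.
        return False


# Usage (assumes the compose stack from this article is running locally):
#   http_responds("http://localhost:80/")
#   http_responds("https://localhost:443/")
```

If the first call returns True and the second returns False, that matches the behaviour described above: the port is published, but nothing serves HTTPS on it.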
GUI
When the stack starts, we can open the GUI console in the browser by entering http://localhost:80.
The console is simple but nice, clear and comfortable to work with.
Set up the project
It is time to set up our first project. To begin with, we will set up a simple scan. Many of us use the Python container image, correct? Let's see what we can learn about python:latest!
After we click the "New project" button, we can select the project type on the right side.
These are predefined. We can create our own too.
As we wish to scan a Docker image, we select docker, and now it is time to configure the project. Configuration is very simple.
I entered three values:
- Project's name
- URL: docker://python:latest (this will connect to Docker Hub and fetch the proper image)
- Pipeline: docker in this case.
And that's it!
Let's click Create.
Processing
On the main screen we can observe the progress of the project's execution. In my case I had to refresh the view manually, but that is not an issue.
As this task took a lot of time, I created a second project, this time for Python based on Alpine Linux. This execution was queued. Can we run these tasks simultaneously? Well, yes, but it requires an explicit configuration setting.
The first run took more than one hour on my machine. That is a lot.
So here we see a downside of this process: it can be very time-consuming. Therefore, if we want to make it part of a CI/CD pipeline, we need to be careful and account for the time the execution may need.
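In a pipeline, that usually means waiting for the scan with an explicit time budget instead of blocking forever. The following is a generic sketch of such a wait loop, not something ScanCode.io provides: the function name wait_until_done and its parameters are hypothetical, and the check callable stands for whatever status lookup we have available.

```python
import time
from typing import Callable


def wait_until_done(
    check: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll check() until it returns True or the time budget is used up.

    Returns True when check() succeeded, False when the deadline passed first.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(interval_s, remaining))


# Usage sketch: give a scan two hours, checking once a minute.
# scan_is_finished is a placeholder for a real status check.
#   ok = wait_until_done(scan_is_finished, timeout_s=2 * 60 * 60, interval_s=60)
```

A False result lets the pipeline fail the job with a clear message instead of hanging until the CI runner kills it.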
Report
Now the juicy part. When I generated the report, I started digging deeper and looking around, and after a long time (and I mean it!) I realised: "Hey! You are writing an article. Write the article, then!" The reports generated by ScanCode are simply great.
Ok, let's navigate through them, shall we?
Scan's summary
First, let's click the green Success button in the row where our scan is. This report shows some, let's say, meta information about the scan process.
We see the status of the run, info about the task, execution time, dates and resources. Quite a useful summary.
Here we see more details about the execution itself: what steps were performed, how long they took, etc.
Scan's details
And here is the core of our report (it is not an SBOM yet!).
At the top of the screen we can see the UUID and work directory information. Below we have some numbers about the execution and buttons to download the report in different formats. After that we have information about the input artifact, in this case python:latest. The project data shows a lot of information about the Docker image, with layers, descriptions, commands etc. Very useful.
The next section shows a lot of visualised data.
Information about packages.
Dependency information. Here we can learn everything about the dependencies discovered during the scan.
Finally, codebase resources. As the picture above shows, multiple scopes are available for us to analyze.
Let's go into some of the details now.
In the codebase resources, the HOLDER category lists some of the holders, and among them is Mr. Vinay Sajip. Let's see what his contribution is here.
Hover over the proper element and click
Here we have details about every finding.
Now things get even more interesting. Click on any Path element and...
We go into the file! Let's find the information about licenses. Click Licenses in the Detected values list
Ok, let's look at something else now!
Select Other in PACKAGE LICENSE EXPRESSION
We can check every individual package and learn what license type it uses.
Another example: go to PACKAGE TYPE and click pypi
As we can see, the information detected by ScanCode.io is very detailed. We took one of the most popular images out there and we are able to break it down into its smallest elements.
Download data
Finally, we can download reports in different formats.
Click one of the buttons from the picture below
And it will simply be downloaded :)
The report can be downloaded as a JSON or Excel file. Two more options format the report according to the standards: one for SPDX (Software Package Data eXchange) and one for CycloneDX (and these are our SBOMs).
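Once a CycloneDX file is downloaded, it is plain JSON, so it is easy to post-process. The sketch below walks the components array and prints name, version and license for each entry. The function name list_components is hypothetical, and the sample document is made up for illustration (it is not output from the scan in this article); real exports contain many more fields.

```python
import json


def list_components(sbom: dict) -> list[tuple[str, str, str]]:
    """Return (name, version, licenses) for every component in a CycloneDX JSON document."""
    rows = []
    for component in sbom.get("components", []):
        found = []
        for entry in component.get("licenses", []):
            # A CycloneDX license entry is either an SPDX expression
            # or a license object carrying an "id" or a "name".
            license_obj = entry.get("license", {})
            value = entry.get("expression") or license_obj.get("id") or license_obj.get("name")
            if value:
                found.append(value)
        rows.append((
            component.get("name", ""),
            component.get("version", ""),
            ", ".join(found) or "unknown",
        ))
    return rows


# Made-up sample, only to show the shape of the data.
SAMPLE = json.loads("""
{
  "bomFormat": "CycloneDX",
  "components": [
    {"type": "library", "name": "example-lib", "version": "1.0.0",
     "licenses": [{"license": {"id": "MIT"}}]},
    {"type": "library", "name": "other-lib", "version": "2.3.4"}
  ]
}
""")

for name, version, licenses in list_components(SAMPLE):
    print(f"{name} {version}: {licenses}")
```

For a real report, replace SAMPLE with json.load() on the downloaded file.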
So far we have scanned Docker images. I also ran a test with a code bundle: my very old Python Alexa Skill script for AWS Lambda. It has a few dependencies; let's take a look at the requirements.txt file
ask-sdk-core==1.9.0
Pillow
That's it. In the function code I import a few libraries
import logging
import json
import requests
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.dispatch_components import AbstractExceptionHandler
from ask_sdk_core.utils import is_request_type, is_intent_name
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_model.ui import SimpleCard
from ask_sdk_model import Response
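A side note on the requirements file above: ask-sdk-core is pinned to an exact version, while Pillow is not, so the bundle gets whatever Pillow version is current at build time, and the scan result can differ between builds. A minimal sketch for spotting such entries before bundling could look like this. The helper name unpinned_requirements is hypothetical, and the parsing is deliberately simple (it treats only == as a pin and skips option lines such as -r).

```python
import re

# A line counts as pinned only when it uses an exact "==" version.
PINNED = re.compile(r"^\s*[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*\S+")


def unpinned_requirements(text: str) -> list[str]:
    """Return the names of requirements that are not pinned with '=='."""
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            # Keep only the package name, without specifiers, extras or markers.
            names.append(re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0])
    return names


print(unpinned_requirements("ask-sdk-core==1.9.0\nPillow\n"))
```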
I created a bundle and scanned it with these settings
So, I uploaded the zip directly through the GUI. Of course, I could push it to my artifact storage and scan it from there (similar to what we did with Docker images).
After the scan is finished, I have a very interesting report
Let's take a look at the dependencies
And packages
So, this was a quick review of ScanCode.io. The tool is very easy to use, very easy to maintain and, what is very important for teams, very easy to start with.
As creating an SBOM might become obligatory very soon, it is a good idea to start preparing for it.
However, there is one thing I wasn't able to run successfully: the vulnerability scan.
But we will handle it in the next episode.
It is worth mentioning that ScanCode.io provides an API. This means it can serve in security pipelines and provide its functionality on demand, without the delays needed for provisioning. The API gives the flexibility and scalability needed for it to act as an important tool in an organization's governance and compliance.
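To give a feeling for how this could look from a pipeline, here is a hedged sketch that only builds the HTTP request for creating a project and does not send it. It assumes the /api/projects/ endpoint and the name, input_urls, pipeline and execute_now fields as I understand them from the ScanCode.io REST API documentation; please verify them against the version you run. The function name build_create_project_request and the base URL are hypothetical, and the pipeline and input values mirror what we entered in the GUI earlier.

```python
import json
import urllib.request
from typing import Optional


def build_create_project_request(
    base_url: str,
    name: str,
    input_url: str,
    pipeline: str,
    api_token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a project.

    Endpoint and field names are assumptions; check the API docs of your version.
    """
    payload = {
        "name": name,
        "input_urls": input_url,
        "pipeline": pipeline,
        "execute_now": True,
    }
    headers = {"Content-Type": "application/json"}
    if api_token:
        # Token authentication is only needed when it is enabled on the server.
        headers["Authorization"] = f"Token {api_token}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/projects/",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_create_project_request(
    "http://localhost", "python-latest", "docker://python:latest", "docker"
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```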