Alexey Melezhik

Ssh-bulk-check - Super flexible check of a group of nodes over ssh.

Sometimes we need to verify that a group of hosts is in the right state. Writing tests takes time. Existing tools like goss or inspec are good, but they lack flexibility: there is always something that the provided API does not cover.

The new Sparrow6 plugin ssh-bulk-check lets you write check scripts that validate the state of a group of ssh hosts without such restrictions.

It's extremely flexible and comprehensive because it relies on the Sparrow6 Task Check DSL and plain Bash scripts, which can cover almost any use case.

Let me show how it works.

Install Sparrow6

zef install https://github.com/melezhik/Sparrow6.git

Create Sparrow6 Repository

export SP6_REPO=file:///tmp/repo
mkdir -p /tmp/repo
s6 --repo_init /tmp/repo
git clone https://github.com/melezhik/sparrow-plugins.git
cd sparrow-plugins && find -maxdepth 2 -mindepth 2 -name sparrow.json -execdir s6 --upload \;

Ssh hosts state

Say, we have a group of nodes, where every node has to have the same state:

  • directory /var/data exists

  • size of /tmp is no more than 1 GB

  • nginx runs with no more than 2 workers.

This is a rather contrived example, but you probably get the idea.

First, let's create a shell script that runs the test commands:

Cmd.sh

mkdir files

nano files/cmd.sh

echo '==='

echo "check /var/data dir"
  ls -d /var/data && echo "/var/data is a directory"
echo "end check"

echo '==='

echo "check /tmp/ dir size"
  sudo du -sh /tmp/
echo "end check"


echo '==='

echo "check if nginx is alive"
  ps uax| grep nginx| grep -v grep
echo "end check"

echo '==='
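
Every check in cmd.sh follows the same pattern: a "check ..." line, the command output, then "end check". As a minimal sketch, that pattern could be factored into a small Bash helper. The names run_check and dir_exists are hypothetical and not part of Sparrow6 or the plugin. Two differences from the script above: sudo is dropped so the sketch runs unprivileged, and the directory check uses a plain `[ -d ]` test instead of `ls -d`, so it prints only the confirmation line.

```shell
#!/usr/bin/env bash
# run_check is a hypothetical helper, not part of Sparrow6 or ssh-bulk-check.
# It prints the same marker lines that cmd.sh writes by hand.
run_check() {
  local title=$1
  shift
  echo '==='
  echo "check $title"
  # Run the command; ignore its exit code so "end check" is always printed.
  "$@" || true
  echo "end check"
}

# Hypothetical helper for the first check: print the confirmation line
# only when the path is a directory.
dir_exists() {
  [ -d "$1" ] && echo "$1 is a directory"
}

run_check "/var/data dir" dir_exists /var/data
run_check "/tmp/ dir size" du -sh /tmp/
run_check "if nginx is alive" sh -c 'ps uax | grep nginx | grep -v grep'
echo '==='
```

Because the closing marker is printed even when a command fails, every block in the output stays properly delimited.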


Now let's define the check rules that analyze the output of the cmd.sh script. I use the Sparrow6 Task Check DSL with range expressions to tell one check from another:

State.check

between: { 'check /var/data dir' } { end \s+ check }

  /var/data is a directory

end:

note: ===

between: { '/tmp/ dir size' } { end \s+ check }

  regexp: ^^ \d+(\w+) \s+ '/tmp/'

  generator: <<HERE
  !perl

    if (@{matched()}){
      my $order = capture()->[0];
      print "assert: ", ( $order eq 'G' ? 0 : 1 ), " the size of /tmp dir is less then 1 GB\n";
    }

  HERE

end:

note: ===

between: { 'check if nginx is alive' } { end \s+ check }

  /usr/sbin/nginx -g daemon on; master_process on;

  regexp: ^^ 'www-data' \s+ .* \s+ worker \s+ process $$

  generator: <<HERE
  !perl

    if (my $cnt = @{matched()}){
      print "assert: ", ( $cnt <= 2 ? 1 : 0  ), " no more 2 nginx worker launched\n";
    }

  HERE

end:
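
To make the two generators easier to follow, here is roughly the same decision logic written in plain Bash. This is only an illustration of what the asserts compute, not how Sparrow6 evaluates them. The function names are made up for this sketch, and the Bash patterns are approximations of the Task Check regexps.

```shell
#!/usr/bin/env bash
# Both functions are illustrative names, not part of Sparrow6.

# Mirrors the first generator: the check fails only when the unit suffix
# reported by "du -sh" is "G". Returns 2 when the line does not look like
# "<digits><unit> /tmp/" at all.
tmp_size_ok() {
  local line=$1
  local re='^[0-9]+([A-Za-z]+)[[:space:]]+/tmp/'
  if [[ $line =~ $re ]]; then
    [[ ${BASH_REMATCH[1]} != G ]]
  else
    return 2
  fi
}

# Mirrors the second generator: count the worker lines owned by www-data
# and succeed when there is at least one and no more than the limit (2).
nginx_workers_ok() {
  local ps_output=$1 max=${2:-2} count
  count=$(printf '%s\n' "$ps_output" |
    grep -Ec '^www-data[[:space:]]+.*[[:space:]]+worker[[:space:]]+process$')
  (( count > 0 && count <= max ))
}

# Sample lines in the shape produced by cmd.sh.
sample='www-data  1244  0.0  0.0 143300  6264 ?        S    18:32   0:00 nginx: worker process
www-data  1245  0.0  0.0 143300  6264 ?        S    18:32   0:00 nginx: worker process'

tmp_size_ok "40K        /tmp/" && echo "size check passed"
nginx_workers_ok "$sample" && echo "worker check passed"
```

Note that, like the Perl generator, the size check only treats a "G" suffix as too large, so a value reported in terabytes would still pass.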


We are almost set. Now let's run all our checks against the ssh hosts. We're going to use the Sparrowdo task runner in --localhost mode, because the tests run from localhost using the ssh/sshpass utilities:

Sparrowdo scenario

sparrowfile

#!perl6

task-run "check my hosts", "ssh-bulk-check", %(
  cmd => "files/cmd.sh",
  state => "files/state.check",
  hosts => [ "192.168.0.1" ],
);

To run the test, just say:

sparrowdo --localhost

Here is the result:

20:01:46 04/29/2019 [check my hosts] check host [192.168.0.1]
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] check /var/data dir
20:01:46 04/29/2019 [check my hosts] /var/data
20:01:46 04/29/2019 [check my hosts] /var/data is a directory
20:01:46 04/29/2019 [check my hosts] end check
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] check /tmp/ dir size
20:01:46 04/29/2019 [check my hosts] 40K        /tmp/
20:01:46 04/29/2019 [check my hosts] end check
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] check if nginx is alive
20:01:46 04/29/2019 [check my hosts] root      1243  0.0  0.0 140628  1500 ?        Ss   18:32   0:00 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
20:01:46 04/29/2019 [check my hosts] www-data  1244  0.0  0.0 143300  6264 ?        S    18:32   0:00 nginx: worker process
20:01:46 04/29/2019 [check my hosts] www-data  1245  0.0  0.0 143300  6264 ?        S    18:32   0:00 nginx: worker process
20:01:46 04/29/2019 [check my hosts] end check
20:01:46 04/29/2019 [check my hosts] ===
20:01:46 04/29/2019 [check my hosts] end check host [192.168.0.1]
[task check] ====================================================
[task check] check results
[task check] ====================================================
[task check] stdout match (r) </var/data is a directory> True
[task check] ===
[task check] stdout match (r) <^^ \d+(\w+) \s+ '/tmp/'> True
[task check] <the size of /tmp dir is less then 1 GB> True
[task check] ===
[task check] stdout match (r) </usr/sbin/nginx -g daemon on; master_process on;> True
[task check] stdout match (r) <^^ 'www-data' \s+ .* \s+ worker \s+ process $$> True
[task check] <no more 2 nginx worker launched> True

For this example I use only one host, 192.168.0.1, but we can add as many hosts as we need; the plugin will check them all in a single run.
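
The per-host framing in the output ("check host [...]" and "end check host [...]") can be pictured as a simple loop over the host list. The sketch below is only a mental model in plain Bash, not the plugin's actual implementation. The bulk_run function and the SSH_CMD override are hypothetical names introduced for this example, and the second address in the usage comment is made up.

```shell
#!/usr/bin/env bash
# Assumption: SSH_CMD is a hypothetical override, so the loop can be tried
# with a stub instead of a real ssh client.
SSH_CMD=${SSH_CMD:-ssh}

# bulk_run SCRIPT HOST... (hypothetical helper)
# Feeds the local script to a remote shell on every host, one by one.
bulk_run() {
  local script=$1
  shift
  local host
  for host in "$@"; do
    echo "check host [$host]"
    # "bash -s" makes the remote shell read the script from stdin.
    "$SSH_CMD" "$host" 'bash -s' < "$script"
    echo "end check host [$host]"
  done
}

# Usage (the second address is an invented example):
#   bulk_run files/cmd.sh 192.168.0.1 192.168.0.2
```

The combined output of such a loop is what the rules in state.check are then matched against.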

It's dead easy to write almost any check with this plugin. Please share your use cases, and I'd be glad to help you use ssh-bulk-check for them.
