
Tomás Garcia


Bash script to log and restart docker container based on cpu usage

I wrote this post to share a recent experience at work: a problem I ran into and my temporary solution to it.
I'm new to the tech world, so I'd be grateful for any suggestions for improvement, and very pleased if this post is useful to someone else.
A few months ago at my company we discovered that one of our Docker containers was having problems with CPU usage: out of nowhere, its CPU usage would increase abruptly. While the dev team searched for the bug in the code, I implemented a temporary solution. First, I wrote a script that logs the container's CPU usage every 5 seconds:

#!/bin/bash
logs=/var/log/process_name.log
container_name=container_name

while :
do
  # Read the current CPU usage of the container, e.g. "12.34%"
  var=$(docker stats --no-stream --format "{{.CPUPerc}}" "$container_name")

  if [[ -z "$var" ]]; then
    echo "Container ${container_name} does not exist"
    echo "$(date +'%d-%m-%Y %H:%M') | Container ${container_name} does not exist" >> "$logs"
  else
    # Keep only the integer part of the percentage ("12.34%" -> "12")
    percent="${var%%.*}"

    echo "Current CPU usage: ${percent}"

    # Append the current CPU usage to the log file
    echo "$(date +'%d-%m-%Y %H:%M') | ${percent}" >> "$logs"
  fi
  sleep 5
done

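The logging script depends on docker stats returning a value shaped like "12.34%". As a minimal sketch, the conversion to an integer can be isolated in a small helper that also rejects empty or non-numeric readings. The function name cpu_to_int is hypothetical (it is not part of the original script), and the sample strings below are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: turn a CPUPerc string such as "12.34%" into an
# integer percentage. Returns non-zero if no usable number is found.
cpu_to_int() {
  local raw="$1"
  raw="${raw%\%}"     # drop the trailing percent sign, if any
  raw="${raw%%.*}"    # drop the fractional part
  case "$raw" in
    ''|*[!0-9]*) return 1 ;;   # empty or not purely digits
  esac
  echo "$raw"
}

cpu_to_int "12.34%"                  # prints 12
cpu_to_int "305.10%"                 # prints 305 (multi-core usage can exceed 100)
cpu_to_int "" || echo "no reading"   # prints "no reading"
```

Checking the return code before logging keeps non-numeric text out of the numeric column that the restart script reads later.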

After that I created a supervisor config to run this process:

[program:process]
command=/opt/scripts/script.sh
autostart=true
autorestart=true
stderr_logfile=/var/log/process.err.log
stdout_logfile=/var/log/process.out.log

Then I wrote a script to restart the problematic container, based on the logs of the previous script:

#!/bin/bash
container_name=container_name
logs_evaluated_lines=5
logs=/var/log/process_name.log
max_cpu=90

while :
do
    counter=0

    # Count how many of the last 'logs_evaluated_lines' log lines are at or above 'max_cpu'
    while IFS= read -r value
    do
        # The CPU value starts at column 20, after the "dd-mm-YYYY HH:MM | " prefix
        percent=$(echo "$value" | cut -c 20-)
        # Skip non-numeric entries such as "Container Restarted"
        if [[ "$percent" =~ ^[0-9]+$ ]] && (( percent >= max_cpu )); then
            counter=$((counter+1))
        fi
    done < <(tail -n "$logs_evaluated_lines" "$logs")

    echo "$(date +'%d-%m-%Y %H:%M') | Log lines at or above ${max_cpu}%: ${counter}"
    echo "$(date +'%d-%m-%Y %H:%M') | Log lines analyzed: ${logs_evaluated_lines}"
    if (( counter == logs_evaluated_lines )); then
        echo "$(date +'%d-%m-%Y %H:%M') | CPU Full usage"
        echo "$(date +'%d-%m-%Y %H:%M') | Restarting Container"
        docker restart "$container_name"
        echo "$(date +'%d-%m-%Y %H:%M') | Container Restarted"
        echo "$(date +'%d-%m-%Y %H:%M') | Container Restarted" >> "$logs"
    else
        echo "$(date +'%d-%m-%Y %H:%M') | CPU Usage OK"
    fi
    echo "$(date +'%d-%m-%Y %H:%M') |"
    sleep 5
done

This script evaluates the last 'logs_evaluated_lines' lines of the log and restarts the container only if every one of them shows a CPU usage at or above the 'max_cpu' variable.
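To make the restart rule easier to try out without Docker, here is a rough sketch of the counting step on its own, run against a throwaway file. The function name count_high_lines and the sample readings are hypothetical. It assumes each log line has the form "dd-mm-YYYY HH:MM | value", as written by the logging script.

```shell
#!/bin/bash
# Hypothetical helper: print how many of the last N lines of a log file
# report an integer CPU value at or above a threshold.
count_high_lines() {
  local file="$1" lines="$2" threshold="$3"
  tail -n "$lines" "$file" | {
    count=0
    while IFS= read -r line; do
      value="${line##* | }"              # text after the last " | "
      case "$value" in
        ''|*[!0-9]*) continue ;;         # skip non-numeric entries
      esac
      if [ "$value" -ge "$threshold" ]; then
        count=$((count + 1))
      fi
    done
    echo "$count"
  }
}

# Demo with made-up readings in a temporary file
sample=$(mktemp)
printf '%s\n' \
  "01-01-2024 10:00 | 12" \
  "01-01-2024 10:00 | 95" \
  "01-01-2024 10:00 | 97" \
  "01-01-2024 10:00 | Container Restarted" \
  "01-01-2024 10:01 | 99" > "$sample"

count_high_lines "$sample" 5 90   # prints 3
rm -f "$sample"
```

Because the "Container Restarted" line is not numeric, it never counts as a high reading, so a restart is not triggered again until the evaluated window is filled with fresh high values.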

Top comments (1)

SemoTech

My thanks @tomasggarcia , this is exactly what I needed following the discovery that one of my Docker containers (homebridge running on Ubuntu server) was using 300% CPU, and raising the machine temperature to 140deg! Don't really want to monitor the stats live all the time, or worry about it, so needed something to restart the container if that happened again.

Could you be so kind as to advise what each of the files should be called, what permissions they need, and where they need to go on an Ubuntu 20.04 server? Also, are any crons needed, and if so how should they be setup? Much appreciated in advance.