If you've ever spent too much time staring at a wall of text in a log file, trying to figure out why your application is crashing, this post is for you.
While there are powerful tools like ELK or Datadog for enterprise-scale logging, sometimes you just need something quick, local, and no-nonsense to parse a log file on your machine or a remote server.
Today, I'm sharing a simple Log Analyser script I built in Bash that categorizes logs and surfaces the most frequent issues automatically.
The Problem
Scanning logs manually is tedious and error-prone. You might run grep "ERROR" file.log and then realize you have 500 lines of the same database connection error, hiding a single "File not found" error that is the actual root cause.
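To make that concrete, here's the kind of dead end the naive approach leads to (file.log and the count here are hypothetical):
grep -c "ERROR" file.log      # prints something like 501... but which errors?
grep "ERROR" file.log | head  # ...and the first screenful is the same connection error repeated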
The Solution: log_analyser.sh
I created a lightweight script that does three things:
- Counts total occurrences of INFO, WARNING, and ERROR levels.
- Aggregates unique messages so you can see which specific error is occurring most often.
- Displays a clean summary.
The Code
Here is the heart of the script:
#!/bin/bash
# ... basic validation logic ...
LOG_FILE="$1"   # path to the log file, passed as the first argument

LOG_LEVELS=("ERROR" "WARNING" "INFO")
for LEVEL in "${LOG_LEVELS[@]}"; do
    COUNT=$(grep -ic "$LEVEL" "$LOG_FILE")
    echo -e "\n[$LEVEL] Total occurrences: $COUNT"
    if [ "$COUNT" -gt 0 ]; then
        echo "Top unique $LEVEL messages:"
        grep -i "$LEVEL" "$LOG_FILE" | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ ]*//' | sort | uniq -c | sort -rn | head -n 5
    fi
done
How It Works: The "Power Pipeline"
The most interesting part of this script is the command pipeline used to extract unique messages:
grep -i "$LEVEL" "$LOG_FILE" | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ ]*//' | sort | uniq -c | sort -rn | head -n 5
Let's break it down:
- grep -i "$LEVEL": Finds lines containing the log level (case-insensitive).
- awk '{$1=$2=$3=""; print $0}': This is a neat trick! It clears the first three fields (usually Date, Time, and Level) so we only look at the actual message content (see the quick check after this list).
- sed 's/^[ ]*//': Trims the leading whitespace left behind by awk.
- sort | uniq -c: Sorts the messages and then counts how many times each unique message appears.
- sort -rn: Sorts the results numerically in reverse order (highest count first).
- head -n 5: Only shows us the top 5 most frequent messages.
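To see the awk and sed steps in isolation, here's a quick check you can run in any shell; the sample line is made up, but it follows the DATE TIME LEVEL message layout the script expects:
echo "2024-05-01 10:31:12 ERROR Database connection failed." | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ ]*//'
# prints: Database connection failed.
The awk assignment rebuilds the line with empty first fields, which is why the sed trim is needed afterwards.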
How to Use It
- Clone the script (or copy it from above).
- Make it executable:
chmod +x log_analyser.sh
- Run it against any log file:
./log_analyser.sh sample.log
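If you don't have a log handy, you can generate a small test file and run the script against it. The entries below are invented, but they follow the DATE TIME LEVEL message layout the script assumes and roughly match the summary shown next:
cat > sample.log <<'EOF'
2024-05-01 10:30:01 INFO Server started.
2024-05-01 10:31:12 ERROR Database connection failed.
2024-05-01 10:31:45 ERROR Database connection failed.
2024-05-01 10:32:07 ERROR Database connection failed.
2024-05-01 10:32:30 ERROR File not found: /var/www/html/index.php
2024-05-01 10:33:02 WARNING Memory usage high.
2024-05-01 10:33:40 WARNING Disk usage at 85%.
EOF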
Example Output
--- Log Analysis Summary for: sample.log ---
[ERROR] Total occurrences: 4
Top unique ERROR messages:
3 Database connection failed.
1 File not found: /var/www/html/index.php
[WARNING] Total occurrences: 2
Top unique WARNING messages:
1 Memory usage high.
1 Disk usage at 85%.
...
Conclusion
Bash is incredibly powerful for these kinds of "glue" tasks. By combining a few standard Unix utilities, we've created a tool that saves minutes of manual work every time we debug a service.
What are your favorite one-liner Bash tricks for log analysis? Let me know in the comments!
This project was a great exercise in learning Bash best practices and command-line data processing. You can find the full project on my GitHub: https://github.com/alanvarghese-dev/Bash_Scripting