DEV Community

Jaro

Advanced bash scripting with awk

Overview

AWK is an interpreted programming language designed specifically for text processing. Its name is derived from the family names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
https://www.tutorialspoint.com/awk/awk_overview.htm

The version of AWK commonly shipped with GNU/Linux distributions is written and maintained by the Free Software Foundation (FSF); it is usually referred to as GNU AWK (gawk).

AWK is a very powerful tool for:

  • Text processing
  • Producing formatted text reports
  • Performing arithmetic operations
  • Performing string operations, and many more

AWK breaks each line of input into fields, where a field is a string of characters separated by whitespace. The separator is configurable: you can choose a different delimiter with the -F option or the FS variable. AWK is at its strongest when handling structured text such as tables, delimited files, or other data laid out in regular chunks.
Let's start with a simple example that selects individual fields from a string:

echo tomatoe potatoe | awk '{print $1}'
# tomatoe

echo tomatoe potatoe | awk '{print $2}'
# potatoe

awk '{print $3}' "$filename"
# Prints field #3 of file $filename to stdout.

awk '{print $1, $5, $6}' "$filename"
# Prints fields #1, #5, and #6 of file $filename, separated by spaces.
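
The configurable delimiter mentioned earlier is set with the -F option. Here is a minimal sketch, assuming a colon-separated record shaped like a line of /etc/passwd; the user name and the record itself are made up for illustration:

```shell
# Hypothetical colon-separated record, shaped like a line of /etc/passwd.
record='alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# -F sets the input field separator, so fields are split on ":" instead of whitespace.
# The comma in print inserts the output separator (a space by default).
fields=$(printf '%s\n' "$record" | awk -F: '{ print $1, $7 }')
echo "$fields"
# alice /bin/bash
```

The same effect can be had by assigning FS in a BEGIN block, which may read better in longer scripts.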

That is one of the core features of AWK, the print command. So far we have only selected fields by their position. AWK can also match patterns: for example, you can print only the lines of a file that match a condition, such as every request in a web server log that returned HTTP status code 500.

awk '$9 == 500 { print $0}' /var/log/httpd/access.log

Here $9 refers to the field in which the status code appears in the log line.
The part outside the curly brackets is the pattern we are looking for, and the part inside is the action AWK executes for each matching line. You can use the comparison operators familiar from C (==, !=, <, >, <=, >=), as well as the ?: conditional operator.
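
The pattern-and-action structure can be sketched on a few made-up log lines. This assumes the common Apache access-log layout, in which the status code is the ninth whitespace-separated field; the addresses and paths are invented for illustration:

```shell
# Made-up access-log lines; the status code is field 9 in this layout.
log='10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /a HTTP/1.1" 200 512
10.0.0.2 - - [01/Jan/2024:10:00:01 +0000] "GET /b HTTP/1.1" 500 128
10.0.0.3 - - [01/Jan/2024:10:00:02 +0000] "GET /c HTTP/1.1" 503 64'

# Pattern: any 5xx status. Action: print the client address and the status.
errors=$(printf '%s\n' "$log" | awk '$9 >= 500 && $9 <= 599 { print $1, $9 }')
echo "$errors"
# 10.0.0.2 500
# 10.0.0.3 503
```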

If you do not pass a pattern, the action applies to every line. If you give a pattern but no action, every matching line is printed in full.
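
Both defaults can be seen side by side in a short sketch; the fruit data is invented for illustration:

```shell
# Made-up two-column data: a name and a count.
data='apple 3
banana 7
cherry 5'

# Pattern only: lines whose second field is greater than 4 are printed whole.
pattern_only=$(printf '%s\n' "$data" | awk '$2 > 4')

# Action only: the action runs on every line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

echo "$pattern_only"
# banana 7
# cherry 5
echo "$action_only"
# apple
# banana
# cherry
```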

Because AWK is a programming language, it can also perform arithmetic.
Let's say we have a file with a column of numbers and we want the sum of that column. Here the column is simply the first field of each line, so the file looks like this:

1
2
3
4
5

To compute the sum, we accumulate the values in an action block and print the result in an END block:

awk '{total += $1} END {print total}' file.txt

This works like a loop: awk runs the first block once for every line, adding the first field to a variable called total. The variable needs no declaration; it is created on first use and starts out as zero. Once all input has been read, the END block runs and prints the total.
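
The same accumulate-then-report idea extends to an average. This is a small sketch that feeds the five numbers above through a pipe instead of file.txt and uses the built-in NR variable (the number of records read):

```shell
# total grows on every line and is printed once in END.
# NR holds the number of lines read, so total / NR is the average.
summary=$(printf '%s\n' 1 2 3 4 5 | awk '{ total += $1 } END { print total, total / NR }')
echo "$summary"
# 15 3
```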

Now let's put this to more practical use: reading the CPU temperature from the sensors utility and stripping the "+" and "°C" from it.
The temperature is usually obtained by running sensors, but this may vary depending on your distro or OS.

$ sensors
dell_smm-virtual-0
Adapter: Virtual device
Processor Fan: 2706 RPM
CPU:            +44.0°C  
Ambient:        +37.0°C  
SODIMM:         +36.0°C  

$ sensors | grep CPU
CPU:            +44.0°C  

$ sensors | grep CPU | awk '{printf "%d\n", $2}'
44
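
The printf "%d" version works because awk converts the leading numeric part of "+44.0°C" to a number and drops the rest. If you want to keep the decimal, one possible alternative is to strip the unwanted characters explicitly with gsub. This sketch uses the sample line from the output above rather than a live sensors call, and folds the grep into an awk pattern:

```shell
# Sample line copied from the sensors output above.
line='CPU:            +44.0°C  '

# /^CPU:/ replaces the separate grep; gsub deletes every character in
# field 2 that is not a digit or a dot, leaving the bare number.
temp=$(printf '%s\n' "$line" | awk '/^CPU:/ { gsub(/[^0-9.]/, "", $2); print $2 }')
echo "$temp"
# 44.0
```

On a real system the same awk program would read from sensors directly, assuming its output has the same "CPU:" label.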

Now let's add some programming logic in the form of if-else statements. The example below first filters out the header so that only the data rows are passed on, then replaces a column value with a string based on a condition:

$ cat pi_data.txt
time temp wave(ft) comments
---- ---- -------- --------
10:00 24   3       No wind
12:00 26   5       High winds
14:00 25   4       wind calming down

$ # Show time and small or medium for wave size
$ cat pi_data.txt | \
>   awk '{if ( $1 ~ /[0-9]/ ) print $0}' | \
>   awk '{if ($3 < 4) {print $1 "\t small"} else { print $1 "\t medium"} }'
10:00    small
12:00    medium
14:00    medium

A single AWK command to adjust the title and then change the data:

$ cat pi_data.txt | \
>   awk '{if ( $1 ~ /[0-9]/ ) \
>            { \
>               {if ($3 < 4) {print $1 "\t small"} else { print $1 "\t medium"} } \
>        } else { print $1 "\t " $3} \
>        }'
time     wave(ft)
----     --------
10:00    small
12:00    medium
14:00    medium
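
The nested if-else can also be written as two pattern-action rules with the ?: operator. This is one possible restructuring, not the only way to do it; the data is inlined so the sketch is self-contained, and the output uses a bare tab rather than a tab plus a space:

```shell
# Same sample data as pi_data.txt above, inlined for the sketch.
data='time temp wave(ft) comments
---- ---- -------- --------
10:00 24   3       No wind
12:00 26   5       High winds
14:00 25   4       wind calming down'

# Rule 1 handles data rows (first field starts with a digit) and uses ?: to
# pick the label; "next" skips rule 2. Rule 2 prints the header rows.
report=$(printf '%s\n' "$data" | awk '
  $1 ~ /^[0-9]/ { print $1 "\t" ($3 < 4 ? "small" : "medium"); next }
                { print $1 "\t" $3 }')
echo "$report"
```

Splitting the logic into rules keeps each case on its own line, which tends to be easier to extend than nested braces.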

Finally, and perhaps most usefully, AWK can help with killing tasks by extracting process IDs from the output of ps.

#!/bin/bash
#
# stop_task.sh - stop a task
#
task1="edublocks"

echo "Stopping $task1..."
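# List all processes, keep the lines matching the task name,
# extract the PID (field 1), and force-kill each one.
# 1>&- closes standard output for the final command.
ps -e | grep -E "$task1" | \
 awk '{print $1}' | xargs sudo kill -9 1>&-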
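
Because grep matches substrings, the script above would also catch any process whose name merely contains the task name. A more cautious sketch lets awk compare the command name exactly and only prints the PIDs for review. The process list here is made up, in the two-column shape that `ps -e -o pid= -o comm=` produces:

```shell
task="edublocks"   # task name taken from the script above

# Made-up process list in "PID COMMAND" form, for illustration only.
sample='  101 bash
  202 edublocks
  303 edublocks-helper'

# -v passes the shell variable into awk; only exact command-name matches print a PID.
pids=$(printf '%s\n' "$sample" | awk -v name="$task" '$2 == name { print $1 }')
echo "$pids"
# 202
```

On a live system the same filter could read from `ps -e -o pid= -o comm=`, and the printed PIDs can be checked before anything is passed to kill.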

Final comments

I’ve found that learning a little bit of AWK has really paid off.

AWK supports a lot of functionality, and it can be used to create full-on scripting applications with user input, file I/O, math functions, and shell commands. Despite all this, I’ll stick to Python when things get complex.
