Newbie at bash scripting? Here's some advice

Richard Lenkovits ・ 5 min read

When I started using bash for scripting, I couldn't wrap my head around it at first. I'm relatively young, and that was probably the reason behind my confusion. Coming from Python at the time, it felt as if every time I wanted to do something, I needed some special glyph or incantation to make it happen. Bash is powerful and super permissive, and because of that, it is extremely easy to do something stupid with it.

Just to be clear, this blogpost is more about good conventions in bash, not its syntax or how to do basic stuff with it.

This is how you start a bash script

I wanted to talk about the set builtin, but I realized that's not how you start a bash script. This is how:

#!/usr/bin/env bash

This is called a shebang. By making this the first line, you make sure the script runs with the default bash: the first one found in your $PATH environment variable. It's a shortcut, but it's a good convention to follow. (Unless security is a concern and you are afraid that someone will tamper with your $PATH environment variable.)

Now really. This is how you start a bash script

#!/usr/bin/env bash
set -euo pipefail

The set builtin is a pretty complex and useful element in our bash toolkit. Just to make my point I'll oversimplify it a bit, but here's an explanation: set allows you to configure how bash behaves in certain key scenarios.
Let's go through it very briefly:

  • set -e option will cause a bash script to exit immediately when a command fails. This is generally a vast improvement upon the default behavior where the script just ignores the failing command and continues with the next line.
  • set -u causes the bash shell to treat unset variables as an error and exit immediately.
  • set -o pipefail sets the exit code of a pipeline to that of the rightmost command to exit with a non-zero status, or to zero if all commands of the pipeline exit successfully. Actually, set -e is somewhat in vain without this option, as it will not exit the script if a pipeline of commands has a rogue failure somewhere in the middle.
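The pipefail option can be seen in action with a short, self-contained sketch (the result variable is just for illustration):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Without pipefail, a pipeline's exit status is that of its LAST command,
# so `false | true` would count as a success. With pipefail, the failure
# of `false` is reported. (Pipelines used as `if` conditions don't trigger
# the `set -e` exit, so we can inspect the status safely here.)
if false | true; then
    result="missed"
else
    result="caught"
fi
echo "pipefail ${result} the failure"
```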

Ok I did the set thing, and it's killing me. How to deal with it?

Pretty sure you will end up having trouble with set -euo pipefail one way or another. For example, you may decide to make some parameter optional, and then set -u will kick your ass. That's when you do:

if [[ -z "${oxygen:-}" ]]; then
    echo "Houston we have a problem!"
fi

This conditional checks whether the given parameter oxygen is unset (or an empty string). The emphasis is on the :- expression. Without it, set -u would make the script fail, because expanding an unset parameter is treated as an error. With :-, the expansion falls back to an empty string, and we get past the error. By the way, this is also how you add default values: whatever comes after the :- is used as the default if your parameter is unset. Like this: "${oxygen:-nitrogen}"
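A quick sketch of the fallback in action (the variable names are just examples):

```shell
#!/usr/bin/env bash
set -euo pipefail

# oxygen is never set; thanks to :- the expansion falls back to a default
# instead of tripping the `set -u` unset-variable error.
gas="${oxygen:-nitrogen}"
echo "Breathing ${gas}"

# The same trick handles optional positional parameters:
name="${1:-stranger}"
echo "Hello, ${name}"
```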

Another usual problem is when you expect something to fail, or you want to handle the failure explicitly. Naturally, set -eo pipefail will exit your script if anything dares to exit with a non-zero status. To handle this, you have the || : trick:

echo 'Here comes the rough part'
my_fate=$("${dark_cellar}"/russian_roulette.py --load "${bullet}" || :)
echo 'Keep going, whatever'

The expression || : will swallow the error and let your script keep running. (The : builtin does nothing and always exits with 0; || true works just as well.)
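If you need to know that the command failed rather than silently ignoring it, a common variation is to capture the exit status instead. A sketch, with grep (which exits 1 when it finds no match) standing in for the failing command:

```shell
#!/usr/bin/env bash
set -euo pipefail

# grep exits with 1 when it finds nothing, which would normally kill the
# script under set -e. `|| status=$?` keeps the script alive AND remembers
# the exit code for explicit handling.
status=0
grep -q 'needle' /dev/null || status=$?
if [ "${status}" -ne 0 ]; then
    echo "grep found nothing (exit ${status}), carrying on"
fi
```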

I can do stuff. But am I doing it right?

As we said, there are many ways to do something in bash. Luckily the online community is great, and it's easy to find the one-liner that does the trick you want. But is that a safe solution? One thing I learned: there is always a scenario that you did not think about.
It is good when you have a friend who is a Linux pro and catches all your mistakes. But if you don't have one, you can still have a good static analyzer!

Meet shellcheck.

If you use bash, use shellcheck too. It warns you about errors and common bad practices that could have unforeseen consequences. It's easy to set up with apt, just like that:

sudo apt-get update
sudo apt-get install shellcheck

You can simply run it for any of your scripts as shellcheck myscript.sh and it will set you straight!
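As an example of the kind of bug it catches, here is the classic unquoted-expansion mistake (shellcheck flags the unquoted form as SC2086; the exact wording of the message may vary between versions):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Unquoted, ${sentence} is split on whitespace into three separate words;
# quoted, it stays a single argument. shellcheck warns about the former.
sentence="one two three"
unquoted_count=$(printf '%s\n' ${sentence} | wc -l)
quoted_count=$(printf '%s\n' "${sentence}" | wc -l)
echo "unquoted: ${unquoted_count} words, quoted: ${quoted_count}"
```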

Permissive doesn't mean that you shouldn't modularise

If you end up doing DevOps stuff at some company, you will probably notice that every bash script is basically a list of operations in order. Unlike a Python script, where community practice pushes you to create functions and keep to the Single Responsibility Principle. But who said you can't do the same in bash? See:

function get_to_the_choppa() {
    local mate=$1
    echo "${mate}! Get to the choppa!"
}

get_to_the_choppa "Jack"

As usual, there are some scenarios where bash screws you over if you don't know how it behaves, so let's see the most common ones:

  • use local variables. By default, every variable assigned in a function is visible beyond the function's scope. To avoid this, you must explicitly declare your variables local.
  • choose a convention and be consistent. You can declare a function in several ways. Make sure to stick with one.
function take_pills() {
    echo "Get better"
}
function drink_tea {
    echo "Get better"
}
rest_in_bed() {
    echo "Get better"
}
  • you can return with echo. Bash has a return statement, but it can only set the function's exit status (0-255). To get values out of your functions, you could set global variables in them without the local keyword, but better to be explicit: just capture what your function echoes.
function interrogate() {
    echo "It was Hank!"
}
who_to_blame=$(interrogate)
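Putting the last two points together, a sketch combining a local variable with an echoed return value:

```shell
#!/usr/bin/env bash
set -euo pipefail

function interrogate() {
    local suspect="Hank"          # local: not visible outside the function
    echo "It was ${suspect}!"     # the "return value" is whatever we print
    return 0                      # return only sets the exit status (0-255)
}

who_to_blame=$(interrogate)       # command substitution captures the echo
echo "${who_to_blame}"
```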

Get to know Linux

If you are a newbie at bash, chances are Linux is new to you as well. I'm not the best person to explain all the basics of Linux, but I can share an amazing and free resource that I used myself when I started.
If there are still some topics that puzzle you, you can read all about them at linuxjourney. It has simple, easy-to-grasp descriptions of every basic and advanced concept, and a great clean UI that pleases the eye.
This page is so simple and amazing that it could teach a monkey to work with Linux. Definitely check it out!

Lastly, two pieces of advice

Take two pieces of advice from me, and apply them when something doesn't work as expected. Back then I found these steps so important that I wrote them on post-its and stuck them on the bottom of my monitor. Here they are:

Step One: Read that code again.

Something is not working right? Don't panic. I'll tell you what will happen: some time will pass and you will find the mistake you made, probably right under your nose. Think you did everything right, but some components you use are conflicting? Go to step two.

Step Two: It's almost always you.

You may think that some components are not compatible. Or maybe the version you use does not support some use-cases. But usually, no. Usually it was you who messed it up. Go back to step one.

Discussion (6)

Janne "Lietu" Enberg

When writing scripts for automation, and while testing them, it's also a good idea to include x in your set -> set -exuo pipefail

This makes it output the command being executed, so e.g. when variable contents differ from your expectation or similar you can see it from the output, and debugging time is significantly reduced.
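A minimal sketch of what the trace looks like in practice (the trace itself goes to stderr):

```shell
#!/usr/bin/env bash
set -euxo pipefail

# With -x, bash prints each command to stderr (prefixed with +) with all
# variables already expanded, right before executing it.
greeting="hello"
echo "${greeting} world"   # the trace line shows the expanded command
```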

#benaryorg

Modifying the script to make it output debug info is a little… meh.
You wouldn't want to re-compile nginx to get INFO instead of WARNING output in your error log, right? ;-)

The simple way would be having your script not use -x at all, but instead just call it using bash -x your_script.sh, that honours all the options set inside the script and is an on-demand way to print the debug info.

Janne "Lietu" Enberg

People make mistakes, and I'm talking about scripts used in automation.

If your habit is to put the x on all scripts used in automation, when you have an error you can just look at the logs to have a decent idea of what's going on. If you don't, you'll first have to figure out which scripts to even put the bash -x or similar to, make that change, and then trigger another build hoping that it was the right place.

There's quite a big difference between outputting what a bash script is doing in an automation system vs. having a long-running service print its debugging output to your system logs.

#benaryorg

Ah, that makes sense of course.
Didn't figure out which kind of scripts you meant exactly.

I agree that for example a build-system (Travis CI, GitLab CI, buildbot, …) should run most scripts with -x (exceptions exist, most notably those that have lots of pipes and produce rather unreadable debug-output then).
However, if those scripts are the same ones used for e.g. (manually) deploying the application, then arguably the -x output should be suppressed by default since a manually triggered deployment should produce as little output as possible.
The build system can always call the script using bash -x notation if needed then.

There's quite a big difference between outputting what a bash script is doing in an automation system vs. having a long-running service print its debugging output to your system logs.

Depending on your "automation system", a shell script and a long running service should both have proper logging facilities that permit them to log different output to different data-sinks.
A script running in a build-system may then produce artifacts with different log-levels each and only output relevant information in the user-visible output.
I've had to scroll through an eternity of brightly coloured output too often, that was obscuring the real error, often bothersome, sometimes even misleading, because relevant build-information was not properly separated from the tooling's own output (where the tooling equates to your -x).
This holds true for build-systems, linting, and most things running in a CI system.

If you don't plan for the "when something goes wrong"-path, then instead of being a minor nuisance, a failed deployment can take down your website for hours, because your deployment didn't just say "running database migration" and "deploying to first container" followed by "health check failed, aborting" so that everyone knows what to do next (check whether the first container has spun down correctly, eventually rollback the database migration), three sysadmins are busy for an hour going through your monstrous output, trying to find the actual error.
Sure, when they finally find it, they immediately know what error it was that made the health check fail and they can report it properly without consulting another logfile, but your site might have been down for an hour by that time, because the database migration was faulty.

Janne "Lietu" Enberg

I sure hope that

1) you have a way to roll back failing releases instead of having your production down for an hour because you have a typo somewhere in your database migrations or similar
2) you don't build your massive deployment systems incl. migrations etc. with shell scripts

Yes, there are different log levels for different things. But shell scripting is shell scripting: it shouldn't really be used to build massive scripts, and since it's prone to all kinds of mysterious and obscure errors everywhere, you benefit a lot from having -x enabled.

Instead you should use something more akin to e.g. Python (maybe using libraries such as Invoke or Pynt), possibly Ruby/Go/Rust or other such language with tooling that helps you build solid CLI tools.

And then you should use the best tools like Spinnaker, or e.g. Azure DevOps's release pipelines, to orchestrate your releases on the higher level so you don't have to reinvent the wheel while building your release tooling.

#benaryorg

The set -euo pipefail only really works for bash 4.4 and newer; you might want to double-check there.

I myself am pretty fluent in writing safe shell-code, and the most important advice I can give is "quote everything everywhere, anytime".
The remaining pitfalls are arbitrary rules that you cannot really put in generic advice (like echo not being safe to print variables).

My go-to resource would be the one I linked to above: github.com/anordal/shellharden/blo...
It's long, yes, but you decided to go for something that's historically grown when you decided to use the shell, so be ready for long lists of arbitrary rules you have to follow that seem useless or redundant, but make a difference.