Writing Bash scripts can be challenging if you don't know the quirks and perks. In my mother tongue, we use the Yiddish word for quirks and perks; it's called "Shtickim" (plural of "Shtick"). Are you ready to learn more about Bash's "Shtickim"?
This blog post is part of a series that I'm working on to preserve the knowledge for future me that forgets stuff, to assist new colleagues, and indulge programmers like you who wish to love Bash as I do. So let's begin, shall we?
It's A Scripting Language
It's important to remember that Bash is a scripting language, which means it doesn't offer the standard functionalities that a programming language has to offer, such as:
- Object-Oriented Programming is not supported natively
- There are no external libraries like Python's requests or Node's axios, though it is possible to use external applications such as curl
- Variables Typing is not supported, and all values are evaluated as strings. However, it is possible to use numbers by using specific commands, such as test equality with -eq and increment a variable with ((VAR_NAME+1)). Nevertheless, there's a "weak" way of declaring variables type with the declare command.
- Bash's Associative array like Python's dict or JavaScript's Object is supported from version Bash v4.4, and it's important to remember that macOS is shipped with Bash v3.2 (we'll get to that in future blog posts of this series)
- There is no "source of truth" for naming convention. For example, how would you name a global variable?
Pascal_Case
?snake_case
?SCREAMING_SNAKE_CASE
?
As you already guessed, "Bash programmers" (if there is such a thing) face many challenges. The above list is merely the tip of the iceberg.
Here are great blog posts that share the same feelings as I do:
Now that we've covered the fact that I'm in love with Bash, I want to share that feeling with you; here goes.
Variables Naming Convention
Here's how I name variables in my Bash scripts
Type | Scope | Convention |
---|---|---|
Environment | Global | MY_VARIABLE |
Global | Global | _MY_VARIABLE |
Local | Function | my_variable |
In my older Bash scripts, the names of the variables were hard to interpret. Changing to this naming convention helped me a lot to understand the scope of variables and their purpose.
Good Vibes Application
And of course, we gotta' see some practical example, so here's how I implement the above naming convention in my good_vibes.sh
application.
good_vibes.sh
#!/usr/bin/env bash
# ^ This is called a Shebang
# I'll cover it in future blog posts
# Global variables are initialized by Env Vars.
# I'm setting a default value with "${VAR_NAME:-"DEFAULT_VALUE"}"
_USER_NAME="${USER_NAME:-"$USER"}"
_USER_AGE="${USER_AGE:-""}"
complement_name(){
local name="$1"
echo "Wow, ${name}, you have a beautiful name!"
}
complement_age(){
local name="$1"
local age="$2"
if [[ "$age" -gt "30" ]]; then
echo "Seriously ${name}? I thought you were $((age-7))"
else
echo "Such a weird age, are you sure it's a number?"
fi
}
main(){
# The only function that is not "pure"
# This function is tightly coupled to the script
complement_name "$_USER_NAME"
complement_age "$_USER_NAME" "$_USER_AGE"
}
# Invokes the main function
main
good_vibes.sh - Execution and output
export USER_NAME="Julia" USER_AGE="36" && \
bash good_vibes.sh
# Output
Wow, Julia, you have a beautiful name!
Seriously Julia? I thought you were 29
Let's break down the good_vibes.sh
application to a "set of rules" that can be implemented in your scripts.
Code block spacing
Two (2) blank rows between each block of code make the script more readable.
Indentation
I'm using two (2) spaces, though it's totally fine to use four (4) spaces for indentation. Just make sure you're not mixing between the two.
Curly braces
If it's a ${VARIABLE} concatenated with string
, use curly braces as it makes it easier to read.
In case it's a "$LONELY_VARIABLE"
there's no need for that, as it will help you realize faster if it's "lonely" or not.
The primary purpose for curly braces is for performing a Shell Parameter Expansion, as demonstrated in the Global variables initialization part.
Squared brackets
Using double [[ ]]
squared brackets makes it easier to read conditional code blocks. However, do note that using double squared brackets is not supported in Shell sh; instead, you should use single brackets [ ]
.
To demonstrate the readability, here's a "complex" conditional code block:
if [[ "$USER_NAME" = "Julia" || "$USER_NAME" = "Willy" ]] \
&& [[ "$USER_AGE" -gt "30" ]]; then
echo "Easy to read right?"
fi
# Mind that `||` is replaced with `-o`, see https://acloudguru.com/blog/engineering/conditions-in-bash-scripting-if-statements
# Thank you William Pursell
if [ "$USER_NAME" = "Julia" -o "$USER_NAME" = "Willy" ] \
&& [ "$USER_AGE" -gt "30" ]; then
echo "No idea why but I feel lost with single brackets."
fi
In case you didn't notice, you've just learned that ||
stands for OR
and &&
stands for AND
. And the short -gt expression means greater than
when using numbers. Finally, the \
character allows breaking rows in favor of making the code more readable.
Shtick: Using
\
with an extra space\ <- extra space
can lead to weird errors. Make sure there are no trailing spaces after\
.
I assume that using [[ ]]
feels more intuitive since most conditional commands are doubled &&
||
.
Variable initialization
Global variables are initialized with Environment Variables and are set with default values in case of empty Environment variables.
As mentioned in the good_vibes.sh
comments, I'm setting a default value with
"${VAR_NAME:-"DEFAULT_VALUE"}"
In the above snippet, the text DEFAULT_VALUE
is hardcoded, and it's possible to replace it with a variable. For example
_USER_NAME="${USER_NAME:-"$USER"}"
Functions and local function variables
Functions names and local
function variables names are snake_cased
. You might want to change functions names to lowerCamelCase
, and of course, it's your call.
Coupling a function to the script is a common mistake, though I do sin from time to time, and you'll see Global/Environment variables in my functions, but that happens when I know that "this piece of code won't change a lot".
Oh, and make sure you don't use $1
or any other argument directly; always use local var_name="$1"
.
_USER_NAME="${USER_NAME:-"$USER"}"
# Bad - coupled
coupled_username(){
echo "_USER_NAME = ${_USER_NAME}"
}
# Good - decoupled
decoupled_username(){
local name="$1"
echo "name = ${name}"
}
# Usage
coupled_username
decoupled_username "$_USER_NAME"
Functional Programming
This topic relates to Functions and local function variables, where functions are as "pure" as possible. As you can see in good_vibes.sh
, almost everything is wrapped in a function, except for Initializing Global variables.
I don't see the point of writing the init_vars
function, whose purpose is to deal with Global variables. However, I do find myself adding a validate_vars
function from time to time, which goes over the Global variables and validates their values. I'm sure there's room for debate here, so feel free to comment with your thoughts.
Final Words
The "Good Vibes Application" mostly covered how to write a readable Bash script following the Functional Programming paradigm.
If you feel that there's a need to change how you name variables and functions, go for it! As long as it's easy to understand your code, you're on the right track.
The next blog posts in this series will cover the following topics:
- Error handling
- Retrieving JSON data from an HTTP endpoint
- Background jobs and watching file for changes with fswatch
- Git Repository structure - adding Bash scripts to existing repositories or creating a new repository with a Bash CLI application
- Publishing a Bash CLI as a Docker image
And more, and more ... I'm just going to spit it all out to blog posts. Feel free to comment with questions or suggestions for my next blog posts.
Top comments (29)
About the ${} and lonely variables... I disagree with this exception.
Three reasons.
CONSISTENCY!!! To write clean and easy to read scripts one must write consistent code. Consistent style, consistent patterns, consistent tooling. This way a maintainer/reader won't have to switch to "oh, that's a different variable expression. wait, is that dollar sign a part of the thing, or is it to be expanded..? oh, it's expanded. Good. So it's a variable" mode.
READABILITY. Tightly coupled to #1. Moreover, it's A LOT easier to read script and know which parts are dynamic and which aren't when the dynamic ones have clearer visual contrast. The ${} expression introduces 3 characters that make a word look like a variable. $ only has 1 char. It's easier to spot that large clunky ${myvar} than $myvar in the code. Regardless whether the value is lonely, at the EOL/SOL, on the left or spelled in reverse (consistency again!).
SAFETY. Writing
local age=$1
or evenlocal age=${1}
is not the best idea. What if the function was called w/o the parameter? The fn logic will most likely break. I've seen oh too many mistakes like that ending up in production environments going down or even worse -- getting completely destroyed and requiring a full rebuild (means days-weeks of work). Don't do that! Whenever you're accepting and reassigning function parameters, verify whether you've been passed any. One way to do that is[ -n "${1}" ] || return 1
, but that's just tedious and inelegant to write the same block of code in every function. Bash expansion provides a very handy safety switch:local age=${1:?Age not provided}
. Or a fallback to default value:local age=${1:-30}
. This way you can easily control what your variables are. If the value wasn't provided, in first case the function will return 1 and print the error message to stderr "Age not provided". In the second case the fn will continue execution after auto-assigning 30 to the variable age (if ${1} is empty).How does that relate to safety? By writing ${myVar} you will get the habbit of always thinking "do I need to use bash's variable expansion here? Perhaps default values? Or failfast safety switches? Substitution perhaps..?". Back in the days I was using $ I used to forget to check what's inside the $1 and write failfast mechanisms. By using ${1} now my hand automatically adds the :? or :- and makes me think twice. Even if IDC if the value is empty - I always leave it as
local age=${1:-}
just to make it absolutely clear that this function can tollerate missing parameter(s).I think I should write my own post series about shell/bash scripting.....
I would reply to this comment properly, if you hadn't added the last part
Go for it.
If you have a different opinion, please, do share :) I'd like to know your opinion and motivation behind "lonely variables"
Unless newer versions of bash are playing fast and loose with the language, '
if [ "$USER_NAME" = "Julia" || ..' is an error. You could write if test "$USER_NAME" = Julia || ..., but the [ command requires that its final argument be ]. The || is not an argument.
Superb comment, I never use single
[ ]
brackets, so I'm less familiar with how it all works. I'll fix my answer to fit the proper syntax, thanks!So which naming convention shall I use inside a loop?
for _VAR in
? Thank you for the guide!The way I name it
Output
I usually name my iterator as
item
, even if it's aLIST_OF_ROWS
orARRAY_OF_THINGS
I usually pickitem
as my iterator (in any lang).So, as you can see, there's no clear preference, just go with what suits you, and if you find
item
to be good for you, go for itThank you for the reply! Guess a guide on arrays would also be super useful - the example you made right here was more clarifying than many web articles I have read...
I'll definitely cover arrays in my next blog post, thank you for the feedback, much appreciated!
Also, off the top of my mind: getopt for flags and parallel to avoid writing loops in scripts (I’m just being creative)
Would you believe me if I told you that I've never, ever, used
getopts
orparalllel
? 🙈 I Never had the need to ... I found better alternatives that were easier to "memorize".getopts - See bargs; A framework for creating a Bash CLI. You would expect me to use
getops
for the "Usage" menu ... But I haven't, I used other tricks to make it workparallel - I've never processed big files in Bash, if I need to go down this road, I'll probably use Go, where it's more inclined towards parallelism and threading. I do use
background jobs
, but mainly for short processes which don't require "blocks of data". Instead, I usebackground jobs
, which is simply adding&
to a command andwait
in the bottom of the file to wait for it to finish. For example "downloading 5 files in parallel" or "encrypting 10 files in parallel".It's also a matter of time/knowledge; if you know
getopts
andparallel
then use them. If you don't, feel free to pass it, as I haven't found it "mandatory", but that's me.I didn’t know about bargs! That’s precious advice, thank you!
this blog is impressive, makes me learn a lot
Thanks for sharing. Will follow and wait for more.
Kudos, will post the next part soon :)
I'm usually set environment variables for the Bash scripts (or any other executable) like:
$ USER_NAME="Julia" USER_AGE="36" good_vibes.sh
Depends on the context ...
For local usage, sure, that works great.
For CI/CD processes, usually, values/secrets are injected with env vars ...
My mindset - fetch default values from env vars. A matter of approach/opinion I guess
Me too. That's better than exporting them to the env.
sudo comment "Great stuff! Will wait for the next parts!"
You don't need sudo for this. 😉
Thanks! Much appericiated!
Nice! I'm happy to see that somebody agrees that bash is still relevant and is worth to learn.
Will you cover the readonly "qualifier"?
I'm using Bash across most of my projects, so for me it's definitely here to stay :)
Truth be told, I haven't used readonly in any script, ever. I guess it's best practice to create immutable variables with:
Looks cool and easy to implement, I might adopt it in my future scripts.
Really nice article, but some explanation is needed.
Why?
@zoulja Good point!
I prefer using named variables instead of positional arguments. So the logic of my code is based on names instead of indexes, such as
$1
,$2
and so on. This way, even if the order of given arguments is changed, I still maintain my function's logic.And of course, let's learn by example; assuming we create the
greet()
functionNow, I want to change the positional arguments, so the function consumes
age
and thenname
I hope that explains it
Great timing! I was just looking into creating a couple scripts today.
Glad I could help!