Good practices for writing shell scripts
Yoann Bentz (Yoone.eu)
I have seen so many messy shell scripts in my not so long life, and the reasons are always
the same: “I’m the only one to use it”, “It’s meant to be used once”, etc. The problem is that
in practice, those scripts are often reused or dug up months or even years later because
they solved a problem similar to one you are having now. And when you have to understand
a shell script that has been running in production for years and was written by someone else, you will have to go through the famous “dusting” phase, i.e. understanding the code and, for the most courageous, cleaning it up before thinking of editing anything.
In reality, you can almost never expect a developer to provide a test suite for his or her shell scripts. That is because writing them is often a time-sensitive task, and adding tests would lengthen it quite a bit. It can also be because the scripts themselves interact with too many components available in production, which makes them untestable without setting up a whole environment for the sole purpose of testing them.
To try and avoid such issues when maintaining or operating shell scripts, I am going to point out a few “good” practices that I try to follow myself. I obviously can’t make everybody agree on all of them, but I believe they could ease the work of a lot of developers if they were followed.
This article was written with the help of Vincent Charpentier who contributed by giving me
some precious advice on shebangs and set.
Shebang
When adding a shebang, you might be tempted to write the absolute path of the shell binary you want. I would advise against it, in favor of letting env find it for you, because the binary could be located somewhere else on another operating system your script might be executed on.
Last but not least, don’t just pick the most feature-rich shell installed on your OS. Keep in mind that while most systems have sh and bash by default, they might not have zsh or any other shell you might want to use. Start with sh and switch to bash if you need some more “advanced” or non-POSIX syntax.
#!/usr/bin/env sh
# ... to start

#!/usr/bin/env bash
# ... if sh is not enough

#!/usr/bin/env zsh
# ... what, isn't bash enough? :)
On an important note, don’t pass arguments to your shell in the shebang. It will result in undesired behavior: the kernel splits the shebang line only once, so everything following the first whitespace is passed to the interpreter as a single argument. Here is some commented code with an example and the solution to the problem it causes (the -eu options below are just an illustration):
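#!/usr/bin/env bash -eu
# Broken: the kernel passes everything after "env" as one single
# argument, so env looks for a program literally named "bash -eu"
# and fails with "No such file or directory".

#!/usr/bin/env bash
# Fixed: keep the shebang clean and enable the options in the body.
set -eu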
You now need to name your file. I don’t have a fixed rule as to which extension should be
used, as long as there is one and it makes sense. You could use .bash or .zsh depending
on which shell is referenced in the shebang, or simply a generic .sh for all your scripts.
Adding such an extension allows whoever is going to see your scripts to distinguish between
them and ordinary binaries. It will also allow you to search for them easily with a command such as find . -name '*.sh'.
Arguments
You can always handle arguments very easily, with a piece of code like this:
#!/usr/bin/env sh
SOME_PARAM="$1"
OTHER_PARAM="$2"
However, positional parameters force the caller to remember their exact order. I will instead suggest using named arguments, handled by a case/esac. The following example shows how to do it, and it is pretty straightforward:
#!/usr/bin/env sh

# Default values for blank parameters
DEBUG=0
IN_FILE=/etc/some-input-file.conf
OUT_FILE=/var/log/some-output-file.log

# Option parser, the order doesn't matter
while [ $# -gt 0 ]; do
    case "$1" in
        -i|--input)
            IN_FILE="$2"
            shift 2
            ;;
        -o|--output)
            OUT_FILE="$2"
            shift 2
            ;;
        --debug) # Argument acting as a simple flag
            DEBUG=1
            shift 1
            ;;
        *)
            break
            ;;
    esac
done

# Some simple argument checks
wrong_arg() {
    echo "Error: invalid value for $1" >&2
    exit 2
}
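Such a helper can then back a few sanity checks right after the parsing loop. A possible use (these exact checks are my assumption, not part of the original script):

# Hypothetical checks built on the helper above
[ -r "$IN_FILE" ] || wrong_arg "--input"
[ -n "$OUT_FILE" ] || wrong_arg "--output"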
I don’t think having a “usage” message is necessary with such a shell script, as the option parser makes the use of the script self-explanatory.
Use functions!
What I am about to say is true for all programming languages, but people tend to forget it
when it comes to writing shell scripts.
Using functions to clarify your code or to avoid duplicating too many instructions is always good. It eases the maintenance and update process, and it helps you understand the code better. With Donald Knuth’s words in mind, I don’t think you could go wrong by adding a few functions to your scripts instead of copy-pasting code.
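As a small, hypothetical illustration, repeated logging lines are a natural candidate for a function:

#!/usr/bin/env sh

# Hypothetical helper: the log format now lives in a single place
log() {
    echo "`date '+%H:%M:%S'` $1" >&2
}

log "Starting cleanup"
log "Cleanup finished"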
This might take you back all the way to C89, but I also like to declare my variables at the beginning of the script or function, depending on the scope I am in. Loop variables are excluded from that “rule” because declaring them away from their loop would defeat the purpose of making the code easier to read and understand.
Don’t use too many temporary variables if you don’t need to. That brings me to the next topic: subshells. Use them as much as possible, combine them with pipes, and you will shed the “bad” habit of relying on temporary variables. I will take this opportunity to give a quick reminder about subshells: I have often seen the backtick syntax used for command substitution, but backticks have to be saved into temporary variables as soon as substitutions need to be nested. There is another syntax, which I use more frequently than backticks: $(command). It supports nesting and is, in my opinion, more readable. Avoid using this syntax with sh, though, as it was initially not supported by the traditional Bourne shell.
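A contrived example of the difference when nesting:

#!/usr/bin/env bash

# Backticks must be escaped (or split into temporary variables) to nest:
DIR_NAME=`basename \`pwd\``

# $(...) nests without any escaping:
DIR_NAME=$(basename $(pwd))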
#!/usr/bin/env bash

# Code showing two methods for calculating the total size of all GIF
# files found recursively from the current directory.

# Method 1: Without intelligent use of subshells or piping
TMP_FILE=gifs.tmp

find . -name '*.gif' > $TMP_FILE
GIFS_TOTAL_1=0
while read l; do
    SIZE=`stat -c%s $l`
    GIFS_TOTAL_1=$(($GIFS_TOTAL_1 + $SIZE))
done < $TMP_FILE
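The second method replaces the temporary file with a single pipeline. Here is a minimal sketch of what it could look like (the exact commands are my assumption):

# Method 2: With subshells and pipes (one possible pipeline)
GIFS_TOTAL_2=$(find . -name '*.gif' -exec stat -c%s {} + |
    awk '{total += $1} END {print total}')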
The second example is, in my opinion, easier to understand, because fewer commands are executed. It is also less likely to cause performance issues: it opens fewer file descriptors than the code in the first example, and it does not use the disk to store temporary data.
Finally, a few set options can make your scripts stricter and therefore safer. I like to enable them right after the shebang:

#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'
set -e makes the script exit if a command fails (exit code different from 0). It can help find invisible errors before the script is used in a production environment.

set -u makes the script exit if a referenced variable is not declared. Pretty useful, since an undeclared variable can be mistaken for an empty one.

set -o pipefail affects pipes, once again to avoid invisible errors: if one command in a pipeline fails, its exit code will be returned as the result of the whole pipeline. As you can imagine, it makes debugging a lot easier when combined with -e.
N.B. This option is not recognized by the Bourne shell.
IFS stands for Internal Field Separator. Each character it contains will be used as a delimiter when splitting a string (in a for loop, for instance).
N.B. The default IFS for zsh is $' \n\t\0'.
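A quick illustration of the effect (hypothetical values):

#!/usr/bin/env bash

CSV="a,b,c"

# With the default IFS, the unquoted expansion stays in one piece:
for field in $CSV; do echo "$field"; done   # prints: a,b,c

# With a comma in IFS, the same expansion splits into three fields:
IFS=','
for field in $CSV; do echo "$field"; done   # prints: a, b, c (one per line)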
The unofficial Bash strict mode is described in more depth, with straightforward examples, by Aaron Maxwell on his blog.
N.B. Although set can be used in most cases, setopt is the zsh equivalent. See the zsh
documentation for more information.
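For instance, the options discussed above map to zsh roughly like this (option names taken from the zsh documentation; treat it as a sketch):

#!/usr/bin/env zsh
# Approximate zsh equivalents of set -e, set -u and set -o pipefail
setopt ERR_EXIT NO_UNSET PIPE_FAIL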