Writing Bash-Scripts


There's strictly no warranty for the correctness of this text. You use any of the information provided here at your own risk. The terms of usage and/or copying of this text are determined by the GNU Free Documentation License.




1. About Bash

Shells are command-line-interpreters.

Bash, the "GNU Bourne Again Shell", is an extended open-source-version of the shell "sh".

"sh" was developed by Stephen Bourne in the 1970's for the operating-system "Unix".
"bash" was developed by Brian Fox for the "Free Software Foundation" in about 1989.

On Linux, "bash" is probably the most commonly used shell, although there are others too, like "sh" (mentioned above), "ksh", "csh" and "tcsh".

In shell-programming, data-processing is often done by handing the output of one command over to another command.
So you could say, shell-commands are often used like tools of a tool-kit.
This is a bit different from other programming-languages, but on the other hand there are also things like control-structures in bash.

Bash-scripts are mainly used for automating tasks of the operating-system or for installing programs. Their disadvantages are that they run rather slowly, that they are not very portable, and that their code can be quite difficult for human beings to read.


2. Shell-Basics

As a command-line-interpreter, the shell looks for a command and its options to evaluate.

As in BASIC or DOS, the commands are English words that are shortened so that they are faster to type.
The Linux-commands are quite different from those of DOS, although they often do the same thing.

A Bash-command would be for example

echo "Hello World"

Unlike DOS, Bash is always case sensitive.

If you type TAB, Bash tries to complete user-input automatically.

Using the cursor-keys, you can browse through previously typed user-input.


3. Redirection of Command-Output into Files and Pipes

The operating-system provides three data-streams called stdin, stdout and stderr, that can be used by programs.

stdin is the standard input stream. By default it is connected to the keyboard. So when a program tries to read data from this stream, it waits for keyboard-input.

stdout, the standard output stream, is connected by default with the screen. When a program writes data to this stream, it is usually displayed on the screen. That's how "echo" works: It sends its argument ("Hello World" above) to stdout.

stderr is used for error messages. By default it is connected with the screen too.

In Bash you can send the output of a command to a file instead of to the screen.
To do that, you use the characters ">" and ">>".

So if you do:

echo "Hello World" > hello.txt

nothing is printed to the screen. Instead the output can be found in a file called "hello.txt" in the current directory.

echo "Hello World" >> hello.txt

appends "Hello World" once more to the file. With

cat hello.txt

you can view the contents of the file "hello.txt".

You have to use output-redirection with ">" carefully, because if a file with the name of the destination-file already exists, it is overwritten with the redirected command-output without warning.
">>" is less dangerous, as it just appends something to the file.

Output of commands can not only be redirected into a file but also into the input stream used by another command.
This is done by using the character "|" (on a German keyboard: "AltGr + <"). Here's an example:

ls

shows the contents of the current directory on the screen using stdout. If there are many files in the directory, the output of "ls" scrolls by too fast to read. With

ls | less

you can view the output of "ls" with the tool "less" screen by screen.

So redirecting a command's output to another command using "|" (which is called creating a "pipe") makes the second command process the data it is handed.
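
A pipe can also connect more than two commands. The following is only a small sketch with made-up sample words: "printf" produces three lines, "sort" puts them in alphabetical order and "uniq" removes the repeated line.

```shell
#!/bin/bash

# Each "|" hands stdout of the command on its left
# to stdin of the command on its right.
printf 'pear\napple\npear\n' | sort | uniq
```

Each command in the chain only sees what the command before it has written to stdout.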

By the way: Output-redirection with ">", ">>" and "|" can be done on DOS too.


4. Writing and Executing Bash-Scripts

You can write your own Bash-scripts in a text-editor (like "vim", "emacs", "kate", "gedit", "joe" etc.). They are plain-text-files containing one or more Bash-commands.
To be able to execute a script directly on Linux, you have to make it executable first. To do that, you need the required user-rights. If you've got them, you can make a script called "script" executable with

chmod +x script

After that you can run the script by handing it to Bash (this works even if the script has not been made executable):

bash script

or you can run it without explicitly invoking "bash" once more. If the script is in the current directory, you can run it just with "script", provided the current directory is listed in the system's "$PATH"-variable. If it is not, you have to tell the shell explicitly that you are referring to the current directory. This directory is symbolized by ".".
So a Bash-script in the current directory can be executed with:

./script

You can also put your scripts into the directory:

/usr/local/bin

This directory is listed in the "$PATH"-variable by default on most systems. Because of that, scripts placed there can be executed from every directory, just like any other shell-command.

It is useful to mention in the script, that it contains code that should be executed with Bash. To do that, you write as the first line of the script:

#!/bin/bash

This first line of a script is called "shebang" (or "sh'bang"): "sh" stands for "sharp" ("#"), "bang" refers to the exclamation mark.

So, if for example you create a script

#!/bin/bash
echo "Hello World"

call it "mine", set the user-rights to "executable" ("chmod +x mine") and move it to "/usr/local/bin" ("mv ./mine /usr/local/bin" as root), you can just enter "mine" in a terminal from any directory to execute the script.
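
The steps can also be tried out without root-rights. The following sketch makes some assumptions: a temporary directory created with "mktemp -d" stands in for "/usr/local/bin", and the variable names "dir" and "output" are chosen freely. It also uses variables and "$()", which are explained further on in this text.

```shell
#!/bin/bash

# Assumption: a throwaway directory stands in for /usr/local/bin.
dir=$(mktemp -d)

# Write the two-line script into the file "mine".
echo '#!/bin/bash' > "$dir/mine"
echo 'echo "Hello World"' >> "$dir/mine"

# Make it executable and run it by its path.
chmod +x "$dir/mine"
output=$("$dir/mine")
echo "$output"

# Remove the temporary directory again.
rm -r "$dir"
```

If the temporary directory lies on a file-system that forbids executing files, the call fails; in that case another directory has to be used.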


5. Expansion of the Command-Line by the Shell

As Bash is a command-line-interpreter, it looks for a command and its options.
But before a command is executed, Bash checks the input for certain special characters like "*" in

cp * /home/user

and expands them. This expansion also happens with each line of code in shell-scripts.
Therefore you sometimes have to consider what a line of code looks like after the shell has expanded it, right before it is executed.

The options of shell-commands are usually separated by space characters, like in "ls -l -i -s -a" for example.
So if you use expressions that the shell expands to command-options, you have to make sure that the expansion does not produce unwanted space characters, which would be interpreted as separators between options.

If you put strings in single quotation marks like in

echo 'Hello *'

Bash interprets the expression as a single string in spite of the space character. Besides that, no special characters are expanded inside single quotation marks (like "*" in the example).

If you put strings in double quotation marks like in

echo "Hello *"

the expression is again interpreted as a single string. Most special characters aren't expanded, but some are, especially "$", so that, for example, the value of a variable "$a" in

a=World; echo "Hello $a"

(variables are explained soon) becomes part of the string.
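
The difference between the two kinds of quotation marks can be seen side by side in this small sketch. The variable names "single" and "double" are only chosen for illustration.

```shell
#!/bin/bash

a="World"

# Single quotation marks: nothing is expanded, "$a" stays as typed.
single='Hello $a'
echo "$single"

# Double quotation marks: "$a" is replaced by the value of the variable.
double="Hello $a"
echo "$double"
```

The first "echo" prints the text with a literal "$a", the second one prints "Hello World".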


6. Variables

The scalar variables in Bash are usually strings. They are defined simply by assigning a value to a variable name. To access their values later, a "$" has to be written in front of the name. The shell then expands the variable name to the value of the variable:

#!/bin/bash

a="Hello World !"
echo $a

This variable-expansion is done in every single line of the script.

In the line a="Hello World !"

there must not be a space character between "a", "=" and the following word. The reason is that in Bash space characters separate commands from their options (like for example in "ls -l"). Written with space characters, the line would be interpreted as a command "a" with the options "=" and "Hello World !". Only if the equals sign is written between the variable name and the value without any space characters can the shell recognize the expression as a variable assignment.


7. Assignment of Command-Output to Variables by Expansion

As explained above, with "|" the output of one command can be handed to another command for further processing through that command's input stream ("pipe").

But how do you assign the output of a command to a variable, as variables don't use data input streams?

To do that, you put the command in so-called "backticks" (`) (on a German keyboard: "Shift" and the key to the right of "?") or into "$()", like in:

$(command)

Then the command is executed and its output is expanded to an expression within the command-line. This expression can then be assigned to a variable:

#!/bin/bash

a=$(ls -la)
echo $a

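It is often useful to put the expanded expression in double quotation marks once more, as in: echo "$a". This makes clear that it is a single expression and not a series of single words or numbers, although there may be space characters in it. Without the quotation marks, the line breaks in the output of "ls -la" are lost and everything is printed on one line.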
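
The effect of the quotation marks can be tried out with a short sketch. Here "printf" with two made-up lines stands in for any command that prints several lines.

```shell
#!/bin/bash

# Assumption: printf stands in for a command with multi-line output.
a=$(printf 'one\ntwo\n')

# Unquoted: the shell splits the value into single words, and
# echo joins them with spaces, so the line break is lost.
echo $a

# Quoted: the value is handed over as one string, line break included.
echo "$a"
```

The first "echo" prints both words on one line, the second one prints them on two lines.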


8. for-Loops in "C-Style"

In BASIC you can do the following:

10 FOR i=1 TO 10
20 PRINT i
30 NEXT i

You can do this in Bash too:

#!/bin/bash

for ((i=1; i<=10; i++))
do
    echo $i
done

for-loops can be created (nearly like in the programming-language C) by writing three things between double round brackets, separated by ";":

  1. A start value that is assigned to the loop-variable.
  2. A condition. The loop runs as long as the condition is true.
  3. The way in which the loop-variable changes after each pass of the loop.
    "i++" is short for "i=i+1" here.

Then, the commands to be executed within the loop are written between the lines "do" and "done".

Indenting the commands in command-blocks (for example by four space characters) is not required in Bash (unlike in Python), but it is recommended, because it makes the code more readable.

The command "break" inside the command-block makes the script exit the loop prematurely. The execution of the script then continues right after the loop. "break 2" exits two nested loops at once.

The command "continue" inside the command-block starts the next pass of the loop prematurely, so the commands after "continue" are skipped for the current pass.
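
How "break" and "continue" work together can be sketched as follows. The function name "odd_below_seven" is made up for this example; if-conditions and functions are described in later sections of this text.

```shell
#!/bin/bash

# Hypothetical helper: prints the odd numbers below 7.
odd_below_seven ()
{
    for ((i=1; i<=10; i++))
    do
        # Leave the whole loop as soon as i reaches 7.
        if [ $i -eq 7 ]
        then
            break
        fi
        # Skip even numbers: go on with the next pass of the loop.
        if [ $((i % 2)) -eq 0 ]
        then
            continue
        fi
        echo $i
    done
}

odd_below_seven
```

Although the loop is written to count up to 10, only 1, 3 and 5 are printed.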


9. for-Loops in "Python-Style"

In Bash, there are also for-loops, in which the loop-variable represents one of several arguments after the word "in":

#!/bin/bash

a="This is a line of text."
for i in $a 
do
    echo $i
done

Please think about the following for a moment: In the for-line, $a is expanded to several single words without the quotation marks. In each loop, $i represents one of these words. If you do instead

for i in "This is a line of text."

$i represents the whole string, so the script only goes through the loop once.


10. while-Loops. "a = a + 1"

Besides for-loops, Bash also provides while-loops, with which the "1-10"-script could also have been written:

#!/bin/bash

i=1

while test $i -le 10
do
    echo $i
    let "i += 1"
done

The condition for the while-loop is created with the "test"-command; please see "man test" for details.

To do "i = i + 1", the script above uses the line

let "i += 1"

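The quotation marks are needed because the expression contains space characters: Without them, Bash would hand "i", "+=" and "1" to "let" as three separate arguments, which leads to an error.
Alternatively, this command works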

i=$(expr $i + 1)

but the "let"-command runs much faster (as it's a "Bash builtin").

If you want to execute the while-loop-script interactively in the shell, all commands have to be separated by ";". But there mustn't be a ";" between "do" and the following command:

i=1; while test $i -le 10; do echo $i; let "i += 1"; done
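
Besides "let" and "expr", Bash can also calculate with "$(( ))", the so-called arithmetic expansion, which is built into the shell as well. As a small sketch, this script adds up the numbers from 1 to 10; the variable name "sum" is chosen freely.

```shell
#!/bin/bash

i=1
sum=0

while test $i -le 10
do
    # $(( )) is replaced by the result of the calculation inside it.
    sum=$((sum + i))
    i=$((i + 1))
done

echo "$sum"
```

Inside "$(( ))" the variables can be written without "$", and space characters may be placed freely.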


11. if-Conditions

BASIC:

10 LET a=1
20 IF a=1 THEN PRINT "a=1"
30 IF a<>2 THEN PRINT "a is not 2."
40 IF a=2 THEN PRINT "a=2" ELSE PRINT "a is not 2."
50 LET b=2
60 IF a=1 AND b=2 THEN PRINT "a=1, b=2"
70 IF a=1 OR b=2 THEN PRINT "a=1 or b=2."

Bash:

#!/bin/bash

a=1
if [ $a -eq 1 ];
then
    echo "a=1"
fi
if [ $a -ne 2 ];
then
    echo "a is not 2."
fi
if [ $a -eq 2 ];
then
    echo "a=2."
else
    echo "a is not 2."
fi
b=2
if [ $a -eq 1 ] && [ $b -eq 2 ];
then
    echo "a=1, b=2."
fi
if [ $a -eq 1 ] || [ $b -eq 2 ];
then
    echo "a=1 or b=2."
fi

As you can see, if-conditions are written like:

if <test-command with expression>; then <commands>; else <other commands>; fi

The expression

[  ]

is short for the "test"-command (please see "man test"). The space characters left and right inside the brackets are required.

Besides "else" there is also "elif" (for "else if").
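
A condition with "elif" could look like the following sketch. The function name "sign_of" is made up for this example (functions are described in section 17); the number to be checked is handed over as "$1".

```shell
#!/bin/bash

# Hypothetical helper: reports the sign of the number given as $1.
sign_of ()
{
    if [ "$1" -lt 0 ]
    then
        echo "negative"
    elif [ "$1" -eq 0 ]
    then
        echo "zero"
    else
        echo "positive"
    fi
}

sign_of -3
sign_of 0
sign_of 7
```

The "elif"-test is only done if the first test has failed, and the "else"-part only runs if both tests have failed.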


12. Input

BASIC:

10 LET a=0
20 INPUT a
30 PRINT a

Bash:

#!/bin/bash

a=0
read a
echo $a

Just as "echo" writes a line of data to stdout, "read" reads a line of data from stdin.


13. Example-Script "Cookie-Monster"

Now that loops, input and conditions have been described, these programming-techniques can be combined.

Please try to find out, what the script does and how it does it:

#!/bin/bash

cookies=""
while test -z $cookies || test $cookies != "COOKIES"
do
    echo -n "I want COOKIES: "
    read cookies
done
echo "Mmmm. COOKIES."

The "||" in the while-line represents logical OR. The second test in the while-line is only done when the first test ($cookies is "") has not already succeeded.

The first test is necessary because, if $cookies is "", it is expanded to nothing, so that the "test"-command doesn't get a suitable argument for tests other than "-z" or "-n", for example for "!="-tests.
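
If $cookies is written in double quotation marks, an empty value is handed to "test" as an empty string instead of nothing, so a single test is enough. The following is a sketch of such a variant, not the script above. As assumptions, the loop is wrapped in a made-up function "wait_for_cookies", the prompt is left out, and the loop also ends when there is no more input, so that it can be fed through a pipe.

```shell
#!/bin/bash

# Hypothetical variant of the loop with "$cookies" in double quotes.
wait_for_cookies ()
{
    local cookies=""
    while [ "$cookies" != "COOKIES" ]
    do
        # Give up with an error if there is nothing left to read.
        read cookies || return 1
    done
    echo "Mmmm. COOKIES."
}

# Sample input: a wrong word, an empty line, two words, the right word.
printf 'bread\n\nchocolate COOKIES\nCOOKIES\n' | wait_for_cookies
```

With the quotation marks, neither the empty line nor the input with a space character in it confuses the "[ ]"-test.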


14. Processing Script-Options - Positional Parameters

A script can be called with options like in

./script -a

Options like "-a" are stored in the special variables "$1", "$2", "$3" and so on, which can then be processed inside the script.
The variable "$@" expands to all of those options. If it is written in double quotation marks, each option stays a separate word, even if it contains space characters.
The number of options passed to the script can be found in the variable "$#".

So in the example above, inside the script the variable "$1" contains the expression "-a".
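
These variables can be tried out with a short sketch. As an assumption, a made-up function "show_args" is used instead of a separate script file; a function sees its own arguments in "$1", "$2", "$#" and "$@" in the same way (functions are described in section 17).

```shell
#!/bin/bash

# Hypothetical helper: prints the number of its arguments
# and then each argument on a line of its own.
show_args ()
{
    echo "count: $#"
    # "$@" in double quotation marks keeps each argument as
    # one word, even if it contains space characters.
    for arg in "$@"
    do
        echo "arg: $arg"
    done
}

show_args -a "two words"
```

The call passes two arguments, so the loop runs twice, although there are three words on the command-line.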


15. String-Manipulation

Manipulation of strings is less convenient in Bash than in other programming-languages.

If you have a variable $a, the number of characters of its value can be found out with "${#a}".

"${a:2:3}" delivers a substring of 3 characters, starting at position 2 of variable $a, where the first character has position 0. "${a:2}" delivers all characters from position 2 to the end.

a=${a/from/to}

replaces the first occurrence of "from" in the variable $a with "to".

a=${a//from/to}

replaces every occurrence of "from" in the variable $a with "to".
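
Applied to a sample value, the expressions from this section can be sketched like this. The value "Hello World" and the replacement of "o" by "0" are only chosen for illustration.

```shell
#!/bin/bash

a="Hello World"

echo "${#a}"        # number of characters: 11
echo "${a:2:3}"     # 3 characters from position 2: llo
echo "${a:2}"       # from position 2 to the end: llo World
echo "${a/o/0}"     # first "o" replaced: Hell0 World
echo "${a//o/0}"    # every "o" replaced: Hell0 W0rld
```

None of these expressions changes $a itself; the result has to be assigned to a variable to be kept.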


16. Arrays

Traditionally, arrays (field variables) are rarely used in shell-scripts, although they are supported in newer versions of Bash.

Several array-elements can be assigned at once, but values can be assigned directly to array-positions too.
The first array-position starts at 0 (like in most other programming languages).
The number of array-elements can be found out. Iteration over arrays is possible:

#!/bin/bash

arr=("apple" "pear" "peach")

arr[3]="banana"
arr[4]="cherry"

echo
echo "The array has ${#arr[@]} elements:"
echo
 
for ((i=0; i<${#arr[@]}; i++))
do
    echo ${arr[$i]}
done

echo
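
Instead of counting positions, a loop can also go through the array-elements directly with a for-loop in "Python-Style". The following is a small sketch of that; the loop-variable "fruit" is chosen freely, and "+=" is used to append an element at the end of the array.

```shell
#!/bin/bash

arr=("apple" "pear" "peach")

# Append one more element at the end of the array.
arr+=("banana")

# "${arr[@]}" in double quotation marks expands to all
# elements, with each element as one word.
for fruit in "${arr[@]}"
do
    echo "$fruit"
done
```

This form also works if an element contains space characters, because of the double quotation marks.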


17. Functions

For more complex scripts Bash also provides functions.
Functions are independent code-parts, that can receive data as arguments and return results. With functions, large tasks can be split into many small subtasks. Here's an example:

#!/bin/bash

selfdefinedfunction ()
{
    echo $1
}

selfdefinedfunction "Hello" 

When "selfdefinedfunction" is called with the argument "Hello", this argument is automatically assigned to the variable "$1" inside the function.

Local variables inside the function can be defined with the "local"-command.

Subprocesses invoked by the shell and functions can return a value, which can then be accessed through the special variable "$?". Usually this mechanism is used by programs to report whether they completed successfully (value 0) or with an error (any other value).
Bash-functions can use this method to return results too, but only whole numbers from 0 to 255 are possible:

#!/bin/bash

selfdefinedfunction ()
{
    echo $1
    return 5
}

selfdefinedfunction "Hello" 
echo $?
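
As "$?" can only hold a small number, results of other kinds are often handed back in a different way: The function prints the result with "echo", and the caller catches the output with "$()". A minimal sketch, with a made-up function "add":

```shell
#!/bin/bash

# Hypothetical helper: adds two numbers and prints the result.
add ()
{
    # "local" keeps the variable "sum" inside the function.
    local sum=$(( $1 + $2 ))
    echo "$sum"
}

# The output of the function is assigned to a variable.
result=$(add 2 3)
echo "$result"
```

This way the result can also be a string or a number larger than 255.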


18. Redirection of Data-Streams. Suppressing Error-Messages

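Sometimes you want to prevent a program from printing, for example, its error-messages to the screen.
This can be done by redirecting the data-stream stderr. stderr has the number 2, so it is redirected with "2>", just like stdout is redirected with ">".

Output can also be redirected to "/dev/null", that is into nothingness. With "&>", both stdout and stderr are redirected:

echo "gone" &>/dev/null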
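
To suppress only the error-messages, stderr alone can be sent to "/dev/null". In the following sketch a made-up function "two_streams" stands in for a program that writes to both streams.

```shell
#!/bin/bash

# Hypothetical helper that writes one line to each output stream.
# ">&2" sends the output of the second echo to stderr.
two_streams ()
{
    echo "to stdout"
    echo "to stderr" >&2
}

# Only stderr is discarded; "to stdout" is still printed.
two_streams 2>/dev/null

# Both streams are discarded; nothing is printed.
two_streams &>/dev/null
```

The normal output stays visible in the first call, which is usually what is wanted when only the error-messages are in the way.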


19. Processing of Data in Text-Files: awk, grep, sed; Support by Perl

In general, Bash is designed rather for dealing with whole files and directories than for processing data in text-files.

The command "grep" can be used to filter certain lines from text-files or command-output.
If "grep" is called with a search-string and a file-name, the file is searched for the string and the lines found are written to standard output. So

grep -i word file.txt

searches "file.txt" for lines containing "word" (the search is not case sensitive here, because of the given "-i"-option to "grep") and writes them to standard output.
"grep" is also often used in a pipe like in "cat file.txt | grep word" or "find | grep html".

After some lines have been found by "grep", they often need to be split, because only certain parts of the line are of interest. This can be done using "awk":

Let's say a text-file "file.txt" contains the line:

One;Two;Three;Four

Then this line would be part of the output of:

grep Two file.txt

To split the line, "awk" needs three things: an input, the character at which the line shall be split, and what it should do with the result. In the example this command prints the part "One" of the line:

grep Two file.txt | awk -F ";" '{print $1}'

So the split-character is given to "awk" with its "-F"-option and the special "awk"-commands for treating the line-part are put in curly brackets, that are themselves wrapped into single quotation marks.
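
Other parts of the line can be selected in the same way. In this sketch the line is taken from a variable instead of a file, so that it runs without "file.txt"; the variable names are chosen freely. "NF" is a variable of "awk" that holds the number of parts of the current line.

```shell
#!/bin/bash

line="One;Two;Three;Four"

# $3 is the third part of the line. NF is the number of
# parts, so $NF is the last part.
third=$(echo "$line" | awk -F ";" '{print $3}')
last=$(echo "$line" | awk -F ";" '{print $NF}')

echo "$third"
echo "$last"
```

Because of the single quotation marks, "$3" and "$NF" are not expanded by Bash but handed to "awk" unchanged.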

In principle, the stream-editor "sed" can be used to edit the contents of text-files automatically.

But for anything more complex this is rather inconvenient. That's why the programming-language Perl is often used for such operations instead.
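
For a simple replacement in a pipe, "sed" is still short enough. A minimal sketch, in which the variable name "changed" is chosen freely:

```shell
#!/bin/bash

# "s/World/Planet/g" replaces every "World" with "Planet"
# in each line that sed reads from stdin.
changed=$(echo "Hello World" | sed 's/World/Planet/g')
echo "$changed"
```

Used like this, "sed" changes only the data in the pipe; no file is modified.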

Small Perl-Scripts, that fit into the command-line, can be called from Bash-Scripts. So

perl -e 'print "Hello World.\n";'

prints text using Perl. But typically in Bash-scripts the output of a shell-command is piped into a Perl-command:

echo "Hello World" | perl -e 'while(<>){$_ =~ s/World/Planet/g; print $_;}'

In this example the output of "echo" is changed by Perl and then printed.
The Perl-construct "while(<>){}" reads lines from stdin as long as it is fed with them.

With Perl, the "awk"-example with "One" and "Two" above would look for example like this:

grep Two file.txt | perl -e 'while(<>){@a = split(";"); print $a[0]."\n"}'


20. Iteration over Command-Output or Variables with Multiple Lines

In Bash, the new-line-character is created by expansion of $'\n'. So you can have a variable containing more than one line like this:

#!/bin/bash

a="line1"$'\n'"Hello World"$'\n'"line3"
echo "$a"

But if you want to iterate over such a variable, there is a problem. It can't be done like this

#!/bin/bash

a="line1"$'\n'"Hello World"$'\n'"line3"
for i in $a 
do
    echo $i
done

because the space character between "Hello" and "World" splits that line into two words, so the script goes through the loop one unwanted extra time.

It doesn't help to use quotation marks either like in

#!/bin/bash

a="line1"$'\n'"Hello World"$'\n'"line3"
for i in "$a"
do
    echo "$i"
done
echo "$i"

because then the whole value of $a is assigned immediately to $i, so the script goes through the loop only once.
You can see this, when $i is echoed once more after the loop.

So to iterate over each line of $a, a construction like this may be used:

#!/bin/bash

a="line1"$'\n'"Hello World"$'\n'"line3"
echo "$a" | while read i
do
    echo $i
done

The value of $a is piped into a "while"-loop. In the loop each line of $a is assigned to $i using the "read"-command. The loop runs, until there aren't any lines left to be read from $a.
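
A plain "read" removes space characters at the beginning and the end of each line and treats backslashes specially. If the lines shall be kept exactly as they are, "read" is often written as "IFS= read -r". The following sketch shows this with made-up sample lines; "printf" is used instead of "echo" here so that the backslash is printed unchanged.

```shell
#!/bin/bash

a="  indented"$'\n'"back\\slash"

# "IFS=" keeps the space characters at the start of a line,
# "-r" keeps backslashes as they are.
printf '%s\n' "$a" | while IFS= read -r line
do
    printf '[%s]\n' "$line"
done
```

The square brackets in the output are only there to make the kept space characters visible.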


21. Reading the Contents of a Text-File into an Array

In other programming-languages, this is an often-used operation, but in Bash it is quite tricky.

The following code should read the contents of a given text-file into an array $a and print the array's first element then:

#!/bin/bash

if test -z "$1"
then
    echo "Please pass the name of the text-file to be read."
    exit 1
fi

x=0

while read i
do
    a[$x]="$i"
    let "x += 1"
done < "$1"

echo "${a[0]}"

The problem is that Bash executes some blocks of code in so-called "subshells". This happens, for example, when a while-loop is the receiving end of a pipe, like in "cat file.txt | while read i".
Subshells are further instances of Bash that are invoked automatically.
Changes to variables that are made inside a subshell aren't visible to the rest of the script.
So if the while-loop in the example above was executed in a subshell, we wouldn't be able to get access to the array $a in the last echo-line.

Therefore, in the example-script the file is not piped into the while-loop. Instead, the loop's standard input stream stdin is redirected from the file with "<".
With this redirection, Bash runs the loop in the original shell-instance, so the variables stay visible to the rest of the script.

The behaviour of Bash concerning subshells and Bash's expansion of the command-line are in my opinion the two main reasons that make writing Bash-scripts sometimes a bit difficult.

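Hint: If you encounter the problem that variables are not visible in parts of the script where they should be, it sometimes helps to put the code-block together with the commands that need its variables in curly brackets "{}", so that they all run in the same shell-instance.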
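
Both the problem and the hint can be seen in the following sketch. It assumes Bash's default settings, and the variable names "count" and "piped_count" are chosen freely.

```shell
#!/bin/bash

# Case 1: the loop is the receiving end of a pipe, so it runs
# in a subshell and the outer "count" is never changed.
count=0
printf 'a\nb\n' | while read line
do
    count=$((count + 1))
done
piped_count=$count
echo "after the pipe: $piped_count"

# Case 2: loop and echo are grouped in curly brackets, so both
# run in the same subshell and echo sees the changed value.
printf 'a\nb\n' | {
    count=0
    while read line
    do
        count=$((count + 1))
    done
    echo "inside the brackets: $count"
}
```

In the first case the script still prints 0 after the loop, in the second case it prints 2.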


22. Further Reading (Links)

Bash Guide for Beginners  In spite of its title, this is quite a big manual.
Advanced Bash-Scripting Guide  An excellent reference book covering a lot of topics - if not all - concerning Bash-scripting.
 

Author: abgdf {at-symbol} gmx.net