Introduction to bash

Spyros Trifonidis - 20 Sept. 2024

Introduction

You've just installed some Linux distribution and a guide you are reading online instructs you to run a program called terminal and write some commands in it.

You've just managed to connect to the Linux boxes of your university department using SSH.

You've just booted up WSL in your Windows workstation.

You're immediately greeted by a black command line which you're going to be seeing a lot from now on.

username@hostname:~$ █

What you have in front of you is called the prompt of a shell. It is one of the ways we can interact with the shell. What is this shell thing however, that comes up all the time when you search about anything related to Linux?

If you want to follow the rest of the article by running the snippets you will come across make sure that the shell you are running is bash. You can easily verify this by running echo $BASH. If this command does not print anything then the shell you are using is not bash. Search the internet for instructions on how to install it on your distribution and start it by running bash.

What is it?

The shell is a program like any other. It has many uses but it is most commonly used to:

Interact with the system (for example to run other programs, to interact with the operating system, to create files etc.)
Run programs written in its language (the so-called shell scripts).

Abstractly, the shell is the interpreter for a programming language. We can therefore think of each line we type at the prompt as a piece of code that the interpreter will execute.

Equipped with the basic terms surrounding the shell, we will move on to focus on a particular shell called bash. The reason we will be focusing on bash is that it is the most widely known interactive shell, so it is very likely that you will encounter it. Besides that, there is a heap of programs written for it, and being able to understand and extend them is a very useful skill.

There are many other interesting shells you can take a look at like:

fish - my own favorite choice for an interactive shell
zsh - the default user shell on macOS from Catalina onwards
nushell - a fresh take on a shell
oil - an experimental shell that embraces bash

The article is written using bash 5.2.26 on Linux.
All the information in this article is contained in the Bash Reference Manual (BRM from now on):
- you can find it online
- or read it interactively on your machine by running info bash in your shell.
This article assumes that its readers have some basic understanding of Unix-like operating system concepts like processes, file descriptors, filesystems etc.

How does it work?

The programs we write in bash, whether they are complete shell scripts or simple one-liners, are based on some foundational features. In this chapter these foundational features will be introduced.

Commands

The most foundational feature, used to do almost anything with bash, is the command. bash executes its input line by line, and each line is made up of variable assignments, a command to execute, or both. There are various types of commands:

Builtin commands. These are commands which are implemented by bash itself and embedded in it. They are usually commands that are easier to implement that way (for example cd, pwd, echo).
Keywords. They usually are commands that deal with program control flow (for example if, for, function).
Functions. They are commands that have been defined in the shell and are essentially lists of other commands that are executed when the name of the function is used as a command.
Executable files. If the command is the name of an executable file which exists in a filesystem path which is contained in the PATH variable then bash will execute it.
Shell scripts. If a file found this way cannot be executed directly because it is not in an executable format (for example it is a text file without a shebang), then bash will assume that the file is a shell script and it will try to execute each of its lines one by one like commands.

To learn the type of a command you can use the type builtin command.

username@hostname:~$ type declare
declare is a shell builtin

username@hostname:~$ type if
if is a shell keyword

username@hostname:~$ type cat
cat is /usr/bin/cat
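
The same builtin also recognizes functions. As a small sketch (the function name greet is made up for this illustration), the -t option of type prints just a one-word category, which can be handy inside scripts:

```bash
# A hypothetical function, defined only for this illustration
greet() { echo "hello"; }

type -t greet    # prints: function
type -t cd       # prints: builtin
type -t if       # prints: keyword
```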

To find information about a keyword or a builtin (besides looking it up in the BRM) you can use the help builtin:

username@hostname:~$ help declare

Variables

Like any other programming language, bash supports assigning values to variables. Moreover some variables have special meaning to bash and control some aspects of its behavior. There are 2 types of variables:

global variables which are accessible from anywhere in a bash program and can be set before the program is executed.
local variables which can be created only within a function and are accessible within that function, but also by the functions it calls.

There are many ways to declare variables in bash but the simplest are:


# Declare a global variable FOO and assign to it the value "bar"
FOO="bar"

function foo {
    # Declare a local variable bar and assign to it the value "baz"
    local bar="baz"
}
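
The scoping rule for local variables can be surprising if you come from other languages, so here is a minimal sketch of it. The names outer, inner and message are made up for this illustration:

```bash
# Hypothetical function names, used only for this illustration
function inner {
    # "message" is not defined here; bash finds it in the caller's scope
    echo "inner sees: $message"
}

function outer {
    local message="set in outer"
    inner
}

outer                                 # prints: inner sees: set in outer
echo "global: ${message:-<unset>}"    # prints: global: <unset>
```

Once outer returns, its local variable is gone, which is why the last line falls back to the placeholder text.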

To get information about a variable you can use the declare builtin:

username@hostname:~$ MY_VARIABLE="foo"
# Shows how this variable was declared along with its value
username@hostname:~$ declare -p MY_VARIABLE
declare -- MY_VARIABLE="foo"

# Shows the value of every variable in this shell instance
username@hostname:~$ declare

Expansions

Before executing the command of each line, bash tries to find some specific patterns in the line and replace them with something else based on some predefined rules. This process is called expansion and it allows bash users to form commands with complex or many arguments.

There are many kinds of expansions which are covered in BRM chapter 3.5. bash performs the expansions it finds in each line in 4 "phases". The result of one phase can be expanded further by the phases that follow. These phases, along with a look at what happens during each phase, are:

1st phase
- brace expansion 3.5.1 - Expands patterns like mkdir a/{b,c} to mkdir a/b a/c.
2nd phase
- tilde expansion 3.5.2 - Expands ~ to the home directory of the current user.
- parameter and variable expansion 3.5.3 - Expands $VAR_NAME to the value of the variable VAR_NAME.
- command substitution 3.5.4 - Expands $(command) to the output of executing command.
- arithmetic expansion 3.5.5 - Expands arithmetic expressions like $(( 4 + 3 )) to their arithmetic result, here 7.
- process substitution 3.5.6 - Expands <(command) to the path of a file (a named pipe or a /dev/fd entry) from which the output of executing command can be read.
3rd phase
- word splitting 3.5.7 - "Splits" the result of previous expansions which are not contained within "" to multiple "words".
4th phase
- filename expansion 3.5.8 - Expands some special patterns to files matching the pattern.

⚠️ I want to mention once more that results of an earlier expansion phase can be picked up and further expanded by subsequent phases, fun! ⚠️
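
To make the ordering a bit more tangible, here is a small sketch under a couple of assumptions: it works in a scratch directory created with mktemp, and the file names a.txt and b.txt are made up for the illustration. It shows that brace expansion runs before variables are expanded, and that the result of a variable expansion is picked up by filename expansion:

```bash
# Brace expansion runs first, so $n has not been expanded yet
# when bash looks for a sequence such as {1..3}
n=3
echo {1..$n}        # prints: {1..3}

# Work in a scratch directory so the example does not touch real files
workdir=$(mktemp -d)
cd "$workdir"
touch a.txt b.txt

# Variable expansion produces *.txt, which filename expansion then matches
pattern='*.txt'
echo $pattern       # prints: a.txt b.txt

# Double quotes suppress word splitting and filename expansion
echo "$pattern"     # prints: *.txt
```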

Next we will take a closer look at some famous and frequently used expansions. To get a sense of how they work we will use the echo builtin which just echoes its arguments.

Tilde expansion

Probably the most famous expansion, the tilde expansion has many interesting uses covered in BRM chapter 3.5.2. The most famous use however is expanding the character ~ to the home directory of the current user. For example let's say that we have logged in as user user:

user@hostname:~$ echo ~
/home/user

user@hostname:~$ echo ~/Desktop/assignments
/home/user/Desktop/assignments
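
One detail that trips people up: the tilde is only expanded when it is unquoted and at the start of a word. A quick sketch of the difference (the directory name is just an example):

```bash
echo ~/Desktop      # the tilde is expanded, for example /home/user/Desktop
echo "~/Desktop"    # prints: ~/Desktop
echo a~             # prints: a~
```

Inside double quotes you can use "$HOME/Desktop" instead, since variables are still expanded there.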

Parameter and variable expansion

Also one of the most famous expansions, parameter expansion expands a variable to its value. Like most other expansions, parameter expansion has many uses which are covered in BRM chapter 3.5.3. The most foundational of them is replacing $VAR and ${VAR} with the value of the variable VAR, for example:

username@hostname:~$ MY_NAME=spyros MY_SURNAME=trifonidis
username@hostname:~$ echo $MY_NAME ${MY_SURNAME}
spyros trifonidis
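
The two forms are not always interchangeable. The sketch below shows where the braces matter, plus one of the many extra forms from BRM chapter 3.5.3. It assumes that the variables MY_NAME_backup and MY_CITY (names made up for this illustration) are not set in your shell:

```bash
MY_NAME=spyros

# The braces mark where the variable name ends
echo "${MY_NAME}_backup"      # prints: spyros_backup

# Without braces bash looks for a variable called MY_NAME_backup
echo "$MY_NAME_backup"        # prints an empty line

# ${VAR:-default} falls back to a default when VAR is unset or empty
echo "${MY_CITY:-unknown}"    # prints: unknown
```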

Command substitution

Command substitution, which is covered in BRM chapter 3.5.4, expands a command to its output. This is useful to supply commands with input but also to assign values to variables. Its syntax is $(COMMAND), for example:

# Let's say that the HTTP API at my.private.api is protected with some
# access token which we have written to the file at /tmp/access_token
# which we need to send in the "x-access-token" HTTP header
username@hostname:~$ curl -H "x-access-token: $(cat /tmp/access_token)" https://my.private.api/data

# Even better we can assign the access token to a variable so we can use
# it in subsequent requests
username@hostname:~$ access_token="$(< /tmp/access_token)" # $(< file) is equivalent to $(cat file)
username@hostname:~$ curl -H "x-access-token: $access_token" https://my.private.api/data
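
If you do not have an API at hand, here is a self-contained sketch of the same idea that you can run anywhere. The variable names and the example path are made up for this illustration:

```bash
# The trailing newline written by the pipeline is removed by the substitution
shout=$(printf 'hello\n' | tr 'a-z' 'A-Z')
echo "[$shout]"     # prints: [HELLO]

# Substitutions can be nested without any extra escaping
parent=$(basename "$(dirname /home/user/Desktop)")
echo "$parent"      # prints: user
```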

Redirections & Pipelines

The last foundational feature of bash is redirection. In their simplest form, redirections give us the ability to connect any file to a file descriptor of the command that is being executed. The file descriptors that we redirect most often are stdin, stdout and stderr. This allows us to compose commands using so-called pipelines. Redirections can appear anywhere in a command, however they are most often appended after the end of the command so it is easier to distinguish them.

Like every other foundational feature, redirections have many uses and capabilities which are covered in BRM chapter 3.6. Here we will dive deeper into input redirections, output redirections and pipelines.

Input redirection

To redirect input we use <. For example, tr reads from its stdin and converts the characters given in its first argument to the characters given in its second argument. So for example having created a file at /tmp/a which contains the line a a a we can do the following:

username@hostname:~$ tr a b < /tmp/a
b b b

What we accomplished here is have tr read from /tmp/a when it reads from its stdin or in other words we have redirected /tmp/a to stdin.
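
If you want to reproduce this without creating /tmp/a by hand, the following sketch sets up the input in a scratch file first (mktemp picks the file name, and the variable name input is made up for this illustration). It also shows a here string, a related form of input redirection from BRM chapter 3.6 that needs no file at all:

```bash
# Create the input file in a scratch location
input=$(mktemp)
echo "a a a" > "$input"

tr a b < "$input"       # prints: b b b

# A here string feeds a word to stdin without an intermediate file
tr a b <<< "a a a"      # prints: b b b
```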

Output redirection

To redirect output we use >. Using cat we will concatenate 3 files and write the result to a 4th file. Having created the files at /tmp/foo, /tmp/bar and /tmp/baz, which contain a line with 1, 2 and 3 respectively, we can:

username@hostname:~$ cat /tmp/foo /tmp/bar /tmp/baz > /tmp/foobar
username@hostname:~$ cat /tmp/foobar
1
2
3

Here we have redirected the stdout of cat to /tmp/foobar.
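
Keep in mind that > replaces whatever the target file already contained. A minimal sketch of the difference between > and its appending sibling >> (the scratch file comes from mktemp and the variable name out is made up for this illustration):

```bash
out=$(mktemp)

# > truncates the file before writing, so "first" is lost
echo "first"  > "$out"
echo "second" > "$out"
cat "$out"                # prints: second

# >> appends to the file instead of truncating it
echo "third" >> "$out"
cat "$out"                # prints: second and third, on separate lines
```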

Pipelines

Pipelines allow us to combine input & output redirection in order to compose commands. We create a pipeline using |. The stdout of the command to the left of the | is "joined" to the stdin of the command to the right of the |.

Let's take a look at an example. bash saves to the file at $HOME/.bash_history the command lines we run using the interactive shell (by default this happens when the shell exits). Let's search this file and calculate how many times we've started a command line with cat. To this end we will use grep, which writes to its stdout the lines which match a pattern, and wc, which writes the number of lines it reads from its stdin.

# Here ^ means the start of the line. If you want to learn more about the language grep uses (and much more) search for "regular expressions"
username@hostname:~$ grep '^cat' ~/.bash_history | wc --lines
2 # This will probably be something else for you if you try it

Here we've "joined" the stdin of wc with the stdout of grep, thus composing these programs to achieve our goal.

In case you actually want to do something like that there is no reason to pipe grep into wc. grep can already count the lines that match a pattern on its own using the -c/--count flag.
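
To compare the two approaches on predictable input, here is a sketch that uses a small made-up history file instead of the real one (the file contents and the variable name history_sample are assumptions for this illustration):

```bash
# A small made-up history file, so the real ~/.bash_history is not needed
history_sample=$(mktemp)
printf '%s\n' 'cat notes.txt' 'ls -l' 'cat /etc/hostname' 'echo cat' > "$history_sample"

# Both commands count the 2 lines that start with cat
grep '^cat' "$history_sample" | wc -l
grep -c '^cat' "$history_sample"
```

The line echo cat is not counted because the pattern is anchored to the start of the line.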

In action

In this chapter we will use some of the foundational features we've discussed previously to write a useful script which will aid us in writing more complex scripts in the future.

`args`

This is a very useful script that allows us to inspect the results of bash expansions. It's good practice to use this script to double check whether an expansion you've written behaves the way you think it does. This script is sourced from Greg's Wiki, which has a lot of useful information about bash and much more.

The script is the following:

#!/usr/bin/env bash
printf "%d args:" "$#"
test "$#" -eq 0 || printf " <%s>" "$@"
echo

Let's take a look line by line.

#!/usr/bin/env bash

The first line (and more specifically the #! part) is called the shebang. When we try to run an executable file that starts with a shebang, the operating system runs the program that follows the #! and passes it the path to the file as an argument. Here that program is /usr/bin/env, which looks up bash in $PATH and runs it with the path to our script. This essentially allows us to run the script (after we've made it executable and placed it somewhere inside $PATH) without explicitly running bash. In other words we can run args instead of bash args.

printf "%d args:" "$#"

This line uses the printf builtin to print the number of arguments given to the script. bash expands $# to this number.

test "$#" -eq 0 || printf " <%s>" "$@"

This line, albeit the most complex, introduces a very useful type of command called a list (BRM chapter 3.2.4). When bash is given a command like COMMAND1 || COMMAND2, it will execute COMMAND1 and, if and only if it fails (i.e. its exit status is anything other than 0), it will then execute COMMAND2. Thus, in this line we use the test builtin to check whether the script was given 0 arguments and, if it was not, we use printf to print EACH word that "$@" expands to. $@ is a special parameter which, inside double quotes, expands to each argument given to the script as a separate word.
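
To get a feel for lists on their own, here is a minimal sketch using the true and false commands, which do nothing except succeed and fail respectively. It also shows &&, the counterpart of || from the same BRM chapter, which runs the second command only if the first one succeeds:

```bash
true  || echo "not printed, because true succeeded"
false || echo "printed, because false failed"

true  && echo "printed, because true succeeded"
false && echo "not printed, because false failed"

# $? expands to the exit status of the most recently executed command
false || echo "status of false: $?"    # prints: status of false: 1
```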

echo

This line simply prints a newline character to end the line of output.
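
To try the script yourself it has to be saved in an executable file inside a directory listed in PATH. One possible way to do that is sketched below. The scratch directory created with mktemp and the variable name bindir are choices made for this illustration; for permanent use you would probably pick a directory such as one under your home directory.

```bash
# Write the script into a scratch directory (a location chosen for this sketch)
bindir=$(mktemp -d)
cat > "$bindir/args" <<'EOF'
#!/usr/bin/env bash
printf "%d args:" "$#"
test "$#" -eq 0 || printf " <%s>" "$@"
echo
EOF

# The file must be executable and its directory must be listed in PATH
chmod +x "$bindir/args"
PATH="$bindir:$PATH"

args                    # prints: 0 args:
args one "two words"    # prints: 2 args: <one> <two words>
args {a,b}.txt          # prints: 2 args: <a.txt> <b.txt>
```

The change to PATH only lasts for the current shell session.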

Some example executions:

username@hostname:~$ MY_VAR="a lot of words"
username@hostname:~$ args $MY_VAR
4 args: <a> <lot> <of> <words>
username@hostname:~$ args "$MY_VAR"
1 args: <a lot of words>

# Using what you've learned up until now alongside the BRM can you 
# explain why this is the result?
# (hint: the answer is given at the source of this script)

That's all for now

In this article we've learned what a shell is and why we use it. Next we looked at some foundational features of the bash shell which most scripts depend on. Finally we wrote a simple script using some of these features.

Most of the bash features mentioned here have far more capabilities than those discussed, and these will be explored in future articles. Until then I suggest you experiment with the features you've picked up here, research some more using the BRM and write some scripts that automate something you need.

Until next time...