Bourne Shell Scripting/Environment

From Wikibooks, the open-content textbooks collection

The Environment

Each running program, either started directly by the user from the shell or indirectly by another process, operates within a collection of global resources called its "environment."

A program's environment contains important information such as the source and destination of data upon which the program can operate (also known as the standard input, standard output and standard error handles). In addition, variables are defined that list the identity and home directory of the user or process that started the program, the hostname of the machine and the kind of terminal used to start the program. There are other variables too; these are just some of the main ones.

The environment also provides working space for the program, as well as a simple way of communicating with other programs that will later be run in the same environment.

Environment Variables

Included in this environment is the ability to assign pieces of labelled storage in the environment, where you can store anything you like, as long as it fits. These spaces are called variables, because you can vary what you put in them. All you need to know is the name (the label) that you used for storing the content. The Bourne shell also makes use of these "environment variables". You can even make scripts that examine these variables, and those scripts can make decisions based on the values stored in the variables.
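As a minimal sketch of a script that examines a variable and makes a decision based on its value, the listing below looks at EDITOR. The function name describe_editor is made up for this illustration, and the if and else commands it relies on are an assumption at this point in the chapter.

```shell
#!/bin/sh
# describe_editor is a hypothetical helper used only in this sketch.
# It reports one of two things depending on what EDITOR holds.
describe_editor() {
    if [ -n "$EDITOR" ]; then
        echo "Your preferred editor is $EDITOR"
    else
        echo "EDITOR is not set"
    fi
}

describe_editor
```

Running it with EDITOR set to vi prints "Your preferred editor is vi"; with EDITOR empty or undefined it prints "EDITOR is not set".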

Numbered variables

These are the variables you may have started off with when you first started working with shells and scripts. An example follows:


-- example1.sh --
#!/bin/sh
echo "$1"


First, this script starts with what has been called the hashbang line (also commonly called 'shebang'). You would have seen this mentioned in other scripts, and it simply tells what program should be used to read the script and to execute the commands in the script. Next, you'll see a command: echo, which simply repeats what you give it. Only, instead of actually repeating the "$1", the shell will substitute what is stored in $1 instead.

Each time you type in a command and add parameters to it, these parameters are stored in the numbered variables $1 .. $n (where n is the total number of parameters you give; if you give four parameters, $1 to $4 will be assigned, if you use seven, $1 to $7 will be assigned, and so on) for use by the script later on. When you run this script, you'll probably get output that looks like this:

[bannock]/home/wanderer$ sh example1.sh Macaroni
Macaroni
[bannock]/home/wanderer$ sh example1.sh Macaroni Cheese
Macaroni


But hang on a minute. What happened to the "Cheese" portion? Well, the mouse ate it. Actually, the script was never told to do anything with the variable $2, so it didn't do anything. All it was told to do was show you the first variable's contents: echo "$1", so that's all it did. Here's a better script that might do what you want.

-- example2.sh --
#!/bin/sh
echo "$1" "$2"


Now when you run this command, you get the expected result:


[bannock]/home/wanderer$ sh example2.sh Macaroni Cheese
Macaroni Cheese
[bannock]/home/wanderer$


Now, what happens if you want to repeat everything that you pass into the script, and not just the first two? Here's an even more useful script.

-- example3.sh --
#!/bin/sh
while [ "$1" ]; do
  echo -n "$1 "
  shift
done

This script starts off the same way, but it certainly doesn't LOOK the same once you get into the rest of it. Let's explain it a bit more. First, we have the hashbang line; that's something we already know about. Then, there's a new command: while. This command works a bit like this:

while (this test is true); do (these commands); done

The (this test is true) is a condition "test": if it passes, then (these commands) are executed. Then the neat thing is, the command loops back and checks again whether (this test is true). That's the main reason it's called a while command: do this while the test stays true. If the "test" fails, the commands are not executed, and the script continues running after the line containing "done". You can put almost any command you want into the "this test is true" spot, as long as it returns one of two things: true or false. You can also put any commands into the "these commands" slot; they execute each time the test returns true.

Shifting numbered variables

In the case of the example3.sh script, we're asking if there is something of substance in the variable $1, in between the "" symbols; the test passes as long as $1 is not empty. If so, then we echo what is stored in $1 (which is the stuff we asked about), then we shift all the variables along by one place, then run the test again, continuing the process until there are no more variables with stuff stored in them. That's what the "shift" portion does: it takes what's stored in the current $2 variable and puts that into $1, takes whatever's stored in $3 and puts it into $2, and so on, until there are no more numbered variables that have anything in them.

All this stuff was stored into the environment when you ran the program and added the parameters. If you were mad enough to have entered 3,000 parameters, then all of these variables would have been shuffled along each time we executed the shift command, and for 3,000 parameters, that's a lot of shifting.

Another thing to note is that in the original Bourne shell you can only access the first nine numbered variables directly. The shell reads $10 as $1 followed by a 0, so the only way to reach the tenth parameter and beyond is to shift them along the line until they fall between $1 and $9. (POSIX shells also accept the form ${10}.)

Once you shift variables, what was originally stored in them is not retained. What was stored in $1 is gone, replaced by what was in $2, unless you saved the previous contents in another variable.
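The effect of shift can be watched directly with a small sketch. It is not one of the chapter's numbered examples: set -- is used here only to fill the numbered variables by hand, $# (the count of parameters) is an addition the text has not introduced, and the variable name first is made up.

```shell
#!/bin/sh
# Fill $1, $2 and $3 by hand so the sketch does not depend on how it is run.
set -- Macaroni Cheese Pasta

first=$1    # keep a copy of $1 before shift discards it
echo "before shift: \$1=$1 \$2=$2 count=$#"

shift       # $2 moves into $1, $3 moves into $2, and the count drops by one
echo "after shift:  \$1=$1 \$2=$2 count=$#"

echo "saved copy of the old \$1: $first"
```

After the shift, $1 holds Cheese and $2 holds Pasta; Macaroni survives only because it was copied into first beforehand.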

In example3.sh we also added something to the "echo" command so that it doesn't add a newline character to what it puts out: that's the -n that you see directly after the word "echo". (Not every version of echo understands -n; some will print it as ordinary text instead.) And we've added a space after what echo repeats, justsoyourwordsdon'tallruntogether when you run the command. After all, we'd like you to be able to read what you entered in.

The test command [

The last thing to mention is the command string [. That's right, [ is a command too. It's another name for "test", which is also a command you find commonly in a shell. This command literally "test"s for what you want a true/false check on. One thing to remember with any command is that you MUST separate it from whatever follows it by using at least one "whitespace character", otherwise the shell will think you're trying to run a command named after whatever's in $1, only with a [ added to the front. And I'm sure you don't have a command on your computer called [Macaroni. Or do you? Anyhow, the test command also needs to be finished, or "terminated", by using the ] character. It acts as a special "I'm done testing stuff now" marker for the test command. The terminating ] must be separated from what comes before it by whitespace too, because [ expects it as a separate, final argument; leave the space out and you'll get an error about a missing ].
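A short sketch may help show that test and [ really are the same check written two ways. The variable names word, long_form and short_form are made up for the illustration, and it borrows the if command (with an else branch, which is an addition here) that the next section explains.

```shell
#!/bin/sh
word="Macaroni"

# Long form: the command is called test and needs no closing bracket.
if test "$word" = "Macaroni"; then
    long_form="true"
else
    long_form="false"
fi

# Short form: [ is the same command, and ] must be its last argument,
# separated by whitespace from everything before it.
if [ "$word" = "Macaroni" ]; then
    short_form="true"
else
    short_form="false"
fi

echo "test says $long_form, [ says $short_form"
```

Both forms agree, so the script prints "test says true, [ says true".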

Starting things off

Now to a more useful script, and one that your system probably starts up for you every time you log in.

#!/bin/sh
if [ -f /etc/profile ]; then
 . /etc/profile
fi
PS1="`whoami`@`hostname -s` `pwd` \$ "
export PS1


First, we have the usual hashbang line. Then we have a new command that uses the test command: "if". It works by saying:

if (test returns true); then (do commands); fi

The "fi" at the end completes the command. The (test returns true) is the test expression; if this returns "true" then the (do commands) segment is executed. Then the script continues on after the line with fi. This works like the while command, but doesn't loop back to the start of the command like the "while" command does, so it will only do the test once, and only do the commands once at most, for each time that you run it.
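The difference between the two commands can be sketched side by side. This is only an illustration: the variable names count and passes are made up, and both the -gt (greater than) test and the $(( )) arithmetic are POSIX shell features that the text has not introduced.

```shell
#!/bin/sh
count=3

# if: the test runs once, so the body runs at most once.
if [ "$count" -gt 0 ]; then
    echo "if saw count=$count"
fi

# while: the test is repeated before every pass through the body.
passes=0
while [ "$count" -gt 0 ]; do
    count=$((count - 1))
    passes=$((passes + 1))
done
echo "while ran $passes times"
```

The if body runs a single time, while the loop body runs three times before count reaches 0 and the test fails.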

Anyhow, in the startup script's if command, the first check is another [ command. Remember that [ is short for the command test, and this particular test (-f) checks whether /etc/profile exists and is a regular file, then finishes the test condition with ]. If the test succeeds, the next step is the . (dot) command, which reads the file /etc/profile into the current session and executes the instructions in the file as if they had been typed into the current shell, rather than in a subshell. This was covered in an earlier chapter.
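The difference between reading a file with the dot command and running it as a separate program can be sketched as follows. The file name held in settings_file and the variable names greeting and after_child are assumptions made for this illustration; the temporary file is created and removed by the sketch itself.

```shell
#!/bin/sh
# settings_file is a hypothetical temporary file created for this sketch.
settings_file="${TMPDIR:-/tmp}/settings.$$"
echo 'greeting="hello from the settings file"' > "$settings_file"

# Run the file in a separate sh process: its assignment is lost
# when that process ends.
greeting="unchanged"
sh "$settings_file"
after_child=$greeting
echo "after sh: $after_child"

# Read the file with the dot command: the assignment lands in this shell.
if [ -f "$settings_file" ]; then
    . "$settings_file"
fi
echo "after . : $greeting"

rm -f "$settings_file"
```

Only the dot command changes greeting in the current shell, which is why a login script uses it to pull in /etc/profile.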

Named variables

The next command in the startup script is to "assign" something into the environment variable called PS1. In this case, we're taking the output of the three commands whoami, hostname, and pwd, then we add the $ symbol, and some spacing and other formatting just to pad things out a bit. Whew. All that, just in one piece of labelled space. As you can see, environment variables can hold quite a bit, including the output of entire commands. The neat thing is, you don't have to call them by numbers. Many named environment variables exist; some examples are PATH, HOME, PS1, HOST, LINES, COLUMNS, EDITOR, VISUAL and SHELL. And you can add to this list by defining your own named variables.
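Defining your own named variables might look like the sketch below. The names dinner and where are invented for the example; the backquotes work the same way as in the PS1 line of the startup script.

```shell
#!/bin/sh
# An assignment has no spaces on either side of the = sign.
dinner="Macaroni Cheese"

# Backquotes store the output of a command rather than literal text.
where=`pwd`

echo "Dinner is $dinner"
echo "This script was started in $where"
```

With a space after the = sign, the shell would treat the quoted text as the name of a command to run instead of a value to store.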

Exporting to the next generation

The last command in the startup script is the command export. This command exists to make the variable available to any programs and subshells that are started from your current shell session. When your initial shell starts up, you have a number of variables already defined, and when you start a program, that program will see a copy of whatever variables have been defined and exported already. If you do not export a variable, then when you start another program (and most commands you run are started as separate child processes), that program simply won't see the variable in its environment. It won't exist as far as that program is concerned. So we typically export any variables we want child processes to use.
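The effect of export can be checked with a small sketch that starts a child sh process and asks it what it can see. The variable names kept_private, handed_down and child_report are made up for the illustration, and the sketch assumes that no variable called kept_private was already exported by whatever started the script.

```shell
#!/bin/sh
kept_private="only in this shell"
handed_down="visible to children"
export handed_down

# The single quotes stop this shell from expanding the variables, so the
# child sh process has to look them up in its own environment.
child_report=`sh -c 'echo "private=[$kept_private] exported=[$handed_down]"'`
echo "$child_report"
```

The child prints an empty pair of brackets for kept_private, because that variable was never exported, and the full text for handed_down.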

There is a useful reason NOT to export variables, and it usually has something to do with not handing off too much environment data to a process if it doesn't need that data. This was particularly true when running copies of MS-DOS and versions of DOS under Windows. You only HAD a limited amount of environment space, so you had to use it carefully, or ask for more space on startup. These days in a UNIX environment, the space issues aren't the same, but a shell can end up with a large amount of baggage it really doesn't need, simply because the parent shell had those variables exported.

Multitasking

Doing heaps of stuff all at the same time has become common now, with the arrival of fast computers and CPUs that can switch between all these tasks in a very small amount of time. This fast task switching provides the illusion that the computer really is running all these tasks at once. Along with this had to come an environment in which all this became possible. Terms like "task switching" have rapidly been replaced by "multitasking" because it sounds better than task switching.

Anyhow, whatever it's called, you now have the ability to start a job, and have the computer come back to you straight away saying "Now what do you want to do?" while the first job munches away in the background. You can then start a second job, and, provided it doesn't interfere with the first job that you're doing, you really CAN do more than one thing at a time. This works well, until one of the other jobs wants your attention: for example, it might want you to put in a password, or you might need to be told that the job has finished and here are the results. In short, this is all something handled by a multitasking environment. If you're working in a graphical user environment such as Windows or XFree86 (or its new brother Xorg) or MacOS, you'll be familiar with this term.

System multitasking

If you're in console (that is, text) mode without any graphical user interface to help out, then you might wonder how you can do multitasking at the command line. Though you might not be aware of it, your computer is already doing that, whether you have a GUI running or not. You're just not seeing the obvious signs of programs working behind your keyboard, because they're usually off doing their own thing, without a direct need to write to the console.

Only in rare cases would you see a Bourne Shell on a single-task system, and usually such systems have their own shell or operating system.

Common multitasking jobs

Let's start off with something most of you know about: downloading files. Usually, while you're downloading files, you want to do other stuff too, otherwise you're going to be sitting at the keyboard a really long time when you want to download a whole CD worth of data. So, you start up your file downloader, and feed it a list of files you want to grab. Once you've entered them, you can then tell it "Go!" and it will start by downloading the first file, and continue until it finishes the last one, or until there's a problem. The smarter ones will even try to work through common problems themselves, such as files not being available. Once it starts, you get the standard shell prompt back, letting you know that you can start another job. If you want to see how far the file downloader has got, you can simply check the files on your system against the ones on your list, but another way for the program to notify you is via the environment.

The environment can include the files that you work with, and this can help provide information about the progress of currently running jobs, for example that file downloader you started. Did it download all the files? If you check the status file, you'll see that it's downloaded 65% of the files, and is just working on the last three now.
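The status-file idea can be sketched with a pretend downloader. Everything here is a stand-in: the loop downloads nothing, the file names are invented, status_file is a hypothetical location, and the trailing & (explained under "Creating a job") together with wait is only assumed to be how such a job would be started and collected.

```shell
#!/bin/sh
# status_file is a hypothetical name; a real downloader would choose its own.
status_file="${TMPDIR:-/tmp}/download-status.$$"
: > "$status_file"

# A stand-in for the downloader: it appends one line per finished file
# and runs in the background because of the trailing &.
for file in disc1.iso disc2.iso disc3.iso; do
    echo "finished $file" >> "$status_file"
done &

# For this sketch, let the background job finish before looking.
wait

# Count the progress lines the job has written to its status file.
finished=`grep -c '^finished' "$status_file"`
echo "files finished so far: $finished"
rm -f "$status_file"
```

In real use you would look at the status file while the job is still running, so the count would grow each time you checked.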

Other examples of programs that don't need their hand held are programs that play music. Quite often, once you start a program that plays music tracks, you don't WANT to tell the program "Okay, now play the next track". It should be able to do that for itself, given a list of songs to play. This is where the GUI really does help out, as it can place an unobtrusive status window onto your screen, showing you the title of the song, and how far through it has got. The same can be done for programs in console mode as well, but you normally have to do more work.

Creating a job

Jobs are the programs you start within the shell. By default, jobs normally suspend the shell and take control over the input and output. However, they can instead be linked to another job by a pipe (as discussed in the Redirection chapter) or placed into the background.

You can create a background job by adding an ampersand at the end of the command:

[bannock] /home/wanderer$ ls * > /dev/null &
[1] 4808
[bannock] /home/wanderer$

Upon creating a job, the shell writes the job number (here 1) and the process ID (here 4808) to the output and returns to the prompt. When the task finishes, you will receive a notice similar to the following:

[1]+  Done     ls * > /dev/null &
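Inside a script, the same idea might be sketched like this. The special variable $! and the wait command are not covered in the text above, so treat them as additions; job_pid and job_status are made-up names.

```shell
#!/bin/sh
# Start a job in the background; the shell carries on immediately.
sleep 1 &
job_pid=$!      # $! holds the process ID of the most recent background job
echo "started background job with process ID $job_pid"

echo "the shell is free to do other work here"

# wait pauses until the job ends and hands back the job's exit status.
wait "$job_pid"
job_status=$?
echo "background job finished with exit status $job_status"
```

The process ID printed will differ on every run, just like the 4808 in the session shown earlier.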

Status of jobs

Jobs can be in any of several states, sometimes even in more than one state at the same time. You'll probably see examples of the status flag whenever you run the process view command, ps, from the shell and look at the STAT column.

[bannock] /home/wanderer$ ps x
  PID TTY      STAT   TIME COMMAND
32094 tty5     S      3:37:21 /bin/sh
37759 tty5     R      0:00:00 /bin/ps


Running

One state that's really common is: running

This is where the job is doing what it's supposed to do. You probably don't need to interrupt it unless you really want to give the program your personal attention (for example, to stop the program, or to find out how far through a file download has proceeded). You'll generally find that anything in the foreground that's not waiting for your attention is in this state, unless it's been put to sleep (more on that later).

Sleeping

Another really common state is sleeping. When programs need to retrieve input that's not yet available, there is no need for them to continue using CPU resources. As such, they will enter a sleep mode until another batch of input arrives. At any given moment you will usually see more sleeping processes than running ones, because most programs spend most of their time waiting rather than processing data.

Stopped

The stopped state indicates that the program was stopped by the operating system. This usually occurs when the user suspends the foreground job (e.g. by pressing CTRL-Z, which sends it SIGTSTP) or when the job receives SIGSTOP. At that point, the job cannot actively consume CPU resources and, aside from still being loaded in memory, won't impact the rest of the system. It will resume at the point where it left off once it receives the SIGCONT signal or is otherwise resumed from the shell.
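Stopping and resuming a job by hand can be sketched with the kill command, which sends signals. This is an illustration only: the sleep job stands in for real work, pid and while_stopped are made-up names, and kill -0 (which sends no signal and merely checks that the process exists) is an addition to what the text describes.

```shell
#!/bin/sh
# A stand-in job that would otherwise run for 30 seconds.
sleep 30 &
pid=$!

kill -STOP "$pid"     # suspend the job; it stays loaded but gets no CPU time

# kill -0 sends no signal; it only reports whether the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    while_stopped="still loaded"
else
    while_stopped="gone"
fi
echo "while stopped, the job is: $while_stopped"

kill -CONT "$pid"     # resume the job at the point where it left off
kill "$pid"           # tidy up: end the stand-in job
```

A stopped job has not gone away; it simply waits until SIGCONT arrives.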

Zombie

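A zombie process is a child that has terminated but whose parent has not yet collected its return value. It no longer runs or uses memory, but it keeps its entry in the process table until the parent collects that value. If the parent itself terminates first, the init process adopts its children and cleans up any zombies, so a reboot is only needed in the rare case where a long-running parent never collects its children and cannot be stopped.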
