http://www.stat.ufl.edu

Unix Introduction / Linux Shell Scripting Tutorial

Overview

This is an introduction to Unix and a beginning tutorial on Linux shell scripting. Shell scripting is a form of programming that can save you a lot of time. It builds on knowledge gained from interactive use of the shell.


What is the shell

[Start Gnome Terminal] The “Terminal” program is a software emulation of this piece of hardware, which was introduced in 1978:

Way back before GUIs were affordable, the computer's user interface consisted of a single “terminal” per user. This style of computer interface is what the Terminal program emulates. Mostly what a terminal could do is write a character at any location, in a plain or bold, blinking, or reverse video font, and beep.

The Unix program that the user communicated with directly was called a “shell” because it wrapped the softer operating system innards in a more protective layer. The shell we use today is named “bash”, and it is mostly backward compatible with shells from the 1980s.

The shell at its simplest

The general pattern of operation for a shell is that the shell prints a prompt, the user types a command, the shell runs the command and waits for it to complete, and the shell prints another prompt. Our prompt contains the username, the name of the machine the shell is running on, and the current directory the shell is in. Tilde (~) is an abbreviation for your home directory, and ~username means that person's home directory. [Point out the parts of the prompt] [Run ls, run ls again, run sleep 3, notice it waits] If you don't want the shell to wait for a command to finish before giving you another prompt, end the command with an ampersand. [emacs &]
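
To make the difference between waiting and not waiting concrete, here is a minimal sketch. The variable names bg_pid and bg_status are only for illustration; "wait" and "$!" are standard shell features that this tutorial has not introduced yet.

```shell
# Foreground: the shell waits about one second before moving on.
sleep 1
echo "foreground sleep finished"

# Background: the ampersand hands control straight back to the shell.
sleep 1 &
bg_pid=$!          # $! holds the process ID of the most recent background command
echo "background sleep started as process $bg_pid"

# 'wait' blocks until that background process has finished.
wait "$bg_pid"
bg_status=$?
echo "background sleep exited with status $bg_status"
```

Typed interactively, the second prompt comes back right away, and you can keep working while the background command runs.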

Shells today accept most of the basic editing keystrokes you've learned from emacs. [Edit command] Earlier commands are available as if they were previous lines in a file in emacs; use the up arrow or C-p to reach them. [Go up through the history] The idea is to save you typing. To keep Terminal from interfering with some editing keystrokes, turn off keyboard shortcuts in Terminal. [Turn them off in Edit -> Keyboard Shortcuts]

Copy in some sample files so we're all working against the same example. The dollar sign at the beginning of the lines below indicates that the line is a shell command; it is not to be typed literally. Unix/Linux is sensitive to the case of letters, and spaces, so these lines must be typed exactly. If you get an error from a command, fix it before you continue.

$ cd
$ wget -r -l 1 http://www.stat.ufl.edu/system/shell-scripting
$ mv www.stat.ufl.edu/system/shell-scripting .
$ rm -r www.stat.ufl.edu
$ cd shell-scripting
$ rm index*

Shell commands accept “flags” and “arguments”. Flags change the operation of the command. Arguments are usually the names of the things you want the command to affect, most often filenames. The “ls” command lists the names of the files in a directory. [Run ls] If you give the “-l” flag, you get a “long” listing that adds the size, owner, permissions, and other details for each file. [Run ls -l so [TAB][RET]] Some of the “completion” features you saw in emacs work in the shell, but most of them are triggered by one or two presses of the Tab key, not the Spacebar. If you keep your filenames unique in the first three characters and use tab completion, you will save yourself an enormous amount of time and typing. [ls -l th [TAB] [BEEP] [TAB] [list] is [TAB] [RET]]
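
As a small sketch of flags versus arguments, the lines below run ls three ways against a scratch directory. The directory and the filenames this-file and that-file are made up for the example, and mktemp is used only to get a safe place to work.

```shell
# Work in a scratch directory so the listing is predictable.
demo_dir=$(mktemp -d)
touch "$demo_dir/this-file" "$demo_dir/that-file"

# No flags, one argument: just the names.
names=$(ls "$demo_dir")
echo "$names"

# The -l flag with the same argument: one line per file,
# showing permissions, owner, size and modification time.
ls -l "$demo_dir"

# Flags can be combined; -a also shows names that begin with a dot.
ls -la "$demo_dir"

rm -r "$demo_dir"
```

The argument stays the same in all three commands; only the flags change what ls reports about it.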

Pipelines

Every shell command has the idea that it's getting input from somewhere, processing it somehow, and putting the output somewhere. It also has the idea that error messages should be seen by a human. The names for these flows of data are “standard input”, “standard output”, and “standard error”. When you run a single shell command, all three data flows are connected to the “terminal”, which, you will remember, is nowadays a mythical piece of hardware emulated by the Terminal program. However, more powerful arrangements are possible. The output of one command can be plugged into the input of another command. This is called a “pipeline”, and the individual shell commands in a pipeline are called “filters”. This command prints the names of all the files and directories that make up the bulk of Ubuntu, then lets you read them a screenful at a time: [find /usr -print | more] How many files and directories are there in /usr? [find /usr -print | wc -l] [Explain each part of this]
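
Here is the same counting pipeline as a sketch you can run in a second or two. It builds a tiny directory tree instead of walking all of /usr; the tree and the names in it are invented for the example.

```shell
# Build a small directory tree to count.
tree_dir=$(mktemp -d)
mkdir "$tree_dir/sub"
touch "$tree_dir/a" "$tree_dir/b" "$tree_dir/sub/c"

# find writes one name per line to its standard output;
# the pipe connects that to the standard input of wc, which counts lines.
# There are 5 entries: the top directory, sub, a, b and sub/c.
entry_count=$(find "$tree_dir" -print | wc -l)
echo "entries: $entry_count"

# Filters can be chained: keep only regular files, then sort their names.
find "$tree_dir" -type f -print | sort

rm -r "$tree_dir"
```

Each command in the pipeline knows nothing about its neighbors; it only reads standard input and writes standard output.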

I/O redirection

Shell commands can also read their input from files and write their output to files. The symbols that indicate this are less than (<) and greater than (>), rather than the vertical bar, or pipe (|). [more some-dir/unsorted ; sort < some-dir/unsorted ; sort < some-dir/unsorted > sorted ; more sorted] One implication of this is that you should never need to write a sort routine yourself. There's a great one on the system that handles all sorts of special cases gracefully, and the performance is excellent.
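
The sketch below repeats the sort demonstration with a throwaway file, so it does not depend on the sample files. It also shows two related operators that the text above does not cover: >> appends to a file, and 2> redirects standard error. The filenames are made up for the example.

```shell
work_dir=$(mktemp -d)

# > sends standard output to a file, replacing any old contents.
printf 'pear\napple\nfig\n' > "$work_dir/unsorted"

# < feeds the file to sort's standard input; > captures the sorted result.
sort < "$work_dir/unsorted" > "$work_dir/sorted"

# >> appends to the file instead of overwriting it.
echo "zucchini" >> "$work_dir/sorted"

# 2> redirects standard error; ls complains because the file does not exist.
ls "$work_dir/no-such-file" 2> "$work_dir/errors" || true

sorted_contents=$(cat "$work_dir/sorted")
error_text=$(cat "$work_dir/errors")
echo "$sorted_contents"
rm -r "$work_dir"
```

Note that the error message from ls never reaches the screen; it ends up in the errors file, while normal output is untouched.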

The individual directories and filenames in a pathname are separated by a forward slash (/). If you begin a filename with a character other than /, such as some-dir/unsorted, you're referring to the file in terms relative to your current working directory. This is known as a relative path name. On the other hand, if you begin a filename with a /, the system interprets this as a full path name -- that is, a path name that includes the entire path to the file, starting from the root directory, /. This is known as an absolute path name. Filenames on Linux are case sensitive.
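
A short sketch of the two kinds of path name follows. It assumes mktemp hands back an absolute path, which is the usual behavior; the directory layout is invented to mirror some-dir/unsorted.

```shell
base_dir=$(mktemp -d)
mkdir "$base_dir/some-dir"
echo "hello" > "$base_dir/some-dir/unsorted"

cd "$base_dir"

# Relative path: resolved against the current working directory,
# so it only works while we are sitting in $base_dir.
relative_result=$(cat some-dir/unsorted)

# Absolute path: starts at the root directory, so it works from anywhere.
cd /
absolute_result=$(cat "$base_dir/some-dir/unsorted")

echo "relative: $relative_result, absolute: $absolute_result"
rm -r "$base_dir"
```

After the "cd /", the relative name some-dir/unsorted would no longer find the file, but the absolute name still does.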

Shell scripts

You can write a series of shell commands in a file instead of typing them in each time. This is called a “shell script”. Look out, we're about to start programming. [Each of you do this: emacs count-my-files &] [Explain each part of this]
#!/bin/bash

find ~ -type f -print | wc -l
[chmod +x count-my-files ; ./count-my-files]

We've reached the big idea:

If you can do it interactively, you already know how to script it.

Shell scripts are a computer programming language. Programming language interfaces are good for complicated or tedious operations, and for automating things that run without interacting with humans. The strongest aspect of Linux is that most applications are already Lego parts.
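
To show the whole cycle in one place, this sketch writes a script to a file, marks it executable, and runs it. It is a variation on count-my-files, not the same script: the name count-files and the optional directory argument ("$1", defaulting to the current directory) are assumptions added so the example can run against a scratch directory instead of your home directory.

```shell
script_dir=$(mktemp -d)

# Write a two-line script. Quoting 'EOF' stops the shell from
# expanding anything inside the text on its way into the file.
cat > "$script_dir/count-files" << 'EOF'
#!/bin/bash
find "${1:-.}" -type f -print | wc -l
EOF

# Mark it executable, then run it by path with one argument.
chmod +x "$script_dir/count-files"
touch "$script_dir/one" "$script_dir/two"

# Three regular files are counted: the script itself, one and two.
file_count=$("$script_dir/count-files" "$script_dir")
echo "files found: $file_count"
rm -r "$script_dir"
```

The pipeline inside the script is exactly what you would type at the prompt; the file only saves you from typing it again.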

Common shell commands

Linux Command   Action
ls              what files are in this directory
cd              change directories
pwd             what directory am I in (also see prompt)
more, less      interactively read contents of file
ps auxww, top   what processes (programs) is the system running
kill            stop running processes
emacs           EMACS Makes All Computing Simple
rm              delete file or directory tree of files
grep            search contents of file
tail -f         watch the bottom of a growing file
find            search for files by name, date, etc.
cp              copy files
scp             copy files between machines
mkdir           make a directory
rmdir           delete an empty directory
mv              rename a file or move a directory
tar             pack a bunch of files into one file for transport
chmod           change permissions on files to make them shared or private
du              how much disk space is this using
diff, tkdiff    compare two text files and display only changes
gzip            compress file
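
Several of these commands can be tried together in a scratch directory. The session below is only a sketch; the directory and file names are invented, and everything it creates is deleted at the end.

```shell
play_dir=$(mktemp -d)
cd "$play_dir"

mkdir notes                                   # make a directory
printf 'alpha\nbeta\ngamma\n' > notes/a.txt   # create a small file
cp notes/a.txt notes/b.txt                    # copy it
mv notes/b.txt notes/c.txt                    # rename the copy
echo "delta" >> notes/c.txt                   # make the copy differ

match=$(grep -c "beta" notes/a.txt)           # count lines containing "beta"
echo "lines matching beta: $match"

# diff prints only the changed lines and returns non-zero when files differ.
diff notes/a.txt notes/c.txt || true

du -s notes                                   # disk space used by the directory
tar -cf notes.tar notes                       # pack the directory into one file
gzip notes.tar                                # compress it to notes.tar.gz
ls -l notes.tar.gz

cd /
rm -r "$play_dir"                             # delete the whole tree
```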

Windows and Linux Command Comparison

(This section adapted from http://www.yolinux.com/TUTORIALS/unix_for_dos_users.html)

Windows Command                     Linux Shell Command     Action
dir                                 ls -l (or use ls -lF)   List directory contents (-a shows all files; df -k shows space remaining on filesystem)
dir *.* /o-d                        ls -tr                  List directory contents by reverse time of modification/creation
dir *.* /v /os                      ls -ls                  List files and size
dir /s                              ls -R                   List directory/sub-directory contents recursively
dir /aa                             ls -a                   List hidden files
tree                                ls -R                   List directory recursively
cd                                  cd                      Change directory
mkdir, md                           mkdir                   Make a new directory
rmdir, rd                           rmdir                   Remove a directory
chdir                               pwd                     Display directory location
del, erase                          rm -iv                  Remove a file
rmdir /S (NT), deltree (Win 95...)  rm -R                   Remove all directories and files below given directory
copy                                cp -piv                 Copy a file
xcopy                               cp -R                   Copy all files of a directory recursively
rename or move                      mv -iv                  Rename/move a file
type                                cat                     Dump contents of a file to the user's screen
more                                more                    Pipe output a single page at a time
help or command /?                  man                     Online manuals
find, findstr                       grep                    Look for a word in files given on the command line
comp                                diff                    Compare two files and show differences. Also see comm and cmp.
fc                                  diff                    Compare two files and show differences. Also see comm and cmp.
echo text                           echo text               Echo text to screen
date or time                        date                    Show date, set date with permissions
sort                                sort                    Sort data alphabetically/numerically
edit filename.txt                   emacs                   Edit a file
print                               lpr                     Print a file
mem                                 free, top               Show free memory on system
tasklist (WIN2K, XP)                ps aux, top             List executable name, process ID number and memory usage of active processes
chkdsk                              du -s                   Disk usage
pkzip                               tar and zip             Compress and uncompress files/directories. Use tar to create a compilation of a directory before compressing. Linux also has compress, gzip.

Script examples

  1. Set Gnome text boxes, such as the Firefox URL input area, to accept emacs-style editing keys instead of Windows-style:
    $ gconftool-2 --set /desktop/gnome/interface/gtk_key_theme Emacs --type string
    
  2. This searches all the R programs you wrote for the word "median":
    $ find ~ -type f -name "*.R" -print0 | xargs -0 grep -i "median"
    
  3. Install packages in R:
    #!/bin/bash -x
    R --no-save << EOF
    
    install.packages(c("akima"), repos="http://cran.us.r-project.org")
    install.packages(c("assist"), repos="http://cran.us.r-project.org")
    install.packages(c("brlr"), repos="http://cran.us.r-project.org")
    # ...and so on for about 100 more packages...
    quit()
    EOF
    
  4. This moves aside your old Gnome window manager dot files so that the new Gnome won't be confused by them:
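    #!/bin/bash -e
    
    cd "$HOME"
    
    # Timestamped holding directory. Its name has no leading dot, so find skips it.
    FNAME="OLD-DOT-FILES.$(date +'%Y-%m-%d_%H-%M-%S')"
    
    mkdir "$FNAME"
    
    # Move every dot file and dot directory except the ones named here.
    # --replace already runs mv once per name, so --max-lines is not needed.
    find .[a-zA-Z]* -maxdepth 0 \( \! \( \
    	-name .ssh -o \
    	-name .bashrc -o \
    	-name .bash_profile -o \
    	-name .emacs -o \
    	-name .emacs.d -o \
    	-name .thunderbird -o \
    	-name .mozilla \) \) \
    	-print0 | xargs -0 --replace={} mv {} "$FNAME"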
    
  5. This example, for synchronizing files between multiple machines, demonstrates parsing command-line flags and arguments in detail.
    #!/bin/bash -e
    
    trap '
    SAVEDSTATUS=$?
    set +x
    if [ x$CLEANEXIT = x ]
    then
    	echo "$0: ERROR Unexpected exit with return value of $SAVEDSTATUS"
    	exit $SAVEDSTATUS
    fi' EXIT
    
    #-----
    
    TEMP=`getopt -o wab -n "$0" -- "$@"`
    
    if [ $? != 0 ]
    then
    	echo "$0: ERROR running getopt" >&2
    	exit 1
    fi
    
    eval set -- "$TEMP"
    
    WRITE=''
    ADDONLY=''
    BINARY=''
    
    while true
    do
    	case "$1" in
    		-w)	WRITE=1
    			shift
    			;;
    		-a)	ADDONLY=1
    			shift
    			;;
    		-b)	BINARY=1
    			shift
    			;;
    		--)	shift
    			break
    			;;
    		*)	echo "$0: ERROR parsing getopt" >&2
    			exit 1
    			;;
    	esac
    done
    
    #-----
    
    SOURCEPATH="$1"
    shift || true
    DESTPATH="$1"
    shift || true
    DESTHOST="$1"
    
    if [ -z "$SOURCEPATH" -o -z "$DESTPATH" -o -z "$DESTHOST" ]
    then
    	echo "Usage: $0 [-wab] source-dir dest-dir target-host [target-host [...]]" 1>&2
    	CLEANEXIT=1
    	exit 1
    fi
    
    if [ ! -e "$SOURCEPATH" ]
    then
    	echo "$0: ERROR source path $SOURCEPATH does not exist, quitting" 1>&2
    	CLEANEXIT=1
    	exit 1
    fi
    
    DISTOPTS="numchkgroup,numchkowner"
    
    if [ ! -z "$BINARY" ]
    then
    	DISTOPTS="$DISTOPTS,compare"
    fi
    
    if [ -z "$WRITE" ]
    then
    	MSG="READ-ONLY"
    	DISTOPTS="$DISTOPTS,verify,younger"
    else
    	MSG="WRITING"
    fi
    
    if [ -z "$ADDONLY" ]
    then
    	DISTOPTS="$DISTOPTS,remove"
    fi
    
    #-----
    
    for DESTHOST
    do
    	if ping -c 2 -w 3 $DESTHOST > /dev/null 2>&1
    	then
    		echo "$MSG, HOST $DESTHOST"
    		cat << EOF | \
    			/depot/rdist-6.1.5/bin/rdist \
    				-P /usr/bin/ssh \
    				-p /depot/rdist-6.1.5/bin/rdistd \
    				-f -
    			($SOURCEPATH) -> ($DESTHOST)
    				install -o$DISTOPTS $DESTPATH ;
    EOF
    	else
    		echo "ERROR target-host '$DESTHOST' not pingable, skipping" 1>&2
    	fi
    	echo "-----"
    done
    
    #-----
    
    CLEANEXIT=1
    exit 0
    
  6. Here's a template file to use when writing shell scripts. The trap statement emits an error if a command fails. Without the trap and the -e, it is easy for the script to quit early without you realizing it.
    #!/bin/bash -e
    
    trap '
    SAVEDSTATUS=$?
    set +x
    if [ x$CLEANEXIT = x ]
    then
    	echo "$0: ERROR Unexpected exit with return value of $SAVEDSTATUS"
    	exit $SAVEDSTATUS
    fi' EXIT
    
    #-----
    
    ###
    ### Your code goes here
    ###
    
    #-----
    
    CLEANEXIT=1
    exit 0
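
To see what the template buys you, this sketch writes a copy of it to a scratch file, puts a deliberately failing command ("false") where your code would go, and runs it. The file name failing-script and the surrounding harness are made up for the demonstration; the trap itself is copied from the template.

```shell
demo_dir=$(mktemp -d)

# A copy of the template whose "your code" section contains a failing command.
cat > "$demo_dir/failing-script" << 'EOF'
#!/bin/bash -e

trap '
SAVEDSTATUS=$?
set +x
if [ x$CLEANEXIT = x ]
then
	echo "$0: ERROR Unexpected exit with return value of $SAVEDSTATUS"
	exit $SAVEDSTATUS
fi' EXIT

echo "before the failure"
false                      # any command returning non-zero stops the script here
echo "never reached"

CLEANEXIT=1
exit 0
EOF
chmod +x "$demo_dir/failing-script"

# Run it, capturing output and exit status.
status=0
output=$("$demo_dir/failing-script") || status=$?
echo "$output"
echo "exit status: $status"
rm -r "$demo_dir"
```

Because of the -e, the script stops at "false"; because of the trap, it says so on the way out instead of quitting silently. The line "never reached" is never printed, and the caller sees a non-zero exit status.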
    

Sources of help

Exercises

  1. If you haven't already, set up your personal web page using the instructions in http://www.stat.ufl.edu/system/homepage.shtml. In this web area, create a very simple web page for a course. Use emacs to edit all the html.
  2. Write a shell script based on the "find" command that prints the names of files in your home directory which are larger than 1 megabyte.
  3. Write a shell script based on the "find" command that deletes the temporary files created by running latex. These files end in the extensions .aux, .dvi, and .log.
  4. Write a shell script named "run-simulation", which could serve as a framework around future simulations you write in R. This script takes one parameter, which gives the number of runs to make. Each time the script runs, it creates a new directory to put its output in, with the name based on the current time, in the format "YYYY-MM-DD-HH-MM-SS". Each run's output goes in a file in that directory, named by the run number. When the script finishes, it sends you a notice by email. Your solution should have all of the features illustrated below:
    $ ./run-simulation 5
    Created directory 2008-08-20-10-45-03
    Running run 1
    Running run 2
    Running run 3
    Running run 4
    Running run 5
    Simulations ended.
    $ ls 2008-08-20-10-45-03
    1 2 3 4 5
    $ more 2008-08-20-10-45-03/3
    Started run for argument 3
    
    3 times 3 is 9.
    
    Ended run for argument 3
    $
    
    When you do run R in a long-running simulation, note that you can save R objects to files from inside the middle of your program with the "save" command. Then, you can inspect those objects in files with another copy of R, to get an idea of how your simulation is progressing. It is very likely you will catch mistakes without waiting for the whole simulation run to complete.
  5. Extend "run-simulation" to report progress of its runs on your personal web site. Add new reports to a history of past runs.
  6. Extend "run-simulation" to generate LaTeX files summarizing the results of a set of runs, as if for inclusion in your thesis. Here are examples of LaTeX syntax for a master file and an included file:
    \documentclass[12pt]{article}
    
    \begin{document}
    
    These are the results of running my program:
    
    \include{1}
    \include{2}
    \include{3}
    
    \end{document}
    Where the included file 1.tex contains:
    Started run for argument $3$
    
    \[3 \cdot 3 = 9\]
    
    Ended run for argument $3$
A final thought from xkcd:

http://imgs.xkcd.com/comics/11th_grade.png


(C) University of Florida, Gainesville, FL 32611; (352) 392-1941.
This page was last updated Tue Sep 25 00:51:43 EDT 2012
http://www.ufl.edu