This is an introduction to Unix, and a beginning tutorial on Linux shell scripting. Shell scripting is a form of programming, which can save you a lot of time. It builds on knowledge gained from interactive use of the shell.
[Start Gnome Terminal] The “Terminal” program is a software emulation of this piece of hardware, which was introduced in 1978:

Way back before GUIs were affordable, the computer's user interface consisted of a single “terminal” per user. This style of computer interface is what the Terminal program emulates. Mostly what a terminal could do was write a character at any location, in plain, bold, blinking, or reverse video, and beep.
The program on Unix that the user communicated with directly was called a “shell” because it wrapped the softer operating system innards in a protective layer. The shell we use today is named “bash”, and it is mostly backward compatible with shells from the 1980s.
The general pattern of operation for a shell is that the shell prints a prompt, the user types a command, the shell runs the command and waits for it to complete, and the shell prints another prompt. Our prompt contains the username, the name of the machine the shell is running on, and the current directory the shell is in. Tilde is an abbreviation for your home directory, and ~username means that person's home directory. [Point out the parts of the prompt] [Run ls, run ls again, run sleep 3, notice it waits]. If you don't want the shell to wait for a command to finish before giving you another prompt, end the command with an ampersand. [emacs &]
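Here is a small sketch of the ampersand at work inside a script rather than at the prompt. The variable names `bgpid` and `bgstatus` are made up for this illustration; `$!` and `wait` are standard bash features.

```shell
#!/bin/bash
# Start a slow command in the background; the shell does not wait for it.
sleep 1 &
bgpid=$!    # $! holds the process ID of the most recent background job
echo "Started background job $bgpid; the shell is free for other work."

# This line runs immediately, while sleep is still going.
echo "Doing other work..."

# Block until the background job finishes, then pick up its exit status.
wait "$bgpid"
bgstatus=$?
echo "Background job finished with exit status $bgstatus"
```

At an interactive prompt you would normally just type the command with the ampersand and carry on; `wait` matters mostly in scripts, where you eventually need the result.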
Shells today accept most of the basic editing keystrokes you've learned from emacs. [Edit command]. Earlier commands are available as if they were previous lines in a file in emacs; use the up arrow or C-p to reach them. [Go up through the history]. The idea is to save you typing. To keep Terminal from interfering with some editing keystrokes, turn off keyboard shortcuts in Terminal. [Turn them off in Edit -> Keyboard Shortcuts].
Copy in some sample files so we're all working against the same example. The dollar sign at the beginning of the lines below indicates that the line is a shell command; it is not to be typed literally. Unix/Linux is sensitive to the case of letters, and spaces, so these lines must be typed exactly. If you get an error from a command, fix it before you continue.
$ cd
$ wget -r -l 1 http://www.stat.ufl.edu/system/shell-scripting
$ mv www.stat.ufl.edu/system/shell-scripting .
$ rm -r www.stat.ufl.edu
$ cd shell-scripting
$ rm index*
Shell commands accept “flags” and “arguments”. Flags change the operation of the command. Arguments are usually the name of the thing you want the command to affect, which is usually a filename. The “ls” command lists the names of the files in a directory. [Run ls] If you give the “-l” flag, you get a “long” listing that adds the size, owner, permissions, and other details about each file. [Run ls -l so [TAB][RET]] Some of the “completion” features you saw in emacs work in the shell, but most of them are triggered by one or two presses of the Tab key, not the Spacebar. If you keep your filenames unique in the first three characters and use tab completion, you will save yourself an enormous amount of time and typing. [ls -l th [TAB] [BEEP] [TAB] [list] is [TAB] [RET]]
Every shell command has the idea that it's getting input from somewhere, processing it somehow, and putting the output somewhere. It also has the idea that error messages should be seen by a human. The names for these flows of data are “standard input”, “standard output”, and “standard error”. When you run a single shell command, all three data flows are connected to the “terminal”, which you remember is nowadays a mythical piece of hardware being emulated by the Terminal program. However, more powerful arrangements are possible. The output of one command can be plugged into the input of another command. This is called a “pipeline”, and the individual shell commands in a pipeline are called “filters”. This command prints out the names of all the files and directories that make up the bulk of Ubuntu, then lets them be read a screenful at a time [find /usr -print | more] How many files and directories are there in /usr? [find /usr -print | wc -l][Explain each part of this]
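To see a pipeline give a predictable answer, here is a sketch that counts files in a small scratch directory instead of /usr. The directory and file names are invented for the example, and `mktemp -d` is assumed to be available (it is on Ubuntu).

```shell
#!/bin/bash
# Build a small scratch directory so the counts are predictable.
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.R"

# find writes one path per line to standard output;
# wc -l reads those lines on standard input and counts them.
nfiles=$(find "$demo" -type f -print | wc -l)
echo "Files found: $nfiles"

# A three-stage pipeline: list the files, keep only names ending in .txt,
# then count what is left. grep is the filter in the middle.
ntxt=$(find "$demo" -type f -print | grep '\.txt$' | wc -l)
echo "Text files: $ntxt"

# Clean up the scratch directory.
rm -r "$demo"
```

Each stage knows nothing about its neighbors; it only reads standard input and writes standard output, which is why the same `wc -l` works on the end of either pipeline.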
Shell commands can do input and output to files, and the symbols to indicate this are less than and greater than, rather than the vertical bar or pipe. [more some-dir/unsorted ; sort < some-dir/unsorted ; sort < some-dir/unsorted > sorted ; more sorted] One implication of this is that you should never need to write a sort routine yourself. There's a great one on the system that handles all sorts of special cases gracefully, and the performance is excellent.
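The sketch below repeats the sort example on a scratch file, and also shows two relatives of `>` that the text above does not cover: `>>` appends to a file, and `2>` redirects standard error. The file names and sample words are made up for the illustration.

```shell
#!/bin/bash
work=$(mktemp -d)

# > creates the file, or overwrites it if it already exists.
printf 'pear\napple\nmango\n' > "$work/unsorted"

# < feeds the file to sort as standard input; > captures standard output.
sort < "$work/unsorted" > "$work/sorted"
first=$(head -n 1 "$work/sorted")
echo "First line after sorting: $first"

# >> appends instead of overwriting.
echo 'banana' >> "$work/unsorted"
nlines=$(wc -l < "$work/unsorted")
echo "Lines in unsorted file now: $nlines"

# 2> sends standard error to its own file, so error messages
# do not get mixed into the regular output.
ls "$work/no-such-file" > "$work/out" 2> "$work/err" || true
if [ -s "$work/err" ]; then err_captured=yes; else err_captured=no; fi
echo "Error message captured in file: $err_captured"

rm -r "$work"
```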
The individual directories and filenames in a pathname are separated by a forward slash (/). If you begin a filename with a character other than /, such as some-dir/unsorted, you're referring to the file in terms relative to your current working directory. This is known as a relative path name. On the other hand, if you begin a file name with a /, the system interprets this as a full path name -- that is, a path name that includes the entire path to the file, starting from the root directory, /. This is known as an absolute path name. Filenames on Linux are case sensitive.
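A quick way to feel the difference is to change directories and try both kinds of name. This sketch builds its own scratch copy of some-dir/unsorted; the variable names are invented for the example.

```shell
#!/bin/bash
base=$(mktemp -d)
mkdir "$base/some-dir"
echo hello > "$base/some-dir/unsorted"

cd "$base"
rel="some-dir/unsorted"         # relative: looked up from the current directory
abs="$PWD/some-dir/unsorted"    # absolute: begins with /, the root directory

cat "$rel"    # works, because we are standing in $base

cd /
cat "$abs"    # still works: an absolute name means the same thing everywhere

# The relative name now points at /some-dir/unsorted, which is
# (almost certainly) not there.
if [ -e "$rel" ]; then rel_found=yes; else rel_found=no; fi
echo "Relative name found from /: $rel_found"

rm -r "$base"
```

Scripts that change directories partway through are a common place for relative names to go wrong, so absolute names are the safer habit there.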
You can write down a series of shell commands into a file, instead of typing them in each time. This is called a “shell script”. Look out, we're about to start programming. [Each of you do this: emacs count-my-files &][Explain each part of this]
#!/bin/bash
find ~ -type f -print | wc -l
[chmod +x count-my-files ; ./count-my-files]
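A natural next step is to let the script take the directory to count as an argument. The sketch below is one possible variation, not part of the original exercise: it writes a hypothetical script named count-files into a scratch directory, makes it executable, and runs it. Inside the script, `$1` is the first argument, and `${1:-$HOME}` falls back to your home directory when no argument is given.

```shell
#!/bin/bash
dir=$(mktemp -d)

# Write the script file. In class you would type this into emacs instead.
cat > "$dir/count-files" << 'EOF'
#!/bin/bash
# Count the regular files under the directory named by the first argument,
# or under the home directory if no argument is given.
find "${1:-$HOME}" -type f -print | wc -l
EOF

# A new file is not executable until you say so.
chmod +x "$dir/count-files"

# Give it something to count: two empty files, plus the script itself.
touch "$dir/one" "$dir/two"
count=$("$dir/count-files" "$dir")
echo "Files under $dir: $count"

rm -r "$dir"
```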
We've reached the big idea:
Shell scripts are a computer programming language. Programming language interfaces are good for complicated or tedious operations, and for automating things that run without interacting with humans. The strongest aspect of Linux is that most applications are already Lego parts.
| Linux Command | Action |
|---|---|
| ls | what files are in this directory |
| cd | change directories |
| pwd | what directory am I in (also see prompt) |
| more, less | interactively read contents of file |
| ps auxww, top | what processes (programs) is the system running |
| kill | stop running processes |
| emacs | EMACS Makes All Computing Simple |
| rm | delete file or directory tree of files |
| grep | search contents of file |
| tail -f | watch the bottom of a growing file |
| find | search for files by name, date, etc. |
| cp | copy files |
| scp | copy files between machines |
| mkdir | make a directory |
| rmdir | delete an empty directory |
| mv | rename a file or move a directory |
| tar | pack a bunch of files into one file for transport |
| chmod | change permissions on files to make them shared or private |
| du | how much disk space is this using |
| diff, tkdiff | compare two text files and display only changes |
| gzip | compress file |
(This section adapted from http://www.yolinux.com/TUTORIALS/unix_for_dos_users.html)
| Windows Command | Linux Shell Command | Action |
|---|---|---|
| dir | ls -l (or use ls -lF) (-a all files) (df -k Space remaining on filesystem) | List directory contents |
| dir *.* /o-d | ls -tr | List directory contents by reverse time of modification/creation |
| dir *.* /v /os | ls -ls | List files and size |
| dir /s | ls -R | List directory/sub-directory contents recursively |
| dir /aa | ls -a | List hidden files |
| tree | ls -R | List directory recursively |
| cd | cd | Change directory |
| mkdir, md | mkdir | Make a new directory |
| rmdir, rd | rmdir | Remove a directory |
| chdir | pwd | Display directory location |
| del, erase | rm -iv | Remove a file |
| rmdir /S (NT), deltree (Win 95...) | rm -R | Remove all directories and files below given directory |
| copy | cp -piv | Copy a file |
| xcopy | cp -R | Copy all files of a directory recursively |
| rename or move | mv -iv | Rename/move a file |
| type | cat | Dump contents of a file to the user's screen |
| more | more | Pipe output a single page at a time |
| help or command /? | man | Online manuals |
| find, findstr | grep | Look for a word in files given in command line |
| comp | diff | Compare two files and show differences. Also see comm and cmp. |
| fc | diff | Compare two files and show differences. Also see comm and cmp. |
| echo text | echo text | Echo text to screen |
| date or time | date | Show date, set date with permissions |
| sort | sort | Sort data alphabetically/numerically |
| edit filename.txt | emacs | Edit a file. |
| print | lpr | Print a file |
| mem | free, top | Show free memory on system |
| tasklist (WIN2K, XP) | ps -aux, top | List executable name, process ID number and memory usage of active processes |
| chkdsk | du -s | Disk usage. |
| pkzip | tar and zip | Compress and uncompress files/directories. Use tar to create compilation of a directory before compressing. Linux also has compress, gzip |
Set Gnome text boxes, such as the Firefox URL input area, to accept emacs-style editing keys instead of Windows-style:
$ gconftool-2 --set /desktop/gnome/interface/gtk_key_theme Emacs --type string
This searches all the R programs you wrote for the word "median":
$ find ~ -type f -name "*.R" -print0 | xargs -0 grep -i "median"
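The `-print0` and `-0` pair is what makes this safe for filenames with spaces in them. Here is a sketch on a scratch directory with invented file names; it adds `-l` to grep so that only the names of matching files are printed.

```shell
#!/bin/bash
proj=$(mktemp -d)
printf 'x <- median(c(1, 2, 3))\n' > "$proj/my analysis.R"   # note the space in the name
printf 'y <- mean(c(1, 2, 3))\n'   > "$proj/other.R"
printf 'median notes\n'            > "$proj/notes.txt"       # not an R file, so never searched

# -print0 ends each filename with a NUL byte instead of a newline, and
# xargs -0 splits its input on NUL bytes, so "my analysis.R" reaches
# grep as one argument rather than two.
matches=$(find "$proj" -type f -name "*.R" -print0 | xargs -0 grep -il "median")
echo "Files mentioning median:"
echo "$matches"

rm -r "$proj"
```

Without the NUL-separated pair, xargs would split on whitespace and grep would be asked to read two files, "my" and "analysis.R", neither of which exists.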
Install packages in R:
#!/bin/bash -x
R --no-save << EOF
install.packages(c("akima"), repos="http://cran.us.r-project.org")
install.packages(c("assist"), repos="http://cran.us.r-project.org")
install.packages(c("brlr"), repos="http://cran.us.r-project.org")
# ...and so on for about 100 more packages...
quit()
EOF
This moves aside your old Gnome window manager dot files so that a new Gnome won't be confused by them:
#!/bin/bash -e
cd $HOME
FNAME=OLD-DOT-FILES.`date +'%Y-%m-%d_%H-%M-%S'`
mkdir $FNAME
find .[a-zA-Z]* -maxdepth 0 \( \! \( \
-name .ssh -o \
-name .bashrc -o \
-name .bash_profile -o \
-name .emacs -o \
-name .emacs.d -o \
-name .thunderbird -o \
-name .mozilla \) \) \
-print0 | xargs -0 --max-lines=1 --replace={} mv {} $FNAME
This example, for synchronizing files between multiple machines, demonstrates parsing command-line flags and arguments in detail.
#!/bin/bash -e
trap '
SAVEDSTATUS=$?
set +x
if [ x$CLEANEXIT = x ]
then
    echo "$0: ERROR Unexpected exit with return value of $SAVEDSTATUS"
    exit $SAVEDSTATUS
fi' EXIT
#-----
TEMP=`getopt -o wab -n "$0" -- "$@"`
if [ $? != 0 ]
then
    echo "$0: ERROR running getopt" >&2
    exit 1
fi
eval set -- "$TEMP"
WRITE=''
ADDONLY=''
while true
do
    case "$1" in
    -w)
        WRITE=1
        shift
        ;;
    -a)
        ADDONLY=1
        shift
        ;;
    -b)
        BINARY=1
        shift
        ;;
    --)
        shift
        break
        ;;
    *)
        echo "$0: ERROR parsing getopt" >&2
        exit 1
        ;;
    esac
done
#-----
SOURCEPATH="$1"
shift || true
DESTPATH="$1"
shift || true
DESTHOST="$1"
if [ -z "$SOURCEPATH" -o -z "$DESTPATH" -o -z "$DESTHOST" ]
then
    echo "Usage: $0 [-wab] source-dir dest-dir target-host [target-host [...]]" 1>&2
    CLEANEXIT=1
    exit 1
fi
if [ ! -e $SOURCEPATH ]
then
    echo "$0: ERROR source path $SOURCEPATH does not exist, quitting" 1>&2
    CLEANEXIT=1
    exit 1
fi
DISTOPTS="numchkgroup,numchkowner"
if [ ! -z "$BINARY" ]
then
    DISTOPTS="$DISTOPTS,compare"
fi
if [ -z "$WRITE" ]
then
    MSG="READ-ONLY"
    DISTOPTS="$DISTOPTS,verify,younger"
else
    MSG="WRITING"
fi
if [ -z "$ADDONLY" ]
then
    DISTOPTS="$DISTOPTS,remove"
fi
#-----
for DESTHOST
do
    if ping -c 2 -w 3 $DESTHOST > /dev/null 2>&1
    then
        echo "$MSG, HOST $DESTHOST"
        cat << EOF | \
            /depot/rdist-6.1.5/bin/rdist \
            -P /usr/bin/ssh \
            -p /depot/rdist-6.1.5/bin/rdistd \
            -f -
($SOURCEPATH) -> ($DESTHOST)
    install -o$DISTOPTS $DESTPATH ;
EOF
    else
        echo "ERROR target-host '$DESTHOST' not pingable, skipping" 1>&2
    fi
    echo "-----"
done
#-----
CLEANEXIT=1
exit 0
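If the flag parsing in the script above is hard to follow, it may help to see the same three flags handled in isolation. Note the swap: this sketch uses the bash builtin `getopts`, not the external `getopt` program the script above calls. The function name `parse_flags` and the array `ARGS` are invented for this illustration.

```shell
#!/bin/bash
# parse_flags is a hypothetical helper: it sets WRITE, ADDONLY, and BINARY
# from the -w, -a, and -b flags, and leaves the remaining arguments in ARGS.
parse_flags() {
    WRITE=''
    ADDONLY=''
    BINARY=''
    local OPTIND=1 opt    # reset getopts' position counter for this call
    while getopts "wab" opt
    do
        case "$opt" in
            w) WRITE=1 ;;
            a) ADDONLY=1 ;;
            b) BINARY=1 ;;
            *) echo "Usage: $0 [-wab] source-dir dest-dir target-host ..." >&2
               return 1 ;;
        esac
    done
    # OPTIND is the index of the first argument that is not a flag.
    shift $((OPTIND - 1))
    ARGS=("$@")
}

# Sample arguments, made up for the demonstration.
parse_flags -w -b /src /dest hostA hostB
echo "WRITE=$WRITE ADDONLY=$ADDONLY BINARY=$BINARY"
echo "Remaining arguments: ${ARGS[*]}"
```

The builtin handles only single-letter flags, which is all this script uses; the external `getopt` also offers long options on Linux.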
Here's a template file to use when writing shell scripts. The trap statement emits an error if a command fails. Without the trap and the -e, it is easy for the script to quit early without you realizing it.
#!/bin/bash -e
trap '
SAVEDSTATUS=$?
set +x
if [ x$CLEANEXIT = x ]
then
    echo "$0: ERROR Unexpected exit with return value of $SAVEDSTATUS"
    exit $SAVEDSTATUS
fi' EXIT
#-----
###
### Your code goes here
###
#-----
CLEANEXIT=1
exit 0
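To watch the template catch a failure, the sketch below fills in "your code" with a command that always fails (`false`), saves the result to a scratch file, and runs it in a child bash so the failure does not end the demonstration itself. The scratch file and the variables `output` and `status` are invented for the example.

```shell
#!/bin/bash
script=$(mktemp)

# The template, with a deliberately failing command in the middle.
cat > "$script" << 'EOF'
#!/bin/bash -e
trap '
SAVEDSTATUS=$?
set +x
if [ x$CLEANEXIT = x ]
then
    echo "$0: ERROR Unexpected exit with return value of $SAVEDSTATUS"
    exit $SAVEDSTATUS
fi' EXIT
#-----
echo "step 1"
false             # fails, so -e makes the script quit right here
echo "step 2"     # never reached
#-----
CLEANEXIT=1
exit 0
EOF

# Run it with -e, as the #! line would, and record what happened.
output=$(bash -e "$script") && status=0 || status=$?
echo "$output"
echo "Exit status seen by the caller: $status"

rm "$script"
```

The script never reaches CLEANEXIT=1, so the trap knows the exit was not the planned one and says so. Take the trap away and the script would stop after "step 1" without a word.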
Read man and info pages from the shell:
$ man ls
$ info ls
Read man and info pages in emacs. Handy if you want to refer to several of them at once, and keep your place in each of them:
M-x man RET ls RET
C-h i
Collection of Linux shell quick reference sheets: http://www.scottklarr.com/topic/115/linux-unix-cheat-sheets---the-ultimate-collection/.
His index of quick reference sheets is also good: http://www.scottklarr.com/topic/109/cheat-sheet-index/
How to run R in the background, with all the details right: http://www.stat.ufl.edu/system/man/R-background.html
Notes from a good shell scripting course that runs for three days, "Simple Shell Scripting for Scientists": http://www-uxsup.csx.cam.ac.uk/courses/ShellScriptingSci/
For more examples of scripting, see the GNU documentation on the core utilities, available online using:
$ info coreutils 'Opening the software toolbox'
Or here: http://www.stat.ufl.edu/system/software-toolbox.html
The Linux Documentation Project: http://tldp.org
1. If you haven't already, set up your personal web page using the instructions in http://www.stat.ufl.edu/system/homepage.shtml. In this web area, create a very simple web page for a course. Use emacs to edit all the html.
2. Write a shell script based on the "find" command that prints the names of files in your home directory which are larger than 1 megabyte.
3. Write a shell script based on the "find" command that deletes the temporary files created by running latex. These files end in the extensions .aux, .dvi, and .log.
4. Write a shell script named "run-simulation", which could serve as a framework around future simulations you write in R. This script takes one parameter, which gives the number of runs to make. Each time the script runs, it creates a new directory to put its output in, with the name based on the current time, in the format "YYYY-MM-DD-HH-MM-SS". Each run's output goes in a file in that directory, named by the run number. When the script finishes, it sends you a notice by email. Your solution should have all of the features illustrated below:
$ ./run-simulation 5
Created directory 2008-08-20-10-45-03
Running run 1
Running run 2
Running run 3
Running run 4
Running run 5
Simulations ended.
$ ls 2008-08-20-10-45-03
1 2 3 4 5
$ more 2008-08-20-10-45-03/3
Started run for argument 3
3 times 3 is 9.
Ended run for argument 3
$
When you do run R in a long-running simulation, note that you can save R objects to files from inside the middle of your program with the "save" command. Then, you can inspect those objects in files with another copy of R, to get an idea of how your simulation is progressing. It is very likely you will catch mistakes without waiting for the whole simulation run to complete.
(C) University of Florida, Gainesville, FL 32611; (352) 392-1941. This page was last updated Fri Sep 5 13:46:41 EDT 2008.