http://www.stat.ufl.edu

Introduction to Computing at UF Statistics

Intended audience

The audience for this document is the new graduate student or faculty member in Statistics. It is assumed you have been using Microsoft Windows to send email, browse the web, and write papers at this or another University for the last several years, and you want to know what additional software and services the Statistics department and UF have to offer. It is also assumed you do not have hands-on experience with Linux, and you wonder why we use it instead of Windows.

Phishing

Is this email real, or is it attempted fraud? How can you tell?

Hover your mouse over the link and look at the status line at the bottom of the window:

Does that URL point to somewhere authentic-looking within UF? Who is "megabyet.net"?

And how about this one:

Hover your mouse over the link and look at the status line at the bottom of the window. Does that URL point to Citibank?

If you get what seems to be a surprise email from your bank, TELEPHONE your bank to ask about it! Use a phone number you already trust, such as the phone number on the back of your ATM card or credit card.

Attitudes and techniques for not biting on a phishing hook

More techniques:
Current active recognized phishes on campus:
You gave out your financial info to a phisher, now what?

Central computing groups on campus

For those of you new to UF, I want to mention a few of the other large central computing groups on campus before I talk about Statistics resources. Some of these you will be working with daily, and some you may never directly talk to. All of them are important.

Academic Technology (AT; formerly OIR, CIRCA)

The group whose name was formerly CIRCA (Center for Instructional, Research, and Computing Activities) merged with the group whose name was formerly OIR (Office of Instructional Resources), to form the group named AT (Academic Technology). Currently they run several Windows-based computing labs, the undergraduate computing helpdesk, the Gatorlink helpdesk, the first-level PeopleSoft helpdesk, and a bunch of other things. If you are new to UF you will need to visit them on the 5th floor of the CSE (Computer Science and Engineering) building to receive a Gatorlink identification card. You should do this in your first week. You have already received your Gatorlink account in the process of applying to UF.

Computing and Network Services (CNS; formerly NERDC)

The group whose name was formerly NERDC (Northeast Regional Data Center) has been renamed CNS (Computing and Network Services). They maintain the network plumbing that brings the Internet to every building, shed, and doghouse on campus, and run Gatorlink mail and dialup. They probe every Windows machine they can find for viruses from time to time, and this includes personal laptops attached by wireless. You may never interact with them directly unless your personal Windows machine contracts a particularly malevolent virus.

Bridges / PeopleSoft (formerly ERP)

The group whose name was formerly ERP (Enterprise Resource Planning), and is now named Bridges, runs the ERP software named PeopleSoft on several IBM Unix mainframes. PeopleSoft has replaced the financial functions of general ledger and payroll that were formerly distributed between the CNS mainframe and the State of Florida mainframe. It appears that the registrar function will remain on the IBM CICS mainframe. You will be interacting with PeopleSoft for time reporting for some appointments.

Central computing services on campus

Gatorlink Email

Gatorlink email is a web-based email service with about 50,000 users. Repeated unreasonable blacklisting of Gatorlink by AOL for a small percentage of spam has caused the University to ask that your UF email address of record, the one listed in directory.ufl.edu, not be forwarded off-campus. Gatorlink is a legitimate final destination.

Wireless

There is wireless service inside of Griffin-Floyd, and all over campus. The general access networks are all named “uf”; just connect to the one with the highest signal strength. Open a browser and read any web page. You will be redirected to a login screen that accepts your Gatorlink login to activate the port. Make sure you know the name of the wireless network you're trying to use. If you're inside Griffin-Floyd, you may receive the uf wireless from the plaza next door, whose signal is too weak to be reliable, or the Chemistry one named Chemnet, which you don't have logins for.

Walk-up Internet ports

There are some ethernet ports, mostly in AT lab spaces, that are available for campus community use. Plug in and open a browser the same way as wireless.

Personal laptops, viruses, and network connections

In the past, we've found that 5 of the 25 laptops newly introduced to the Statistics network in a year were caught breaking into other machines within a week, even after passing an initial virus scan. Therefore, personal laptops may now only be connected via the wireless service, which is better equipped to identify and isolate infections. The McAfee virus scanner is site-licensed to UF for both school and home use. If you receive a paycheck from UF, you are also site-licensed to use any version of Windows and any version of MS Office. If you are using old versions, please don't hesitate to upgrade to a newer version that receives more comprehensive security patches. McAfee is available through http://software.ufl.edu, and you can purchase media for Windows and Office versions at the bookstore in the Reitz Union.

Email

Your email address is yourlogin@stat.ufl.edu, and that is where official departmental notices will be sent. We don't mind if you forward your email elsewhere, as long as you read it! All incoming messages are eventually accepted, but messages deemed to be spam are filtered into the folders Spam, Virus, or Banned. Messages in these folders older than two weeks are automatically deleted. You can ignore these folders until you think you may be missing something. We believe we get extremely few false positives on our spam filtering.

You can access your Statistics department mail several ways. The majority of Statistics users read their mail through Thunderbird, either from the lab or office systems, their own laptop, or their own machine at home. Users on the road often read their mail through http://webmail.stat.ufl.edu. Instructions to set up your own Thunderbird or Mozilla to interact with our mail server are at http://www.stat.ufl.edu/system/imap.bb.shtml.

Personal web page

Instructions for setting up a personal web page are at http://www.stat.ufl.edu/system/homepage.shtml. One issue that comes up occasionally with TAs: for privacy reasons you must never publish Social Security numbers or UFID numbers on the web. I know it's convenient to return test results that way, but you must not do it.

How is Linux different from Windows

At the moment you're probably using Windows, and it works for you, so what's the big deal with Linux? Below are some computing concepts which you probably haven't been exposed to in the Windows world. My goal is to expand and improve your computer experience.

Reliability and security

For practical purposes, Linux never catches viruses, and doesn't spend most of its lifetime being twitchy and slow. Once Linux is customized to the quirks of particular hardware, it will run well on it indefinitely. You don't need to reload Linux every six months to get your performance back. Also, it is common to stay logged into a Linux machine for months at a time. This is a “desktop” experience much like a physical desk. You can interrupt work in progress and expect to be able to go back to it later. The sight of the arrangement you left it in will jog your memory of what you were doing. If you're doing this on a laptop, your suspend and restore feature must be rock solid.

On all the time

Linux is on all the time, and as long as it's on, it may as well be doing work for you. What sort of computations could you do with 35 machines that are on all the time? Linux is multiuser, so your use of a machine doesn't prevent anyone else from using it, as long as you exercise good taste about resource consumption.
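
One way to exercise that good taste, as a minimal sketch: start long computations at the lowest scheduling priority, so that people working interactively on the same machine are served first. The helper name run_politely and the log file are made up for illustration. nice and nohup are standard Unix utilities, but whether the department expects anything beyond this is an assumption not covered here.

```shell
#!/bin/sh
# run_politely LOGFILE COMMAND [ARGS...]   (hypothetical helper name)
# Start COMMAND in the background at the lowest scheduling priority,
# immune to hangups, with all of its output collected in LOGFILE.
run_politely() {
    log=$1
    shift
    # nice -n 19 : lowest priority, so interactive users come first
    # nohup      : keep running after you log out
    nohup nice -n 19 "$@" < /dev/null > "$log" 2>&1 &
    echo $!                      # process id of the background job
}
```

You would call it with a log file followed by whatever long-running command you have, then log out and check the log the next morning.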

Easy programming: the shell

Windows generally has exactly one way to control a program: the GUI (Graphical User Interface). By contrast, Linux usually has one, the shell command line, and occasionally a second, a GUI. For instance, the shell command wget can copy an entire web tree in a single operation, which is complementary to the interactive work you can do in a browser. Linux programs are designed as Lego building blocks that you snap together. Programming Linux starts off easy, with scripting. You put a couple of commands with long arguments in a file, so you don't have to keep typing them in. Perhaps these commands sync portions of your laptop disk to your Statistics account for backup purposes. Then you add a couple more commands, perhaps to keep dated backup versions. Soon you're outside of the range of what you could achieve on Windows. The shell is what lets you plug together 2,400 little building-block programs. You will soon know the names of 50 of them corresponding to things like dir and copy from Windows, and every program you can use as a “command” is one you can script. The Linux statistical computing environments, such as R, leverage rather than attempt to replace these tools.
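
As a rough sketch of the kind of backup script described above: the function below copies a directory into a subdirectory named for today's date, so each day's run leaves its own dated version. The name dated_backup and the directory layout are made up for illustration. It copies between local directories only; a real script syncing a laptop to a Statistics account would need a network copy tool, which is not shown here.

```shell
#!/bin/sh
# dated_backup SRC DEST_ROOT   (hypothetical helper name)
# Copy the directory SRC into DEST_ROOT/YYYY-MM-DD/, leaving one
# dated copy per day. Prints the dated directory it wrote to.
dated_backup() {
    src=$1
    dest_root=$2
    stamp=$(date +%Y-%m-%d)              # today's date, e.g. 2010-08-19
    mkdir -p "$dest_root/$stamp" || return 1
    # -R copies recursively, -p preserves times and permissions
    cp -Rp "$src" "$dest_root/$stamp/" || return 1
    echo "$dest_root/$stamp"
}
```

Saved in a file, this is two commands with long arguments that you never have to retype; a call such as dated_backup "$HOME/thesis" "$HOME/backups" (both paths are placeholders) is then all you type.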

Feature stability

I have been using essentially the same editor (emacs) and email program (mh-e) for 20 years. That's before the web, and five years before Microsoft Windows 3.0. Imagine the payback on my investment in learning those programs. How many different Windows editors and email programs have you had to learn in that time?

X-Windows remote display

Linux has the concept that where you run a program can be different from where you interact with the program. You can run your big crunchy R program on the meaty server in the closet, but you don't have to physically sit in the closet in front of the console. Instead, you can send the GUI back to your desk.
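
One common way to do this, sketched under assumptions: ssh can forward X11 connections when given the -X option, so a program started on the remote machine opens its windows on your local screen. The wrapper name remote_gui and any host name you pass it are hypothetical, and this document does not say which servers permit X11 forwarding. Setting DRY_RUN=1 only prints the command, which is handy for checking what would be run.

```shell
#!/bin/sh
# remote_gui HOST PROGRAM [ARGS...]   (hypothetical helper name)
# Run PROGRAM on HOST with its windows displayed on the local screen.
# ssh -X asks ssh to forward X11 connections back to this machine.
# With DRY_RUN=1 the command is printed instead of executed.
remote_gui() {
    host=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo ssh -X "$host" "$@"
    else
        ssh -X "$host" "$@"
    fi
}
```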

Mathematical typesetting

The world standard for mathematical typesetting is the TeX (pronounced “tech”, spelled “tex”) family of programs. Of these the frontrunner is LaTeX (pronounced “la-tech”, spelled “latex”). With improvements to latex in the last decade you are not limited to the classic latex style. Like HTML, Latex is an example of the “markup” style of typesetting systems. In a markup system, the text of the document is interspersed with declarations about what purposes the pieces of text serve in the document, and to a lesser extent what the text should look like.

Latex has been improved and simplified quite a bit in the last decade. There's a whole lot of latex code floating around that is more cumbersome than it needs to be now, so latex is an exception to the general rule of programming that you should learn by copying code from other programs. Save yourself hassle and write your latex from scratch with a good reference book like Kopka. However, there is dedicated University support for writing your thesis in latex, which you should not do from scratch. Ask others when you are ready.

Word Processing

Word processing for small things with no mathematics, like a letter to a potential employer, can be done perfectly well in latex. However, if you're doing something that's mostly graphic design, like a sign, you might find one of the Adobe Illustrator or Visio clones a better match for the problem. Try out inkscape, dia, and xfig. For the Adobe Photoshop kind of bitmap editing, use gimp. If you're graphing data, use R.

Latex example

A letter on the departmental letterhead looks like this:

\documentclass[12pt]{letter}

\usepackage{epsf}
\usepackage{myltr}
\usepackage{times}

\begin{document}

\date{August 19, 2010}

\signature{My Name}
\address{}

\begin{letter}{}

\opening{Dear Mom,}

Word processing in Latex was easier than I had feared.  Now I can do
all of my professional work in latex on Linux!

\closing{Love,}
\end{letter}

\end{document}

This letter depends on the file myltr.sty, wherein I have customized my name and address into the letterhead. However, if you remove the \usepackage{myltr} line, it will work standalone, sans letterhead.

Microsoft Office compatibility

Microsoft Office files can be read by Open Office. As of August 2010, Open Office's compatibility is good enough to read Office documents, but not good enough to reliably collaborate with Office users. The shell command to start up the Open Office suite is ooffice, or oocalc, oodraw, oowriter, and ooimpress (PowerPoint clone) for the individual pieces.

Statistical Software

R

R is a free software clone of S. It has attracted a great deal of free software development interest, and it is a serious program on which people are building professional careers. We recommend it for the open source benefits of longevity and control. It runs nicely inside emacs. If you implement your statistical research results in R and publicize them, it is much more likely that people will use them.

Example of R

The shell command is R. A session looks like this:

bb@mako:~$ R

R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> x <- rnorm(50)
> y <- 2*x+rnorm(50, sd=0.4)
> plot(x,y)
> q()
Save workspace image? [y/n/c]: n
bb@mako:~$

After the plot command, a window appears showing the plot. The “Save workspace image?” question wants to save all the variables and functions you've created within R's memory into a file, so that you can read them all back in and pick up where you left off in a later R session. Most R users I've spoken to tend not to use saved environments, instead preferring to keep their data and functions in ordinary text files where they can “put their hands on them”.
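
The text-file habit can be sketched as follows: the same three R commands from the session go into an ordinary file, which is then run without an interactive session. The file name sim.R and the scratch directory are arbitrary choices for illustration. Rscript ships with R, and the sketch simply skips the run if R is not installed on the machine at hand.

```shell
#!/bin/sh
# Keep the R commands in an ordinary text file (the name sim.R is arbitrary).
workdir=$(mktemp -d)
cat > "$workdir/sim.R" <<'EOF'
x <- rnorm(50)
y <- 2*x + rnorm(50, sd=0.4)
plot(x, y)    # when run non-interactively, the plot goes to Rplots.pdf
EOF

# Run the file without an interactive session, if R is installed here.
if command -v Rscript >/dev/null 2>&1; then
    (cd "$workdir" && Rscript sim.R)
fi
```

Because the commands live in a plain file, you can edit them in emacs, rerun them at will, and never need to answer the workspace question.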

Specific commands

Windows equivalents

Windows   Unix
dir       ls
cd        cd
copy      cp
rename    mv
del       rm
pwd       pwd
more      more
print     lpr
help      man
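
A short sketch of a few of these commands in use. The file names are made up, and everything happens in a throwaway directory so nothing real is touched.

```shell
#!/bin/sh
# Work in a scratch directory created just for this demonstration.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

pwd                               # show the current directory
echo "draft one" > paper.txt      # create a small file to play with
cp paper.txt paper.bak            # copy   (Windows: copy)
mv paper.bak paper-old.txt        # rename (Windows: rename)
ls                                # list   (Windows: dir)
rm paper-old.txt                  # delete (Windows: del)
listing=$(ls)                     # only paper.txt is left at this point
```

For any of these commands, man followed by the command name (for instance man cp) shows its manual page.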

File viewers

PDF          xpdf, acroread
PostScript   gv
GIF, JPEG    xloadimage, gimp


(C) University of Florida, Gainesville, FL 32611; (352) 392-1941.
This page was last updated Tue Sep 25 01:14:34 EDT 2012
http://www.ufl.edu