The SAS material can be found at http://www.stat.ufl.edu/~yang/STA5106/

The Department of Statistics website has online SAS manuals (UF viewing only) at

http://www.stat.ufl.edu/system/man/

The material in this lesson can be found in SAS Language Reference: Concepts, in the sections Rules for Words and Names, Data Set Options, and SAS Output.

Lesson 1: Input Data by Hand

SAS is a comprehensive system which integrates utilities for storing, modifying, analyzing, and graphing data. Learning to write SAS programs is difficult at first, but doing so will enable you to perform the specific tasks which are most appropriate for your data. Unlike other packages which are easier to use, SAS offers the flexibility for you to make necessary modifications to standard analytic techniques or to select from several possible ways to analyze the same data.

In this lesson, you will write a simple SAS program in which data will be entered into the computer and printed. The instructions below are appropriate for most SAS versions. On some computers the SAS statements remain the same, but there will be some differences in the input/output or operational instructions.

If you use PC SAS, refer to "Get started with PC SAS" on the title page.

The following example will show you how to enter data into SAS and obtain a printout. The data were obtained from the Citrus Production Forecast Page provided by the Florida Agricultural Statistics Service. The data represent October 1997 estimates of yields of early-season oranges (e.g., navel) and late-season oranges (e.g., Valencia) in four U.S. states. Crop estimates are in units of millions of boxes. The SAS dataset will have three variables (state, early-season yield estimate, and late-season yield estimate) and four observations, where each observation corresponds to one state.

Write your SAS program in any text editor. Here is a very simple one:

DATA oranges;
INPUT state $ 1-10 early 12-14 late 16-18;
DATALINES;
Florida    130  90
California  37  26
Texas      1.3 .15
Arizona    .65 .85
;
PROC PRINT DATA=oranges;
RUN;

Let us start with the first line of the program:

DATA oranges;
The word DATA tells SAS that you are going to provide data in the steps that follow. SAS has two main categories of operations. Data steps are used to input and modify datasets. Procedure steps, or PROCs, perform operations on the data.

The word oranges gives a name to the dataset that will be created. Any name can be assigned to a dataset, as long as it follows the SAS naming rules: it must begin with a letter or an underscore, contain only letters, digits, and underscores, and be no longer than 8 characters in SAS Version 6 (later versions allow up to 32).

Finally, the statement ends with a semicolon. This tells SAS that the current instruction has ended and to proceed with the next instruction. With a few exceptions, semicolons must be placed at the end of each statement in SAS. Many errors in SAS are caused by forgetting the semicolon.
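
Because the semicolon, not the end of the line, terminates a statement, a statement may be spread over several lines. The following is a minimal sketch of the same program written that way. The particular line breaks and indentation are only illustrative and are not part of the original lesson.

```sas
* The semicolon, not the end of the line, terminates each statement;
DATA
   oranges;
INPUT state $ 1-10
      early   12-14
      late    16-18;
DATALINES;
Florida    130  90
California  37  26
Texas      1.3 .15
Arizona    .65 .85
;
RUN;
```

The data lines themselves are not free-format here: with column input, each value must stay in the columns named in the INPUT statement.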

In the next line, add the following:

INPUT state $ 1-10 early 12-14 late 16-18;
Three variables are named in this statement: STATE, EARLY, and LATE. The rules for naming variables are the same as the rules for naming datasets. The numbers give the column positions in which each variable's value appears on a data line. For example, the data for STATE begin in column 1, the leftmost column, and use up to 10 characters. The $ specifies that STATE should be treated as a character variable rather than a numeric variable. By default, SAS assumes that all variables are numeric.
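
To see how the column ranges line up with a data line, it can help to draw a column ruler above the data. The sketch below puts such a ruler in a comment and then reads a single data line. The dataset name example and the ruler are illustrative additions, not part of the original program.

```sas
/* Column ruler (tens digit above, units digit below):
            111111111
   123456789012345678
   Florida    130  90
   |--state-| |e| |l|      e = early (12-14), l = late (16-18)
*/
DATA example;
   INPUT state $ 1-10 early 12-14 late 16-18;
   DATALINES;
Florida    130  90
;
RUN;
```

Here "Florida" occupies columns 1-7 of the 10 columns reserved for STATE, "130" fills columns 12-14, and "90" sits in columns 17-18 of the range reserved for LATE.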

The next line is:

DATALINES;
This indicates that the actual data will appear next. In most SAS references, the statement
CARDS;
is used instead. This originally referred to the punch cards that were used to input data in the early days of computer programming. The two statements are equivalent, and SAS currently accepts both.
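
As a sketch of that equivalence, the same kind of DATA step can be written with CARDS in place of DATALINES. The dataset name oranges2 is hypothetical, and only two of the four data lines are repeated to keep the example short.

```sas
DATA oranges2;
   INPUT state $ 1-10 early 12-14 late 16-18;
   CARDS;
Florida    130  90
California  37  26
;
RUN;
```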

Next, enter the data. An optional single semicolon may be placed on a line by itself after the last line of data.

Florida    130  90
California  37  26
Texas      1.3 .15
Arizona    .65 .85
;
Now, add the instructions for SAS to print the data. One of the procedures in SAS, PROC PRINT, can be used to provide a simple printout of the data.
PROC PRINT DATA=oranges;
Finally, end with a RUN statement. This tells SAS to process the data and provide the information you have requested.
RUN;
The SAS program you have written in the Program Editor window (or in any text editor) is now complete, as shown earlier:
DATA oranges;
INPUT state $ 1-10 early 12-14 late 16-18;
DATALINES;
Florida    130  90
California  37  26
Texas      1.3 .15
Arizona    .65 .85
;
PROC PRINT DATA=oranges;
RUN;
Save the program in a file whose name ends in .sas. For example, we name this program first.sas. On the Department of Statistics computer system, open an xterm window and type

> sas first.sas

or

> sas first

SAS runs the program and writes two files: first.log and first.lst. The log file contains the notes, warnings, and error messages that SAS issues, and the list file contains the output. The list file may not be created if the program contains errors.
The log file looks like the following.


NOTE: Copyright (c) 1989-1996 by SAS Institute Inc., Cary, NC,
USA.
NOTE: SAS (r) Proprietary Software Release 6.12  TS020
      Licensed to UNIVERSITY OF FLORIDA, Site 0009337001.
1   DATA oranges;
2   INPUT state $ 1-10 early 12-14 late 16-18;
3   DATALINES;
NOTE: The data set WORK.ORANGES has 4 observations and 3
variables.
NOTE: The DATA statement used 0.81 seconds.
8   ;
9   PROC PRINT DATA=oranges;
10  RUN;
NOTE: The PROCEDURE PRINT used 0.05 seconds.
The list file should appear as follows.

               OBS    STATE          EARLY     LATE

                1     Florida       130.00    90.00
                2     California     37.00    26.00
                3     Texas           1.30     0.15
                4     Arizona         0.65     0.85

In this listing, SAS added OBS, the observation number. SAS also aligned the decimal points on the two numeric variables.
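
If the OBS column or some of the variables are not wanted, PROC PRINT can be told to leave them out. The following is a minimal sketch, assuming the oranges dataset from the program above has already been created in the same SAS session. The NOOBS option and the VAR statement are standard parts of PROC PRINT but are not covered in the original lesson.

```sas
PROC PRINT DATA=oranges NOOBS;   /* NOOBS suppresses the OBS column */
   VAR state early;              /* print only these variables, in this order */
RUN;
```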

If your program did not work correctly, use the messages in the log to trace your mistakes. Also, if you see an "ERROR" message in the log, pay attention to it even when the output seems correct.

In this lesson, you learned the mechanics of writing and running a simple SAS program. Future lessons will show you how to manipulate and analyze data.


Homework problems for this lesson
