Assignments for Lesson 6

1. Refer to the HOCKEY data. Write a SAS program which calculates the number of games won, lost, and tied up to and including the current observation. Print the dataset with an appropriate format for the date. Don't forget to change the score of the final game to Boston College 5, Ohio State 2. The first few lines of output should be similar to this:



DATE     TEAM      CITY     STATE    OSU OPP W L T
10/10/97 Toronto   Columbus Ohio     5   0   1 0 0
10/18/97 Miami     Oxford   Ohio     0   3   1 1 0
10/24/97 Merrimack Columbus Ohio     2   7   1 2 0
10/26/97 Merrimack Columbus Ohio     5   3   2 2 0
10/31/97 Clarkson  Potsdam  New York 1   1   2 2 1
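
One way to keep running totals is a sum statement, which retains its variable from one observation to the next. The sketch below is only an illustration: it reads three of the sample lines above as inline data with a made-up layout (date and the two scores only), not the real HOCKEY file.

```sas
/* Hypothetical toy layout: date, OSU score, opponent score. */
data games;
   input date :mmddyy8. osu opp;
   /* A sum statement (var + expr) retains var and starts it at 0. */
   /* Each comparison evaluates to 1 when true and 0 when false.   */
   w + (osu > opp);
   l + (osu < opp);
   t + (osu = opp);
   format date mmddyy8.;
   datalines;
10/10/97 5 0
10/18/97 0 3
10/31/97 1 1
;
run;

proc print data=games noobs;
run;
```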

2. Refer to the RYAN data. Many of the ratings are missing. Suppose that a statistician wants to perform a procedure which does not allow missing values. He decides that one acceptable way to do the analysis is to replace each missing rating with the neutral rating (5). Write a SAS program which uses an array to replace all of the missing values for movie ratings with scores of five. Print the corrected dataset.
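
A minimal sketch of the array technique, assuming a made-up layout with a name and four ratings (the variable names and values are hypothetical, not the RYAN data):

```sas
data ratings_fixed;
   input name $ r1-r4;
   array rate{4} r1-r4;           /* group the rating variables */
   do i = 1 to dim(rate);
      /* replace a missing rating with the neutral value 5 */
      if missing(rate{i}) then rate{i} = 5;
   end;
   drop i;
   datalines;
A 7 . 3 .
B . 9 . 2
;
run;

proc print data=ratings_fixed noobs;
run;
```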

3. Suppose that your fifth-grader is learning how to write Roman numerals, and you want to help her or him by preparing a study guide. Write a SAS program which uses a DO loop to print the Arabic numbers 1, 2, 3, ..., 49, 50 AND their Roman equivalents. The ROMAN7. format in SAS will be helpful.
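
As a hint on how the ROMANw. format can be combined with a DO loop, the following sketch writes a handful of numbers to the log. The short range and the variable names are chosen only for illustration.

```sas
data _null_;
   do n = 1 to 5;
      /* PUT function applies the ROMAN7. format and returns text */
      roman = put(n, roman7.);
      put n 3. +2 roman;          /* writes one line per number to the log */
   end;
run;
```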

4. Suppose that you work with many data files containing observations collected over time, such as stock market prices or batting averages in baseball, and you would find it convenient to have a program that prints the most recent observations in those datasets. Write a SAS macro in which the user supplies the name of a SAS dataset that contains a date variable called DATE. The macro should then print only the 10 most recent observations, that is, the 10 observations with the largest values of DATE. In the printout, use an appropriate format for the date. Demonstrate that your macro works by creating SAS datasets for the HOCKEY and CLINTON data, with DATE as the date variable in each dataset, and then applying the macro to those two datasets.
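
One possible shape for such a macro is sketched below. The macro name, the parameter names DS and N, the work dataset _LATEST, and the toy data are all hypothetical; sorting in descending order and limiting the printout with OBS= is just one approach.

```sas
%macro print_latest(ds, n=10);
   /* sort a copy so the newest dates come first */
   proc sort data=&ds out=_latest;
      by descending date;
   run;

   /* OBS= stops the printout after the first &n observations */
   proc print data=_latest(obs=&n) noobs;
      format date mmddyy10.;
   run;
%mend print_latest;

/* Toy dataset (made-up values) to exercise the macro */
data toy;
   input date :mmddyy10. value;
   datalines;
01/05/1998 60
03/01/1998 58
02/01/1998 63
;
run;

%print_latest(toy, n=2)
```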

5. Refer to the DOGS3 data. Write a SAS program which creates a dataset using an INFILE statement. Then, create a new dataset which contains three variables: the name of the dog, the week of measurement, and the eosinophil count in that week. There should be 75 observations in the new dataset. Print both datasets.
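
The reshaping step can be sketched with an array and an explicit OUTPUT statement. This example uses inline data with three hypothetical weekly counts per dog instead of an INFILE statement, and every name and value in it is made up.

```sas
data wide;
   input dog $ eos1-eos3;
   datalines;
Rex 10 12 9
Fido 7 8 11
;
run;

data long;
   set wide;
   array e{3} eos1-eos3;
   do week = 1 to dim(e);
      count = e{week};
      output;                     /* one observation per dog per week */
   end;
   keep dog week count;
run;

proc print data=long noobs;
run;
```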

6. Refer to the GRADES data. Suppose that a total score of 60 is needed to earn a passing grade, and the instructor wants to know how many assignments each student turned in before achieving a total of at least 60 points. Write a SAS program which uses an array to calculate the minimum number of weeks needed for each student to earn 60 or more points. For example, a student who earned an 8 on each assignment would achieve at least 60 points in Week 8 (total=56 after 7 weeks, total=64 after 8 weeks). Print a listing of the students' identification numbers and the numbers of weeks needed to achieve a passing score. The first few lines of output should look similar to this:



STUDENT    WEEKS
  1105        9
  1294        9
  2009       10
  2341       10
  2354        9
  2761        9
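
The worked example in the problem (a student who earns 8 on every assignment) can be traced with a sketch like the one below. It assumes ten scores per student in a made-up layout; the student numbers and the second student's scores are hypothetical.

```sas
data weeks_needed;
   input student s1-s10;
   array s{10} s1-s10;
   total = 0;
   weeks = .;
   do i = 1 to dim(s);
      total = sum(total, s{i});   /* SUM ignores missing scores */
      /* record only the first week at which the total reaches 60 */
      if total >= 60 and weeks = . then weeks = i;
   end;
   keep student weeks;
   datalines;
1 8 8 8 8 8 8 8 8 8 8
2 10 10 10 10 10 10 10 10 10 10
;
run;

proc print data=weeks_needed noobs;
run;
```

For the first toy student the total is 56 after 7 weeks and 64 after 8, so WEEKS is 8, matching the example in the problem statement.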

7. Refer to the CLINTON data. Write a SAS program which reads the data. Using only the polls taken in the year 1998, create a new variable which indicates whether the percentage of people approving of the President's performance increased, decreased, or stayed the same since the previous poll. Also, create another variable which gives the number of days elapsed from the previous poll to the current one. Print the dataset with the new variables.

Why would we want to do this? A statistician might suspect that the timing of the polls is dictated by current events and their effects on the President's popularity. We might want to see if the polls are conducted more frequently in times of crisis (such as the Oklahoma City bombing or the release of the Lewinsky videotape) in which the President plays a major role. This problem shows one way to provide data to investigate this idea.
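
Comparisons with the previous observation, as in Problem 7, are often written with the DIF (or LAG) function. The sketch below assumes the polls are already sorted by date and uses made-up dates, percentages, and variable names rather than the CLINTON file.

```sas
data polls;
   input date :mmddyy10. approve;
   format date mmddyy10.;
   datalines;
01/05/1998 60
01/20/1998 63
02/01/1998 63
02/15/1998 58
;
run;

data changes;
   set polls;
   where year(date) = 1998;       /* keep only the 1998 polls */
   length change $ 9;
   /* DIF(x) = x - LAG(x); call it on every observation, never inside IF */
   days  = dif(date);
   delta = dif(approve);
   if delta = . then change = ' ';          /* first poll has no predecessor */
   else if delta > 0 then change = 'Increased';
   else if delta < 0 then change = 'Decreased';
   else change = 'Same';
   drop delta;
run;

proc print data=changes noobs;
run;
```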

8. Refer to the AIRPORTS data and create a SAS dataset from the text file. Then, create a new dataset with two observations and 21 variables. The two observations should correspond to the two years 1985 and 1995. One variable should indicate the year. The other 20 variables should be named CITY1 through CITY20 to represent the passenger totals at the 20 different airports represented in the data. Print both datasets.
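
If the passenger totals are stored with one observation per airport per year, PROC TRANSPOSE is one way to obtain one observation per year. The layout, names, and numbers below are hypothetical; the real AIRPORTS file may be arranged differently, in which case an array with OUTPUT statements may be more convenient.

```sas
data long;
   input airport $ year passengers;
   datalines;
AAA 1985 100
AAA 1995 150
BBB 1985 200
BBB 1995 260
;
run;

proc sort data=long;
   by year airport;
run;

/* PREFIX= names the new columns CITY1, CITY2, ... within each year */
proc transpose data=long out=wide(drop=_name_) prefix=city;
   by year;
   var passengers;
run;

proc print data=wide noobs;
run;
```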

