SAS Manual: Department of Statistics Webpage (www.stat.ufl.edu) => Computing Environment => Online Software Documentation => SAS manuals (UF viewing only) => SAS Macro Language: Reference

Lesson 13: SAS Macro Language

SAS macros are subroutines that package SAS code so that other people can use it conveniently. A SAS macro is easy to write when the program consists mainly of SAS procedures, but it is hard to write when the program needs complicated computations that cannot be expressed by SAS procedures. In that case, SAS IML is much more appropriate. Let us start with the following simple example. Each record gives a city, the state the city is in, its area in square miles, and its population in 1980 and 1990 in thousands.

Example 1

 
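%MACRO density(in=,p1=,p2=,del=, what=,out=);

/* Copy &in to &out, dropping rows where variable &del equals &what */
DATA &out;SET &in;
IF &del=&what THEN delete;
density=p90/area;          /* 1990 population (thousands) per square mile */
z1=&p1+&p2;z2=&p1*area;
RUN;

%MEND density;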

DATA n1;INPUT city $ state $ area p80 p90 @@;
CARDS;
Akron OH 62.1 237 223  Atlanta GA 131 425 394
Dallas TX 333 905 1000 Denver CO 107 493 467
NY NY 302 7072 7322    Memphis TN 264 646 610
Tampa FL 58.7 82 125  Raleigh NC 83.4 150 208
;

%density(in=n1,p1=2,p2=4.5,del=state,what='TX',out=n2);

DATA n3;SET n2;
PROC PRINT NOOBS;VAR city p90 area density Z1 Z2;
TITLE 'Output from MACRO density';
RUN;

In this program, density is the name of the SAS macro. A macro argument can be a data set, a name, or a number. In the call above, in=n1 is a data set, p1 and p2 are two numbers, del and what are names, and out is another data set. Note the quoting rule: the macro processor substitutes each argument into the code as plain text, so a variable name is passed without quotes (del=state), while a character value must carry its own quotes (what='TX'). The IF statement therefore becomes IF state='TX' THEN delete. The macro starts with %MACRO and ends with %MEND. Inside the macro are SAS statements. An argument value is referenced by prefixing the argument name with an ampersand (&). For example, the first two statements, DATA &out; SET &in;, take the names for &out and &in from the call statement.
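
To watch the text substitution happen, one option is to switch on the MPRINT system option before the call, which writes the statements generated by a macro to the LOG. The sketch below assumes that the macro density and the data set n1 of Example 1 have already been submitted. The comment shows the statements the substitution should produce; it is not a copied log.

```sas
/* Assumes macro density and data set n1 from Example 1 already exist. */
OPTIONS MPRINT;      /* echo the SAS statements a macro generates to the LOG */

%density(in=n1,p1=2,p2=4.5,del=state,what='TX',out=n2);

/* After substitution the DATA step reads:
     DATA n2;SET n1;
     IF state='TX' THEN delete;
     density=p90/area;
     z1=2+4.5;z2=2*area;
   Calling with what=TX (no quotes) would instead generate
     IF state=TX THEN delete;
   which compares the variable state with a variable named TX.      */

OPTIONS NOMPRINT;    /* switch the echo off again */
```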

The main program starts at the line DATA n1 and ends at RUN. The macro density is called by %density(...). The output is as follows.

           Output from MACRO density                       
                                          

 CITY        P90     AREA    DENSITY     Z1      Z2

 Akron       223     62.1     3.5910    6.5    124.2
 Atlanta     394    131.0     3.0076    6.5    262.0
 Denver      467    107.0     4.3645    6.5    214.0
 NY         7322    302.0    24.2450    6.5    604.0
 Memphis     610    264.0     2.3106    6.5    528.0
 Tampa       125     58.7     2.1295    6.5    117.4
 Raleigh     208     83.4     2.4940    6.5    166.8

The usual SAS keywords such as IF, DO, THEN, and ELSE can all be used in the SAS macro language, except that each must carry a % prefix. (The % prefix is not needed for ordinary statements inside a DATA step, see Example 4 later. It is needed for macro-level statements outside the DATA step.) Here is a basic list of the SAS macro expressions (see Macro Expressions and Macro Language Elements (Macro Functions) under SAS Macro Language: Reference).
 
%LET defines a macro variable, e.g. %LET A = 1.5; 

%PUT statement: prints the statement in the LOG page. 
   The statement can contain both text and variables, e.g.,
   if &a=256, then
    %PUT The value is &a; 
   gives the output "The value is 256" in the LOG page. 

%EVAL (&a + 1) evaluates the expression and returns the value 
   of &a plus 1. Note: This function should be used only for 
   integer arithmetic operations.

%IF %THEN; %ELSE;  

%DO ...; %END;   

%DO i = m %TO n; ...; %END; 

%DO %WHILE(&a < &b); ...; %END;

%GOTO label;  Elsewhere in the macro, %label: statement;

%LENGTH (a) = length of the text a, e.g. 
   %LENGTH (Gainesville) = 11.

%STR(statement) = statement treated as plain text, with special 
   characters such as the semicolon masked (not as a variable name).

%INDEX (a, b) = position in a for the first occurrence of 
   the string b, 
   e.g. %INDEX (xyzt, z) = 3, %INDEX(a long values, value) = 8;
   %INDEX (xyzt, a) = 0. 

%UPCASE (abc) = upper case ABC.  This is good for comparisons 
   when upper and lower case are mixed in the program.  For 
   example, the condition %IF %UPCASE (&month) = DEC is true 
   whether &month was defined as DEC, Dec, or dec. 

SYMGET ('a') = the value of the macro variable a, retrieved inside 
   a DATA step (SYMGET is a DATA step function, so it takes no %). 
   For example, after %LET a = city; the statement 
   X = SYMGET ('a'); gives X the character value city. 

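
Several of the expressions listed above can be tried directly in open code, with the results appearing in the LOG. The following is a minimal sketch: the macro name showsteps and the variable names town and step are made up for illustration, and %SYSEVALF, which is not in the list above, is included only to show how non-integer arithmetic is usually handled.

```sas
/* Open-code demonstration; all results are written to the LOG. */
%LET a = 256;
%PUT The value is &a;                /* The value is 256 */
%PUT Next integer: %EVAL(&a + 1);    /* Next integer: 257 */
%PUT With decimals: %SYSEVALF(1.5 + 1);   /* %EVAL cannot handle 1.5 */

%LET town = Gainesville;
%PUT Length: %LENGTH(&town);         /* Length: 11 */
%PUT Upper: %UPCASE(&town);          /* Upper: GAINESVILLE */
%PUT Index: %INDEX(&town,ville);     /* Index: 7 */

/* %STR keeps the semicolons as text instead of ending the %LET */
%LET step = %STR(PROC PRINT; RUN;);
%PUT &step;

/* Hypothetical macro: %DO and %IF are only allowed inside a macro here */
%MACRO showsteps(n=);
  %DO i = 1 %TO &n;
    %IF &i = &n %THEN %PUT Step &i (last);
    %ELSE %PUT Step &i;
  %END;
%MEND showsteps;

%showsteps(n=3);
```

The call %showsteps(n=3) should write three lines to the LOG, the third one marked as the last step.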
The operators are defined by the following table (See SAS Programs and Macro Processing).
Operator                   Name

+                          addition
-                          subtraction
*                          multiplication
/                          division
**                         exponent

< or LT                    less than
> or GT                    greater than
<= or LE                   less than or equal to
>= or GE                   greater than or equal to

& or AND                   logical and
| or OR                    logical or
~ or ^ or NOT              logical not
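
As a quick check of how these operators behave in macro expressions, the lines below evaluate a few of them with %EVAL and print the results to the LOG. This is a small sketch rather than a complete tour; comparisons and logical operators return 1 for true and 0 for false.

```sas
%PUT %EVAL(7 + 2 * 3);         /* 13: multiplication before addition */
%PUT %EVAL(7 / 2);             /* 3: integer division drops the fraction */
%PUT %EVAL(2 ** 3);            /* 8 */
%PUT %EVAL(3 > 2);             /* 1 (true) */
%PUT %EVAL(3 LE 2);            /* 0 (false) */
%PUT %EVAL(3 > 2 AND 1 > 2);   /* 0: both sides must be true */
%PUT %EVAL(NOT (3 > 2));       /* 0 */
```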

Here is a program that uses some of the above expressions and operators. The macro runs either PROC MEANS or PROC REG, according to the user's choice.

Example 2


%MACRO check(in=,p1=,y=,x=,out=);

DATA &out;SET &in;

  %IF %UPCASE(&p1) = MEANS %THEN %DO;
    PROC MEANS NOPRINT DATA=&in;VAR &y &x;OUTPUT 
      OUT=&out MEAN=m1 m2 STD=s1 s2;
    TITLE 'PROC means was run.';RUN;
  %END;
  %ELSE %IF %UPCASE(&p1)=REG %THEN %DO; 
    PROC REG DATA=&in;MODEL &y=&x;
    TITLE 'PROC GLM was run.';RUN;
  %END;
  %ELSE %PUT ERROR: There is nothing to run because p1=&p1.;
  
%MEND check;

DATA test1;INPUT a b; 
CARDS;
1 70
2 60
3 55
;
TITLE 'Example 2: CHOICE OF PROC MEANS AND REG ';
%check(in=test1, p1=means, y=a, x=b, out=abc);
PROC PRINT DATA=abc;VAR m1 m2 s1 s2; 
TITLE 'Example 2: CHOICE OF PROC MEANS AND REG ';
%check(in=test1, p1=reg, y=a, x=b, out=abc);
%check(in=test1, p1=corr, y=a, x=b, out=abc);

  
The output file is:
        PROC means was run.                          
                                    

  OBS    M1       M2      S1       S2

  1      2    61.6667     1    7.63763


        PROC GLM was run.

                          Analysis of Variance

                          Sum of         Mean
 Source          DF      Squares       Square      F Value       Prob>F

 Model            1      1.92857      1.92857       27.000       0.1210
 Error            1      0.07143      0.07143
 C Total          2      2.00000

........
In the LOG file there is an error message (use TITLE instead to put a message on the output page):

27   %check(in=test1, p1=corr, y=a, x=b, out=abc);

ERROR: There is nothing to run because p1=corr.
 
Now you know how the error and warning messages in the LOG file are written. Checking the inputs and issuing warning and error messages is helpful when your program is written for many users.
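
An input check of this kind might look like the sketch below. The macro name checkin and the message texts are invented for illustration. The sketch assumes the data set test1 of Example 2 exists and that the SAS release supports %RETURN (SAS 9 or later); %SYSFUNC(EXIST(...)) calls the DATA step function EXIST from the macro language.

```sas
/* Sketch only: checkin and its messages are illustrative names. */
%MACRO checkin(in=, p1=);
  /* Stop early if the input data set does not exist */
  %IF %SYSFUNC(EXIST(&in)) = 0 %THEN %DO;
    %PUT ERROR: Data set &in does not exist.;
    %RETURN;
  %END;
  /* Warn when p1 is not one of the supported choices */
  %IF %UPCASE(&p1) NE MEANS AND %UPCASE(&p1) NE REG %THEN
    %PUT WARNING: p1=&p1 is not MEANS or REG, nothing will be run.;
  %ELSE %PUT NOTE: Inputs look fine (in=&in, p1=&p1).;
%MEND checkin;

%checkin(in=test1, p1=corr);          /* expected to issue the WARNING */
%checkin(in=no_such_data, p1=means);  /* expected to issue the ERROR */
```

Starting the %PUT text with ERROR:, WARNING:, or NOTE: makes the message appear in the LOG in the same style as the messages SAS itself writes.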

To use an existing macro, include it with the %INCLUDE statement. If the macro is on a diskette, use %INCLUDE 'A:macroname', and if it is on a web site, use its web address. For example, I put the macro density of Example 1 on my website as density.sas. The following program runs with that macro. (For security reasons, the department may not allow retrieval of programs.)

Example 3

FILENAME a url 'http://www.stat.ufl.edu/~yang/STA5106/lesn13MACRO/density.sas';
%INCLUDE a;

DATA newcity;INPUT city $ state $ area p80 p90 @@;
cards;
Boston MA 47.2 553 574 Macon GA 49.7 117 106
Orlando FL 65 128 165 Irving TX 67.4 110 155
;
%density(in=newcity,p1=0,p2=0,del=state,what='nothing',out=pdensity);
PROC PRINT DATA=pdensity;
RUN;

The first statement tells SAS where the macro is stored on the web. See Lesson 2, Reading data from the internet.
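
For a macro stored in a local file, the same idea can be sketched with an ordinary FILENAME statement. The fileref mymac and the path are placeholders, not locations from this lesson. The SOURCE2 option asks SAS to echo the included lines to the LOG, which helps when checking what was actually read.

```sas
/* The path below is a placeholder; replace it with the real location. */
FILENAME mymac 'C:\sasmacros\density.sas';
%INCLUDE mymac / SOURCE2;   /* SOURCE2 echoes the included lines to the LOG */
```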

Macros are especially useful for repeated data analysis such as the jackknife and cross-validation methods. In these methods, part of the data is deleted to check the behavior of the remaining data. This can be used to test the sensitivity of a model or the predictability of each datum or group of data points. The following example shows how to find the jackknife correlations of a data set.

Example 4


%MACRO jacknife(indata=, size=);
  %LET j=1;
  %DO %WHILE(&j<=&size);
    DATA d_1;SET &indata;
    IF _N_=&j THEN DELETE;
    PROC CORR NOPRINT OUT=cdata;VAR x;WITH y;
   
    DATA coronly;SET cdata;
    IF _TYPE_ ^='CORR' THEN DELETE;
    PROC PRINT DATA=coronly;
    TITLE Observation &j is deleted.;

  %LET j=%EVAL(&j+1);
  %END;
%MEND;
 
DATA new;INPUT x y; cards;
1 2
2 4
5 6
6 1
7 9
8 8
;
%jacknife(indata=new,size=6);
RUN;
 
Note that in this macro, when the macro variable j is defined, it does not carry the & prefix. The & is used only when its value is referenced, and it can be used even in the title text. The statements in the macro

 DATA coronly;SET cdata;
 IF _TYPE_ ^='CORR' THEN DELETE;

are based on the structure of the output data set cdata from PROC CORR. This data set contains the mean, standard deviation, sample size, and correlation, identified by the _TYPE_ column. Since we want only the correlation, the IF statement is used. The structure of a SAS procedure's output data set has to be looked up in the SAS manual. The output of this program is:
Observation 1 is deleted.                       
                                       
OBS    _TYPE_    _NAME_       X

1      CORR       Y       0.48048

Observation 2 is deleted.       

1      CORR       Y       0.63872

Observation 3 is deleted.                 
                  
1      CORR       Y       0.62618

Observation 4 is deleted.                  
                  
1      CORR       Y       0.96190

Observation 5 is deleted.                    
                      
1      CORR       Y       0.53334

Observation 6 is deleted.                

1      CORR       Y       0.49957

From this output, it is apparent that observation 4 has a great impact on the overall correlation. It may be an outlier.
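
The same jackknife loop can also be written with the iterative form %DO j = m %TO n from the list of macro expressions, collecting the six correlations in one data set so they print as a single table. The following is only a sketch of that variant, not the lesson's own program: the names jack2 and alljack are made up, and it assumes the data set new of Example 4 with numeric variables x and y.

```sas
/* Sketch only: jack2 and alljack are illustrative names.            */
/* Assumes &indata holds numeric variables x and y, as in Example 4. */
%MACRO jack2(indata=, size=);
  /* Remove results left over from an earlier run */
  PROC DATASETS LIBRARY=work NOLIST NOWARN;
    DELETE alljack;
  QUIT;

  %DO j = 1 %TO &size;
    /* Leave out observation &j */
    DATA d_1;SET &indata;
      IF _N_=&j THEN DELETE;
    RUN;
    PROC CORR DATA=d_1 NOPRINT OUTP=cdata;VAR x;WITH y;
    RUN;
    /* Keep the correlation row and record which observation was left out */
    DATA coronly;SET cdata;
      IF _TYPE_='CORR';
      deleted=&j;
      r=x;
      KEEP deleted r;
    RUN;
    /* Stack this result under the earlier ones */
    PROC APPEND BASE=alljack DATA=coronly;
    RUN;
  %END;

  PROC PRINT DATA=alljack NOOBS;
    TITLE 'Jackknife correlations';
  RUN;
%MEND jack2;

%jack2(indata=new,size=6);
```

With the iterative %DO, the counter j is incremented automatically, so the %LET j=%EVAL(&j+1) step of Example 4 is not needed.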


