Assignments for Lesson 14

 

The macros in these exercises may be written in any way you like: you may use SAS/IML or simply use ordinary SAS procedures.

1. It is well known that if A is a symmetric matrix, then there exists an orthogonal matrix P such that P'AP is a diagonal matrix. Finding this P is very easy with IML. Find P for the following matrix, and show that P is indeed orthogonal (P'P=I) and that P'AP is diagonal.


    1   2   3   4
    2   7   6   2
    3   6   1   5
    4   2   5   3
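
As an independent cross-check outside SAS, the two properties asked for in Exercise 1 can be verified with a short Python sketch. It uses the classical cyclic Jacobi rotation method, which is a swapped-in technique chosen because it needs only the standard library; it is not the routine IML uses. The names `jacobi_eigen`, `matmul`, and `transpose` are illustrative.

```python
import math

def jacobi_eigen(a, tol=1e-12, max_sweeps=50):
    """Diagonalize a symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, P), where the columns of P are orthonormal
    eigenvectors, so that P'AP is (numerically) diagonal.
    """
    n = len(a)
    a = [[float(x) for x in row] for row in a]            # work on a copy
    p = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for r in range(n - 1):
            for s in range(r + 1, n):
                if a[r][s] == 0.0:
                    continue
                # Angle that zeroes a[r][s]: tan(2*theta) = 2*a_rs / (a_ss - a_rr)
                if a[r][r] == a[s][s]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[r][s] / (a[s][s] - a[r][r]))
                c, sn = math.cos(theta), math.sin(theta)
                for k in range(n):          # A <- A J and P <- P J (columns r, s)
                    akr, aks = a[k][r], a[k][s]
                    a[k][r], a[k][s] = c * akr - sn * aks, sn * akr + c * aks
                    pkr, pks = p[k][r], p[k][s]
                    p[k][r], p[k][s] = c * pkr - sn * pks, sn * pkr + c * pks
                for k in range(n):          # A <- J' A (rows r, s)
                    ark, ask = a[r][k], a[s][k]
                    a[r][k], a[s][k] = c * ark - sn * ask, sn * ark + c * ask
    return [a[i][i] for i in range(n)], p

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(col) for col in zip(*x)]

A = [[1, 2, 3, 4],
     [2, 7, 6, 2],
     [3, 6, 1, 5],
     [4, 2, 5, 3]]

eigvals, P = jacobi_eigen(A)
PtP = matmul(transpose(P), P)                 # should be the identity
D = matmul(matmul(transpose(P), A), P)        # should be diagonal

for row in PtP:
    print(" ".join(f"{v:9.5f}" for v in row))
for row in D:
    print(" ".join(f"{v:9.5f}" for v in row))
```

A useful side check is that the diagonal of P'AP sums to the trace of A, which is 1+7+1+3=12 here.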

2. A system of linear equations is called unstable if a small perturbation of its coefficients changes the solution drastically. Consider the following n equations.

$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = y_i, \qquad i = 1, \ldots, n.$$

To simplify the notation, we denote the above system by Ax=y. We wish to estimate the proportion of unstable systems when the coefficients $a_{ij}$ are randomly generated. Let every coefficient $a_{ij}$ be a uniform(0,1) random number, let every element of y equal 1, and let the solution be x0. Let the perturbation be an n×n matrix e, each element of which is 0.01 × [uniform(0,1) random number - 0.5]. Let x1 be the solution of (A+e)x=y. We consider the system unstable if sum(abs(x1-x0))>1, where sum() and abs() are IML functions. Use 100 simulations to estimate the proportion of unstable systems when n=4.
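
The simulation described in Exercise 2 can be sketched in Python as follows, as a rough cross-check of an IML solution. The helper names `solve` and `unstable_fraction`, the seed, and the use of Gaussian elimination with partial pivoting are assumptions made for this sketch. The estimated proportion depends on the random stream, so no particular value should be expected.

```python
import random

def solve(a, y):
    """Solve a x = y by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [y[i]] for i, row in enumerate(a)]    # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                         # back substitution
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def unstable_fraction(n=4, sims=100, seed=0):
    """Estimate the proportion of systems with sum(abs(x1 - x0)) > 1."""
    rng = random.Random(seed)
    count = 0
    for _ in range(sims):
        a = [[rng.random() for _ in range(n)] for _ in range(n)]
        e = [[0.01 * (rng.random() - 0.5) for _ in range(n)] for _ in range(n)]
        y = [1.0] * n
        x0 = solve(a, y)
        perturbed = [[a[i][j] + e[i][j] for j in range(n)] for i in range(n)]
        x1 = solve(perturbed, y)
        if sum(abs(u - v) for u, v in zip(x1, x0)) > 1:
            count += 1
    return count / sims

print(unstable_fraction(n=4, sims=100, seed=0))
```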

3. Write a macro to solve the quadratic equation $ax^2+bx+c=0$. The input should be an array of a, b, c values from a data set, and the output should be the two roots. Use the following four sets of a, b, c to test your macro.

a b c
1 4 3
1 4 5
0 2 5
0 0 6
The output, which must allow for complex roots, a single root, or no root, should look like the following.
 A    B    C    REAL1    IM1    REAL2    IM2

 1    4    3     -1.0     .       -3      . 
 1    4    5     -2.0     1       -2     -1 
 0    2    5     -2.5     .        .      . 
 0    0    6       .      .        .      . 
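
The case analysis behind this table can be sketched in Python as below. The function name `quadratic_roots` is illustrative, and `None` stands in for the SAS missing value (`.`). The sketch treats a=0, b=0 as "no root", as the last row of the table does.

```python
import math

def quadratic_roots(a, b, c):
    """Return (real1, im1, real2, im2) for a*x^2 + b*x + c = 0.

    None plays the role of a SAS missing value.
    """
    if a == 0:
        if b == 0:
            return (None, None, None, None)          # no root
        return (-c / b, None, None, None)            # linear case: one root
    disc = b * b - 4 * a * c
    if disc >= 0:                                    # two real roots
        r = math.sqrt(disc)
        return ((-b + r) / (2 * a), None, (-b - r) / (2 * a), None)
    re = -b / (2 * a)                                # complex conjugate pair
    im = math.sqrt(-disc) / abs(2 * a)
    return (re, im, re, -im)

for a, b, c in [(1, 4, 3), (1, 4, 5), (0, 2, 5), (0, 0, 6)]:
    print(a, b, c, quadratic_roots(a, b, c))
```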

4. The program   simpson.sas    in the datasets file is an IML subroutine that performs numerical integration by Simpson's method. First test the subroutine by showing that the integral of the standard normal density function from -5 to 1.645 is approximately 0.95. Next find a probability involving a weighted sum of two chi-square random variables. More specifically, find

$$\Pr\{\chi^2_2 + 5\chi^2_3 < 12.0\},$$

where $\chi^2_k$ denotes a chi-square random variable with k degrees of freedom.
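
A minimal Python sketch of both steps is given below; it does not reproduce simpson.sas, whose interface is not shown here. It assumes the two chi-square variables are independent, which the exercise does not state explicitly. Under that assumption, writing X for the 2-df variable and Y for the 3-df variable, the probability reduces to a single integral: Pr{X + 5Y < 12} is the integral over v from 0 to 12/5 of f_Y(v) F_X(12 - 5v), where F_X(x) = 1 - exp(-x/2) is the chi-square CDF with 2 degrees of freedom. The names `simpson`, `chi2_pdf`, and `chi2_cdf_2df` are illustrative.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def chi2_pdf(x, k):
    """Chi-square density with k >= 2 degrees of freedom."""
    if x < 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def chi2_cdf_2df(x):
    """Closed-form chi-square CDF for 2 degrees of freedom."""
    return 1.0 - math.exp(-x / 2) if x > 0 else 0.0

# Step 1: check the integrator on the standard normal density.
check = simpson(normal_pdf, -5, 1.645)
print(round(check, 4))

# Step 2: Pr{X + 5Y < 12}, assuming X and Y are independent.
prob = simpson(lambda v: chi2_pdf(v, 3) * chi2_cdf_2df(12 - 5 * v),
               0, 12 / 5, n=20000)
print(prob)
```

The integrand in step 2 behaves like the square root of v near 0, so Simpson's rule converges more slowly there than for a smooth function; comparing the result for two values of `n` is a simple way to judge the accuracy.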

5. The program   newton.sas    in the datasets file is an IML subroutine that solves f(x)=0 by Newton's method. Use that subroutine to solve

$$x e^{-x} + x^{1/2} = 1.5.$$

Verify the solution with a calculator.
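
For comparison with the IML result, here is a small Python sketch of Newton's method applied to f(x) = x exp(-x) + sqrt(x) - 1.5. It is not the newton.sas subroutine; the function names, the starting value x0=1, and the tolerance are assumptions for illustration. The derivative is f'(x) = (1 - x) exp(-x) + 1/(2 sqrt(x)).

```python
import math

def newton(f, fprime, x0, tol=1e-10, max_iter=50):
    """Newton iteration x <- x - f(x)/f'(x), stopping when the step is tiny."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

def f(x):
    return x * math.exp(-x) + math.sqrt(x) - 1.5

def fprime(x):
    return (1 - x) * math.exp(-x) + 0.5 / math.sqrt(x)

root = newton(f, fprime, x0=1.0)
print(root, f(root))   # f(root) should be essentially zero
```

Substituting the printed root back into the left-hand side of the equation is the same check the exercise asks for with a calculator.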

6. Write a subroutine that shows whether the true mean is covered by a t confidence interval computed from n observations. (i) Test this subroutine with the following simulation: each time, generate 10 standard normal random variables and check whether the 95% t confidence interval covers the true mean. Do this 1000 times and show that a 95% confidence interval indeed covers the true mean nearly 95% of the time. (ii) However, show that if the z-table is used in place of the t-table, i.e., 1.96 is used as the upper 2.5% point, then the true coverage is quite different from 95%. (iii) Next generate your data from a uniform distribution and comment on the robustness of the t confidence interval against non-normal data.
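
The three parts of Exercise 6 can be sketched in Python as below. The names `covers` and `coverage`, the seed, and the choice of uniform(0,1) data with true mean 0.5 are assumptions for illustration. The constant 2.262157 is the tabulated upper 2.5% point of the t distribution with 9 degrees of freedom, hard-coded because the Python standard library has no t quantile function.

```python
import math
import random
import statistics

T_CRIT_DF9 = 2.262157   # tabulated upper 2.5% point of t with 9 df (n = 10)
Z_CRIT = 1.96           # upper 2.5% point of the standard normal

def covers(sample, true_mean, crit):
    """True if mean +/- crit * s / sqrt(n) contains true_mean."""
    n = len(sample)
    half_width = crit * statistics.stdev(sample) / math.sqrt(n)
    return abs(statistics.mean(sample) - true_mean) <= half_width

def coverage(draw, true_mean, crit, n=10, reps=1000, seed=1):
    """Fraction of reps in which the interval covers true_mean."""
    rng = random.Random(seed)
    hits = sum(covers([draw(rng) for _ in range(n)], true_mean, crit)
               for _ in range(reps))
    return hits / reps

def normal(rng):
    return rng.gauss(0.0, 1.0)

def uniform(rng):
    return rng.random()

t_cov = coverage(normal, 0.0, T_CRIT_DF9)    # (i)   t interval, normal data
z_cov = coverage(normal, 0.0, Z_CRIT)        # (ii)  1.96 used instead of t
u_cov = coverage(uniform, 0.5, T_CRIT_DF9)   # (iii) t interval, uniform data
print(t_cov, z_cov, u_cov)
```

Because (i) and (ii) reuse the same seed, they see identical samples, so the narrower z-based interval can never cover more often than the t-based one.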

7. A trimmed mean is defined as the average of the data that remain after the highest and lowest alpha proportion of the observations have been discarded. For example, if alpha=0.05, then the highest 5% and the lowest 5% of the data are not used in estimating the mean. This method protects against contamination by outliers. Write a macro that outputs the trimmed mean. The arguments of the macro are: the input data set, the variable, alpha, the data size, and the output file name. Use alpha=0.05 and the following data set to test your macro.


-10 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 200
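
As a quick check on a macro's output, the definition can be sketched in Python. The name `trimmed_mean` is illustrative, and rounding alpha times n down to a whole number of observations is an assumption, since the exercise does not say what to do when that product is not an integer. With 20 observations and alpha=0.05, one value is dropped from each tail.

```python
import math

def trimmed_mean(values, alpha):
    """Mean after dropping floor(alpha * n) values from each tail."""
    n = len(values)
    k = math.floor(alpha * n + 1e-9)   # small offset guards against float error
    if 2 * k >= n:
        raise ValueError("alpha is too large: no observations would remain")
    kept = sorted(values)[k:n - k]
    return sum(kept) / len(kept)

data = [-10, 2, 3, 4, 5, 6, 7, 8, 9, 10,
        11, 12, 13, 14, 15, 16, 17, 18, 19, 200]

print(sum(data) / len(data))       # ordinary mean: 18.95
print(trimmed_mean(data, 0.05))    # trimmed mean: 10.5
```

Dropping -10 and 200 leaves the integers 2 through 19, whose average is 10.5, while the untrimmed mean is pulled up to 18.95 by the outlier 200.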

8. Write a macro that standardizes a data set, i.e., each datum $x_i$ is replaced by $x'_i = (x_i - \bar{x})/s$, where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation. Give an error message if the sample standard deviation is 0. Use the two input data sets 1, 2, 3, 4, 5 and 2, 2, 2, 2 to test your macro.
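
The expected behavior on the two test sets can be sketched in Python as follows. The name `standardize` and the use of an exception as the "error message" are assumptions for illustration; `statistics.stdev` computes the sample standard deviation (divisor n-1).

```python
import statistics

def standardize(values):
    """Return (x - mean) / sd for each x; refuse to divide by a zero sd."""
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("sample standard deviation is 0; cannot standardize")
    mean = statistics.mean(values)
    return [(x - mean) / sd for x in values]

print(standardize([1, 2, 3, 4, 5]))

try:
    standardize([2, 2, 2, 2])
except ValueError as err:
    print("ERROR:", err)
```

For 1, 2, 3, 4, 5 the mean is 3 and the sample standard deviation is the square root of 2.5, so the middle value standardizes to 0 and the extremes to about -1.2649 and 1.2649.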

