The macros in these exercises may be written in any way. You may use SAS IML or simply use SAS procedures.
1. It is well known that if A is a symmetric matrix, then there exists an orthogonal matrix P such that P'AP is a diagonal matrix. Finding this P is very easy using IML. Find P for the following matrix and show that P is indeed orthogonal (P'P = I) and that P'AP is diagonal.
1 2 3 4
2 7 6 2
3 6 1 5
4 2 5 3
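As an independent cross-check of the IML result, the diagonalization can be sketched outside SAS. The Python sketch below is only an illustration, not the intended IML solution: it uses the classical Jacobi rotation method (chosen here as an assumption; IML has its own eigenvalue routines), and the helper names `matmul`, `transpose`, `identity`, `off_norm`, and `jacobi_diagonalize` are hypothetical. The columns of P may differ from the IML output in order and sign.

```python
import math

# The symmetric matrix from exercise 1.
A = [[1, 2, 3, 4],
     [2, 7, 6, 2],
     [3, 6, 1, 5],
     [4, 2, 5, 3]]

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def off_norm(D):
    """Frobenius norm of the off-diagonal part of D."""
    n = len(D)
    return math.sqrt(sum(D[i][j] ** 2
                         for i in range(n) for j in range(n) if i != j))

def jacobi_diagonalize(A, tol=1e-10, max_sweeps=50):
    """Return an orthogonal P with P'AP (numerically) diagonal.

    Each rotation J zeroes one off-diagonal pair (p, q); the invariant
    D = P'AP is maintained throughout.
    """
    n = len(A)
    D = [[float(v) for v in row] for row in A]
    P = identity(n)
    for _ in range(max_sweeps):
        if off_norm(D) < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(D[p][q]) < 1e-15:
                    continue
                if D[q][q] == D[p][p]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * D[p][q] / (D[q][q] - D[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                J = identity(n)
                J[p][p], J[p][q], J[q][p], J[q][q] = c, s, -s, c
                D = matmul(transpose(J), matmul(D, J))
                P = matmul(P, J)
    return P

P = jacobi_diagonalize(A)
PtP = matmul(transpose(P), P)                 # should be close to I
PtAP = matmul(transpose(P), matmul(A, P))     # should be close to diagonal

for row in PtAP:
    print(["%9.5f" % v for v in row])
```

Because P is orthogonal, the diagonal entries of P'AP must sum to the trace of A, which gives one more quick sanity check.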
2. A system of linear equations is called unstable if a small perturbation changes the solution drastically. Consider the following n equations.
3. Write a macro to solve the quadratic equation ax^2 + bx + c = 0. The input should be an array of a, b, c from a data set, and the output should be the two roots. Use the following four sets of a, b, c to test your macro.
a b c
1 4 3
1 4 5
0 2 5
0 0 6
The output, allowing for complex roots, a single root, or no root, should look like the following.
A  B  C  REAL1  IM1  REAL2  IM2
1  4  3  -1.0   .    -3     .
1  4  5  -2.0   1    -2     -1
0  2  5  -2.5   .    .      .
0  0  6  .      .    .      .
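The case analysis behind the expected output can be sketched as follows. This is a Python illustration of the logic only, not the requested SAS macro; the function name `solve_quadratic` is hypothetical, and `None` stands in for the SAS missing value `.`.

```python
import math

def solve_quadratic(a, b, c):
    """Return (real1, im1, real2, im2) for a*x^2 + b*x + c = 0.

    None plays the role of the SAS missing value '.'.
    """
    if a == 0:
        if b == 0:
            # Degenerate: no root (or every x is a root when c is also 0).
            return (None, None, None, None)
        # Linear equation b*x + c = 0 has a single real root.
        return (-c / b, None, None, None)
    disc = b * b - 4 * a * c
    if disc >= 0:
        r = math.sqrt(disc)
        return ((-b + r) / (2 * a), None, (-b - r) / (2 * a), None)
    # Negative discriminant: a complex-conjugate pair.
    im = math.sqrt(-disc) / (2 * a)
    return (-b / (2 * a), im, -b / (2 * a), -im)

# The four test sets from the exercise.
for a, b, c in [(1, 4, 3), (1, 4, 5), (0, 2, 5), (0, 0, 6)]:
    print(a, b, c, solve_quadratic(a, b, c))
```

The imaginary parts are left missing for real roots so that the layout matches the expected table.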
4. The program simpson.sas in the datasets file is an IML subroutine that performs numerical integration by Simpson's method. First test the subroutine by showing that the integral of the standard normal density function from -5 to 1.645 is 0.95. Next find the probability for a weighted sum of two chi-square random variables. More specifically, find
Pr{ chi^2_2 + 5 chi^2_3 < 12.0 },
where chi^2_k denotes a chi-square random variable with k degrees of freedom.
5. The program newton.sas in the datasets file is an IML subroutine that solves f(x) = 0 by Newton's method. Use that subroutine to solve
x e^(-x) + x^(1/2) = 1.5.
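For comparison with the output of newton.sas, here is a minimal Newton iteration for the equation above, written in Python rather than IML. It is a sketch under assumptions: the starting value 1.0, the tolerance, and the names `f`, `fprime`, and `newton` are illustrative choices and are not taken from newton.sas.

```python
import math

def f(x):
    # f(x) = x*exp(-x) + sqrt(x) - 1.5, so the equation is f(x) = 0.
    return x * math.exp(-x) + math.sqrt(x) - 1.5

def fprime(x):
    # Derivative of f: (1 - x)*exp(-x) + 1 / (2*sqrt(x)).
    return (1 - x) * math.exp(-x) + 0.5 / math.sqrt(x)

def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: repeat x <- x - f(x)/f'(x) until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")

root = newton(f, fprime, x0=1.0)
print(root, f(root))
```

Substituting the printed root back into the left-hand side should reproduce 1.5 to within rounding.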
Verify the solution with a calculator.
6. Write a subroutine that shows whether the true mean is covered by a t confidence interval computed from n data values.
(i) Test this subroutine by the following simulation: each time, generate 10 standard normal random variables and see whether the 95% t confidence interval covers the true mean. Do this 1000 times and show that a 95% confidence interval indeed covers the true mean nearly 95% of the time.
(ii) However, show that if the z-table is used in place of the t-table, i.e., 1.96 is used as the upper 2.5 percentile, then the true coverage is quite different from 95%.
(iii) Next, generate your data from a uniform distribution and comment on the robustness of the t confidence interval against non-normal data.
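The simulation in (i) to (iii) can be sketched as below. This is a Python illustration, not the requested subroutine. The names `covers` and `coverage`, the seed, and the hard-coded critical value 2.262157 (the upper 2.5% point of the t distribution with 9 degrees of freedom, as read from a t-table) are assumptions made for the sketch.

```python
import math
import random
import statistics

T_CRIT_9DF = 2.262157  # upper 2.5% point of t with n - 1 = 9 df (t-table)
Z_CRIT = 1.96          # upper 2.5% point of the standard normal

def covers(sample, true_mean, crit):
    """True if mean +/- crit * s / sqrt(n) contains true_mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    half_width = crit * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width <= true_mean <= mean + half_width

def coverage(draw, true_mean, crit, n=10, reps=1000, seed=12345):
    """Fraction of reps in which the interval covers true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        if covers(sample, true_mean, crit):
            hits += 1
    return hits / reps

# (i) normal data, t critical value
normal_t = coverage(lambda r: r.gauss(0, 1), 0.0, T_CRIT_9DF)
# (ii) same normal data, z critical value used by mistake
normal_z = coverage(lambda r: r.gauss(0, 1), 0.0, Z_CRIT)
# (iii) uniform(0, 1) data, whose true mean is 0.5
uniform_t = coverage(lambda r: r.random(), 0.5, T_CRIT_9DF)

print(normal_t, normal_z, uniform_t)
```

Because the same seed is reused, (i) and (ii) see identical samples, so the narrower z interval can never cover more often than the t interval.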
7. A trimmed mean is the average computed after discarding the highest and lowest alpha proportion of the data. For example, if alpha = 0.05, then the highest 5% and the lowest 5% of the data are not used in estimating the mean. This method protects against contamination by outliers. Write a macro that outputs the trimmed mean. The arguments of the macro are: the input data set, the variable, alpha, the data size, and the output file name. Use alpha = 0.05 and the following data set to test your macro.
-10 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 200
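The trimming rule can be illustrated with a short Python sketch (not the requested macro). The function name `trimmed_mean` is hypothetical, and rounding alpha times n down to a whole number of observations is an assumption; the exercise does not say how to handle a fractional count.

```python
import math

def trimmed_mean(values, alpha):
    """Mean after dropping the lowest and highest alpha fraction of values."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must be in [0, 0.5)")
    data = sorted(values)
    n = len(data)
    # Assumption: drop floor(alpha * n) values from each end. The small
    # offset guards against floating-point products such as 28.999999...
    k = math.floor(alpha * n + 1e-9)
    kept = data[k:n - k]
    return sum(kept) / len(kept)

data = [-10, 2, 3, 4, 5, 6, 7, 8, 9, 10,
        11, 12, 13, 14, 15, 16, 17, 18, 19, 200]

print(trimmed_mean(data, 0.05))  # drops -10 and 200
print(trimmed_mean(data, 0.0))   # ordinary mean, for comparison
```

With 20 observations and alpha = 0.05, exactly one value is removed from each end, so the two outliers -10 and 200 no longer influence the estimate.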
8. Write a macro that standardizes a data set, i.e., each datum x_i is replaced by x'_i = (x_i - sample mean)/(sample standard deviation). Give an error message if the sample standard deviation is 0. Use the two input data sets 1, 2, 3, 4, 5 and 2, 2, 2, 2 to test your macro.
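The standardization and the zero-variance check can be sketched as follows. This is a Python illustration rather than the requested macro; the function name `standardize` and the wording of the error message are hypothetical.

```python
import statistics

def standardize(values):
    """Return (x - sample mean) / (sample standard deviation) for each x."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute a standard deviation")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divisor n - 1)
    if sd == 0:
        raise ValueError("sample standard deviation is 0; cannot standardize")
    return [(x - mean) / sd for x in values]

# The two test data sets from the exercise.
for data in ([1, 2, 3, 4, 5], [2, 2, 2, 2]):
    try:
        print(data, "->", standardize(data))
    except ValueError as err:
        print(data, "-> ERROR:", err)
```

After standardization the first data set should have sample mean 0 and sample standard deviation 1, while the second should trigger the error message.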