My primary research interests lie in envelopes, a new area at the intersection of multivariate analysis and dimension reduction, introduced by Cook, Li and Chiaromonte (2010) in the framework of multivariate linear regression. The envelope method applies dimension reduction techniques to remove immaterial information and thereby estimate the regression coefficients more efficiently. It has broad applications within multivariate analysis, including discriminant analysis, functional data analysis, reduced rank regression and sufficient dimension reduction. It can also be connected to other areas of statistics, including partial least squares, model selection, Bayesian statistics, experimental design, and time series.

Roughly speaking, my research has two directions. First, I expand and promote the development of envelopes, making them more powerful for efficient estimation and adapting them to different kinds of datasets. For example, Su and Cook (2010) developed a heteroscedastic envelope model to accommodate datasets with non-constant covariance structure. Currently, I am working on envelope models for small sample sizes so that envelopes can be applied in the n < p scenario. Second, I try to connect envelopes with other areas of statistics. For example, Cook, Helland and Su (2010) studied the relationship between partial least squares and envelope models. Interestingly, the envelope model can in fact explain the mechanism of partial least squares, which was originally designed as an ad hoc algorithm in chemometrics. I am also working on a project to achieve efficiency and sparsity simultaneously when estimating the coefficients in multivariate linear regression.

More on Envelopes

The following plot illustrates the working mechanism of envelope models.

Suppose we are comparing two multivariate means. In both panels, the two ellipses represent two bivariate normal populations. The predictor X is an indicator variable taking the value 0 or 1 to denote the two populations, and Y1 and Y2 are two responses representing two characteristics of the populations. The left panel represents the analysis under the standard model. If we want to tell whether the two groups differ in the distribution of Y2, it is hard to judge, because the projections of the two populations onto the Y2 axis almost overlap. However, under the envelope inference shown in the right panel, we first find a dimension reduction subspace that removes all information irrelevant to the group difference, project the data onto it, and then project onto the Y2 axis. Now it is much easier to tell the difference between the two groups.
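
The mechanism described above can be mimicked in a small simulation. The following Python sketch is only an illustration under stated assumptions: the covariance, the effect size, the sample sizes, and the function name simulate_mean_difference are hypothetical choices, and the envelope direction is treated as known, whereas in practice it has to be estimated from the data. It compares the usual difference in group means for Y2 with the same difference after first projecting onto the envelope direction.

```python
import numpy as np


def simulate_mean_difference(n_per_group=50, n_reps=500, seed=0):
    """Compare standard and (oracle) envelope estimates of the Y2 mean difference.

    All numerical settings below are illustrative assumptions, not values
    taken from any published analysis.
    """
    rng = np.random.default_rng(seed)

    # Orthonormal basis of the plane: g1 spans the (assumed known) envelope,
    # g0 spans the immaterial direction.
    theta = np.deg2rad(135.0)
    g1 = np.array([np.cos(theta), np.sin(theta)])
    g0 = np.array([-np.sin(theta), np.cos(theta)])

    # Small variation along the material direction, large variation along
    # the immaterial one.
    sigma_material, sigma_immaterial = 0.5, 5.0
    cov = (sigma_material ** 2) * np.outer(g1, g1) \
        + (sigma_immaterial ** 2) * np.outer(g0, g0)

    # The group difference lies entirely in the envelope direction.
    beta = 2.0 * g1
    proj = np.outer(g1, g1)  # projection matrix onto the envelope

    std_est = np.empty(n_reps)
    env_est = np.empty(n_reps)
    for r in range(n_reps):
        y0 = rng.multivariate_normal(np.zeros(2), cov, size=n_per_group)  # X = 0
        y1 = rng.multivariate_normal(beta, cov, size=n_per_group)         # X = 1
        diff = y1.mean(axis=0) - y0.mean(axis=0)
        std_est[r] = diff[1]           # standard estimate of the Y2 difference
        env_est[r] = (proj @ diff)[1]  # project onto the envelope, then read off Y2
    return beta[1], std_est, env_est


if __name__ == "__main__":
    true_diff, std_est, env_est = simulate_mean_difference()
    print("true Y2 difference:", true_diff)
    print("sd of standard estimates:", std_est.std())
    print("sd of envelope estimates:", env_est.std())
```

Under these assumed settings, the envelope-projected estimates of the Y2 difference should vary far less across replications than the standard ones, which is the gain the right panel depicts.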

Some recent developments in envelopes include the partial envelope, inner envelope, heteroscedastic envelope and scaled envelope.