Abstract: Variable selection is fundamental to many high-dimensional statistical problems with sparsity structure. Much of the literature is based on optimization methods, in which penalty terms are added to the objective, yielding both convex and non-convex optimization problems. In this talk, I will take a Bayesian point of view on high-dimensional regression, placing a prior on the model space and performing the necessary integration to obtain a posterior distribution. In particular, I will show that, from a frequentist point of view, a Bayesian approach can consistently select all relevant covariates under relatively mild conditions.
Although Bayesian procedures for variable selection are provably effective and easy to implement, many statisticians have suggested that Markov chain Monte Carlo (MCMC) algorithms for sampling from the posterior need a long time to converge, since sampling from an exponentially large number of sub-models is an intrinsically hard problem. Surprisingly, our work shows that this plausible “exponentially many models” argument is misleading. By introducing a truncated sparsity prior for variable selection, we provide a set of conditions that guarantee the rapid mixing of a particular Metropolis-Hastings algorithm. Under these conditions, the number of iterations required for this Markov chain to reach stationarity is linear in the number of covariates, up to a logarithmic factor.
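
To make the kind of sampler discussed above concrete, here is a minimal Python sketch of a Metropolis-Hastings chain over variable-inclusion vectors with a truncated sparsity prior. The abstract does not specify the prior, the likelihood, or the proposal, so everything here is an assumption for illustration: the unnormalized posterior uses a Zellner g-prior style marginal likelihood with a size penalty of p^(-kappa) per included covariate, the truncation level is s0, the proposal flips one uniformly chosen coordinate, and the names log_posterior, mh_variable_selection, g, kappa and s0 are hypothetical. It is not necessarily the algorithm analyzed in the talk, and it does not attempt to verify the mixing-time bound.

```python
import numpy as np


def log_posterior(gamma, X, y, g, kappa, s0):
    """Unnormalized log posterior of a boolean inclusion vector (assumed form)."""
    n, p = X.shape
    k = int(gamma.sum())
    if k > s0:
        # Truncation: models with more than s0 covariates get zero prior mass.
        return -np.inf
    if k == 0:
        r2 = 0.0
    else:
        # Uncentered R^2 of the least-squares fit on the selected columns.
        Xg = X[:, gamma]
        coef, *_ = np.linalg.lstsq(Xg, y, rcond=None)
        resid = y - Xg @ coef
        r2 = float(np.clip(1.0 - (resid @ resid) / (y @ y), 0.0, 1.0))
    return (-kappa * k * np.log(p)          # sparsity prior, p^(-kappa) per covariate
            - 0.5 * k * np.log1p(g)         # g-prior dimension penalty
            - 0.5 * n * np.log1p(g * (1.0 - r2)))  # fit term


def mh_variable_selection(X, y, n_iter=2000, g=None, kappa=1.0, s0=10, seed=0):
    """Run a single-flip Metropolis-Hastings chain; return inclusion frequencies."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    g = float(p ** 2) if g is None else g   # assumed default, not from the source
    gamma = np.zeros(p, dtype=bool)         # start from the empty model
    logp = log_posterior(gamma, X, y, g, kappa, s0)
    counts = np.zeros(p)
    for _ in range(n_iter):
        # Symmetric proposal: flip the inclusion status of one random covariate.
        j = rng.integers(p)
        proposal = gamma.copy()
        proposal[j] = not proposal[j]
        logp_new = log_posterior(proposal, X, y, g, kappa, s0)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.random()) < logp_new - logp:
            gamma, logp = proposal, logp_new
        counts += gamma
    return counts / n_iter
```

Calling mh_variable_selection(X, y) on an n-by-p design matrix X and response vector y returns, for each covariate, the fraction of iterations in which it was included. Because the proposal is symmetric, the acceptance step needs only the posterior ratio, and proposals that exceed s0 covariates are always rejected.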