Abstracts for 2012 Challis Lectures
by Ed George


General Lecture (Nov 29, 2012)

Discovering Regression Structure with a Bayesian Ensemble

A Bayesian ensemble can be used to discover and learn about the regression relationship between a variable of interest y, and vector of p potential predictor variables x.  The basic idea is to model the conditional distribution of y given x by a sum of random basis elements plus a flexible noise distribution.  In particular, I will focus on a Bayesian ensemble approach  called BART (Bayesian Additive Regression Trees).  Based on a basis of random regression trees, BART automatically produces a predictive distribution for y at any x (in or out of sample) which automatically adjusts for the uncertainty at each such x.   It can do this for nonlinear relationships, even those hidden within a large number of irrelevant predictors.   Further, BART opens up a novel approach for model free variable selection. Ultimately, the information provided such a Bayesian ensemble may be seen as a valuable first step towards model building for high dimensional data. (This is joint work with H. Chipman and R. McCulloch).

Technical Lecture (Nov 30, 2012)

EMVS:  The EM Approach to Bayesian Variable Selection

Despite rapid developments in stochastic search algorithms, the practicality of Bayesian variable selection methods has continued to pose challenges. High-dimensional data are now routinely analyzed, typically with many more covariates than observations.  To broaden the applicability of Bayesian variable selection for such high-dimensional linear regression contexts, we propose EMVS, a deterministic alternative to stochastic search based on an EM algorithm which exploits a conjugate mixture prior formulation to quickly find posterior modes. Combining a spike-and-slab regularization diagram for the discovery of active predictor sets with subsequent rigorous evaluation of posterior model probabilities, EMVS rapidly identifies promising sparse high posterior probability submodels.  External structural information such as likely covariate groupings or network topologies is easily incorporated into the EMVS framework.   Deterministic annealing variants are seen to improve the effectiveness of our algorithms by mitigating the posterior multi-modality associated with variable selection priors. The usefulness the EMVS approach is demonstrated on real high-dimensional data, where computational complexity renders stochastic search to be less practical.  (This is joint work with Veronika Rockova).