UF Statistics Seminar Schedule

Seminars are held from 4:00 to 5:00 p.m. in Griffin-Floyd 100 unless otherwise noted.

Refreshments are available before the seminars, from 3:30 to 4:00 p.m., in Griffin-Floyd Hall 201.

Fall 2017

Date     Speaker and Title (abstracts below)

Sep 14   Miles Lopes (University of California, Davis)
         Bootstrap Methods for High-Dimensional and Large-Scale Data

Sep 21   Gen Li (Columbia University)
         A general framework for the association analysis of heterogeneous data

Oct 3    Hira Koul (Michigan State University)
         Goodness-of-fit Testing of Error Distribution in Linear Measurement Error Models

Oct 12   Galin Jones (University of Minnesota)
         Bayesian Penalized Regression (and a little MCMC)

Oct 26   Mariana Pensky (University of Central Florida)
         TBA

Nov 2    Jason Roy (University of Pennsylvania)
         TBA

Nov 16   Lifeng Lin (Florida State University)
         Assessing publication bias in meta-analysis

Nov 30   Rebecca Steorts (Duke University)
         TBA

Abstracts


Bootstrap Methods for High-Dimensional and Large-Scale Data

Miles Lopes (University of California, Davis)

Bootstrap methods are among the most broadly applicable tools for statistical inference and uncertainty quantification. Although these methods have an extensive literature, much remains to be understood about their applicability in modern settings, where observations are high-dimensional, or where the quantity of data outstrips computational resources. In this talk, I will present a couple of new bootstrap methods that are tailored to these settings. First, I will discuss the topic of "spectral statistics" arising from high-dimensional sample covariance matrices, and describe a method for approximating the laws of such statistics. Second, in the context of large-scale data, I will discuss a more unconventional application of the bootstrap -- dealing with the tradeoff between accuracy and computational cost for ensemble classifiers. More specifically, I will explain how the bootstrap can be used to decide when an ensemble of classifiers trained by bagging or random forests is sufficiently large. This will include joint work with Alexander Aue and Andrew Blandino.
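
To make the ensemble-size idea concrete, the following is a minimal sketch, not the speaker's algorithm: it bootstraps the trees of a fitted random forest to gauge how much the test error still fluctuates at the current ensemble size. The dataset, tolerance, and resampling scheme are purely illustrative assumptions.

```python
# Minimal sketch (illustrative only): bootstrap the trees of a fitted random
# forest to estimate how much the ensemble's test error still fluctuates at
# the current number of trees.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
# Rows = trees, columns = test points; each entry is one tree's predicted class.
votes = np.stack([tree.predict(X_te) for tree in forest.estimators_])

def ensemble_error(vote_matrix):
    """Test error of the majority vote over the rows (trees) of vote_matrix."""
    majority = (vote_matrix.mean(axis=0) > 0.5).astype(int)
    return np.mean(majority != y_te)

# Bootstrap over trees: resample tree indices with replacement and recompute
# the majority-vote error each time.
n_trees = votes.shape[0]
boot_errors = [ensemble_error(votes[rng.integers(0, n_trees, n_trees)])
               for _ in range(200)]
print(f"error = {ensemble_error(votes):.3f}, bootstrap sd = {np.std(boot_errors):.3f}")
# If the bootstrap sd is below a user-chosen tolerance, the ensemble is
# treated as large enough; otherwise more trees are added and the check repeats.
```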


A general framework for the association analysis of heterogeneous data

Gen Li (Columbia University)

Multivariate association analysis is of primary interest in many applications. Despite the prevalence of high-dimensional and non-Gaussian data (such as count-valued or binary), most existing methods only apply to low-dimensional datasets with continuous measurements. We develop a new framework for the association analysis of two sets of high-dimensional and heterogeneous (continuous/binary/count) data. We model heterogeneous random variables using exponential family distributions, and exploit a structured decomposition of the underlying natural parameter matrices to identify shared and individual patterns for two datasets. We also introduce a new measure of the strength of association, and a permutation-based procedure to test its significance. An alternating iteratively reweighted least squares algorithm is devised for model fitting, and several variants are developed to expedite computation and achieve variable selection. The application to the Computer Audition Lab 500-song (CAL500) music annotation study sheds light on the relationship between acoustic features and semantic annotations, and provides an effective means for automatic annotation and music retrieval.
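
As a rough illustration of the permutation step described above, the sketch below tests association between two data blocks by permuting the rows of one block. The association measure used here (the top singular value of the cross-covariance matrix) is a simple stand-in, not the measure proposed in the talk, and the simulated data are illustrative.

```python
# Minimal sketch of a row-permutation test for association between two data
# blocks X (n x p) and Y (n x q); the association measure is a placeholder.
import numpy as np

def association(X, Y):
    """Top singular value of the cross-covariance matrix (stand-in measure)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return np.linalg.svd(Xc.T @ Yc / len(X), compute_uv=False)[0]

def permutation_pvalue(X, Y, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    observed = association(X, Y)
    # Permuting the rows of Y breaks any X-Y association while preserving the
    # marginal structure of each block.
    null = [association(X, Y[rng.permutation(len(Y))]) for _ in range(n_perm)]
    return (1 + np.sum(np.array(null) >= observed)) / (n_perm + 1)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
Y = X[:, :2] @ rng.normal(size=(2, 3)) + rng.normal(size=(100, 3))  # associated
print(permutation_pvalue(X, Y))
```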


Goodness-of-fit Testing of Error Distribution in Linear Measurement Error Models

Hira Koul (Michigan State University)

In this talk we shall discuss a class of goodness-of-fit tests for the error density function in linear measurement error regression models, using deconvolution kernel density estimators of the regression model error density. The test statistic is an analog of the Bickel and Rosenblatt type test statistic. The asymptotic null distribution of the proposed test statistics is derived for both the ordinary smooth and super smooth cases. The consistency against a fixed alternative and the asymptotic power of the proposed tests against a class of local nonparametric alternatives are also obtained for both cases. A finite-sample simulation study shows some superiority of the proposed test over the few other existing tests. Joint work with Weixing Song and Xiaoyu Zhu.
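
For readers unfamiliar with Bickel and Rosenblatt type statistics, here is a simplified sketch of the basic construction: an integrated squared distance between a kernel density estimate of the residuals and the hypothesized error density. It uses ordinary least squares residuals on simulated data and omits the deconvolution adjustment that the measurement-error setting requires; the bandwidth and data are illustrative assumptions.

```python
# Simplified sketch of a Bickel-Rosenblatt type statistic: the integrated
# squared distance between a kernel density estimate of the residuals and the
# hypothesized error density. OLS residuals are used and the deconvolution
# step required under measurement error is omitted.
import numpy as np
from scipy import stats

def bickel_rosenblatt(residuals, null_density, bandwidth, grid_size=512):
    grid = np.linspace(residuals.min() - 3 * bandwidth,
                       residuals.max() + 3 * bandwidth, grid_size)
    # Gaussian kernel density estimate of the residuals on the grid.
    kernel = stats.norm.pdf((grid[:, None] - residuals[None, :]) / bandwidth)
    f_hat = kernel.mean(axis=1) / bandwidth
    # T_n = integral of (f_hat - f_0)^2, approximated on the grid.
    dx = grid[1] - grid[0]
    return np.sum((f_hat - null_density(grid)) ** 2) * dx

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)     # true errors are N(0, 1)
beta = np.polyfit(x, y, deg=1)             # ordinary least squares fit
resid = y - np.polyval(beta, x)
print(bickel_rosenblatt(resid, stats.norm.pdf, bandwidth=n ** (-1 / 5)))
# In practice the statistic is centered and scaled, and its null distribution
# (derived in the talk for the ordinary smooth and super smooth cases) is
# used to calibrate the test.
```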


Bayesian Penalized Regression (and a little MCMC)

Galin Jones (University of Minnesota)

I will consider ordinary least squares, lasso, bridge, and ridge regression methods under a unified framework, in which the particular method is determined by the form of the penalty term, typically chosen by cross validation. The goal is to introduce a fully Bayesian approach that allows selection of the penalty through posterior inference if desired, and to discuss how a model-averaging approach can be used to eliminate the nuisance penalty parameters. Sufficient conditions for the posterior to concentrate near the true regression coefficients as the dimension grows with the sample size will also be discussed.
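
The unified framework can be illustrated with the classical (non-Bayesian) objective: a residual sum of squares plus lam * sum_j |beta_j|^q, where q = 2 gives ridge, q = 1 gives lasso, other q > 0 gives bridge, and lam = 0 recovers ordinary least squares. The sketch below fits this objective by generic numerical optimization on simulated data; it is only a frequentist illustration of the penalty family, not the Bayesian procedure of the talk.

```python
# Frequentist illustration of the unified penalty family: minimize
#   0.5 * ||y - X beta||^2 + lam * sum_j |beta_j|^q
# with q = 2 (ridge), q = 1 (lasso), other q > 0 (bridge), lam = 0 (OLS).
# Settings and data are illustrative; this is not the Bayesian procedure.
import numpy as np
from scipy.optimize import minimize

def penalized_fit(X, y, lam, q):
    def objective(beta):
        resid = y - X @ beta
        return 0.5 * resid @ resid + lam * np.sum(np.abs(beta) ** q)
    # Derivative-free optimizer, since the penalty is non-smooth at zero.
    return minimize(objective, np.zeros(X.shape[1]), method="Powell").x

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
beta_true = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = X @ beta_true + rng.normal(size=200)

for name, lam, q in [("OLS", 0.0, 2), ("ridge", 10.0, 2),
                     ("lasso", 10.0, 1), ("bridge", 10.0, 0.5)]:
    print(name, np.round(penalized_fit(X, y, lam, q), 2))
```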

The resulting posterior is analytically intractable and requires a component-wise Markov chain Monte Carlo algorithm. The MCMC estimation problem is highly multivariate, an issue which has been largely ignored in the MCMC literature. A new relative-volume simulation termination rule will be introduced and connected to a new concept of effective sample size. This allows termination of the simulation in a principled manner.
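
As a simplified, univariate stand-in for the termination idea (the rule in the talk is multivariate), the sketch below estimates an effective sample size from a chain's autocorrelations and stops the simulation once it exceeds a user-chosen target. The AR(1) chain and the target value are illustrative assumptions, not part of the talk.

```python
# Simplified, univariate stand-in for ESS-based termination; the rule in the
# talk is multivariate. The AR(1) chain and the target ESS are illustrative.
import numpy as np

def effective_sample_size(chain, max_lag=200):
    """ESS = n / (1 + 2 * sum of positive lag-k autocorrelations)."""
    x = np.asarray(chain, dtype=float) - np.mean(chain)
    n = len(x)
    var = np.mean(x * x)
    rho = np.array([np.mean(x[:n - k] * x[k:]) / var for k in range(1, max_lag)])
    # Truncate the sum at the first non-positive estimated autocorrelation.
    cutoff = np.argmax(rho <= 0) if np.any(rho <= 0) else len(rho)
    return n / (1.0 + 2.0 * rho[:cutoff].sum())

# Example: an AR(1) chain standing in for component-wise MCMC output.
rng = np.random.default_rng(0)
noise = rng.normal(size=50_000)
chain = np.zeros_like(noise)
for t in range(1, len(chain)):
    chain[t] = 0.9 * chain[t - 1] + noise[t]

target_ess = 2_000
ess = effective_sample_size(chain)
print(f"ESS = {ess:.0f}; stop sampling: {ess >= target_ess}")
```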

Numerical results show that the proposed model and MCMC method tend to select the optimal penalty and perform well in both variable selection and prediction. Examples will be provided.

Past Seminars

Spring 2017, Fall 2016, Spring 2016, Fall 2015, Fall 2014, Fall 2013, Spring 2013, Fall 2012, Spring 2012, Fall 2011, Spring 2011, Fall 2010, Spring 2010, Fall 2009, Spring 2009, Fall 2008, Spring 2008, Fall 2007, Spring 2007, Fall 2006, Spring 2006, Fall 2005, Spring 2005, Fall 2004, Spring 2004, Fall 2003, Spring 2003, Fall 2002, Spring 2002, Fall 2001, Spring 2001, Fall 2000, Spring 2000, Fall 1999