Seminars are held on Thursdays from 4:00 to 5:00 p.m. in Griffin-Floyd 100 unless otherwise noted.
Refreshments are available before the seminars from 3:30 to 4:00 p.m. in Griffin-Floyd Hall 230.
|Jan 22 (Tue)||Dawn Woodard (Duke)||Convergence of Parallel and Simulated Tempering on Multimodal Distributions|
|Feb 7||Guilherme Rocha (Berkeley)||Designing Penalty Functions for Grouped and Hierarchical Selection|
|Feb 12 (Tue)||Omar De la Cruz (Chicago)||A geometric approach in the analysis of biological data|
|Feb 14||Bodhisattva Sen (U Mich)||Bootstrap in some Non-standard Problems|
|Feb 28||Lie Wang (U Penn)||A Difference Based Method in Nonparametric Function Estimation|
|Mar 27||Xiaogang Su (UCF)||Subgroup Analysis via Recursive Partitioning|
|Apr 3||Sanat Sarkar (Temple)||A Review of Results on False Discovery Rate|
|Apr 11 (Fri)||Arthur Berg (UF)||Nonparametric Estimation at a Semiparametric Rate||UF/FSU Colloquium (in Tallahassee)|
|Apr 17||Alan Agresti (UF)||Pseudo-Score Confidence Intervals for Categorical Data Analyses|
|Convergence of Parallel and Simulated Tempering on Multimodal Distributions|
Dawn Woodard (Duke)
Sampling methods such as Markov chain Monte Carlo are ubiquitous in Bayesian statistics, statistical mechanics, and theoretical computer science. However, when the distribution being sampled is multimodal many of these techniques require long running times to obtain reliable answers. In statistics, multimodal posterior distributions arise in model selection problems, mixture models, and nonparametric models among others. Parallel and simulated tempering (PT and ST) are Markov chain methods that are designed to sample efficiently from multimodal distributions; we address the extent to which this is achieved.
We obtain general bounds on the convergence rate of PT and ST. We then use these bounds to evaluate the running time of PT and ST as a function of the parameter dimension, for multimodal examples including several normal mixture and discrete Markov random field distributions. We categorize the distributions into those for which PT and ST are rapidly mixing, meaning that the running time increases polynomially in the parameter dimension, and those for which PT and ST are torpidly mixing, meaning that the running time increases exponentially.
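The scheme described above can be illustrated with a minimal parallel-tempering sketch for a two-mode target; the bimodal density, temperature ladder, and step size below are toy choices of mine, not the examples analyzed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Toy multimodal target: mixture of two well-separated Gaussians.
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

def parallel_tempering(n_iter=20000, temps=(1.0, 3.0, 9.0), step=1.0):
    K = len(temps)
    x = np.zeros(K)          # one chain per temperature
    samples = []
    for _ in range(n_iter):
        # Metropolis update within each tempered chain.
        for k in range(K):
            prop = x[k] + step * rng.standard_normal()
            if np.log(rng.random()) < (log_target(prop) - log_target(x[k])) / temps[k]:
                x[k] = prop
        # Propose swapping the states of a random adjacent temperature pair.
        k = rng.integers(K - 1)
        log_alpha = (1 / temps[k] - 1 / temps[k + 1]) * (log_target(x[k + 1]) - log_target(x[k]))
        if np.log(rng.random()) < log_alpha:
            x[k], x[k + 1] = x[k + 1], x[k]
        samples.append(x[0])  # keep only the cold chain
    return np.array(samples)
```

The hot chains flatten the barrier between the modes, and the swap moves let the cold chain inherit mode jumps it could rarely make on its own.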
|Designing Penalty Functions for Grouped and Hierarchical Selection|
Guilherme Rocha (Berkeley)
Recently, penalization by the L1-norm (lasso) has received a great deal of attention. L1-penalized estimates are cheaper to compute (the optimization problem is convex) and lead to more stable model estimates than their L0 counterparts.
In this talk, I will present the Composite Absolute Penalties (CAP) family of penalties. CAP penalties allow given grouping and hierarchical relationships between the predictors to be expressed. They are built by defining groups of variables and combining the properties of norm penalties at the across-group and within-group levels. Grouped selection occurs for non-overlapping groups. Hierarchical variable selection is reached by defining groups with particular overlapping patterns.
Under easily verifiable assumptions, CAP penalties are convex: an attractive property from a computational standpoint. Within this subfamily, unbiased estimates of the degrees of freedom (df) exist, so the regularization parameter can be selected without cross-validation.
Simulation results show that CAP improves on the predictive performance of the LASSO for cases with p>>n and mis-specified groupings.
This is joint work with Peng Zhao and Bin Yu.
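The group lasso (an L2 norm within each group, summed across groups) is the best-known member of the CAP family; a minimal sketch of evaluating such a penalty and its blockwise shrinkage operator, with hypothetical function names, follows.

```python
import numpy as np

def cap_group_penalty(beta, groups, gamma=2.0):
    # Sum over groups of the within-group L_gamma norm; gamma = 2 with
    # non-overlapping groups recovers the group lasso penalty.
    return sum(np.linalg.norm(beta[g], ord=gamma) for g in groups)

def group_soft_threshold(beta, groups, lam):
    # Blockwise proximal step for the gamma = 2 penalty: each group is
    # shrunk toward zero and can be dropped as a whole, giving grouped
    # selection.
    out = beta.copy()
    for g in groups:
        nrm = np.linalg.norm(beta[g])
        out[g] = 0.0 if nrm <= lam else (1 - lam / nrm) * beta[g]
    return out
```

Hierarchical selection in CAP arises from overlapping groups; the non-overlapping case above is only the simplest instance.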
|A geometric approach in the analysis of biological data|
Omar De la Cruz (Chicago)
This approach leads in a natural way to the problem of manifold learning. We will present a new approach to this problem, and describe a way to integrate manifold learning with more traditional statistical procedures, like regression, as a way to obtain more easily interpretable inferences. In particular, after learning the geometry of a manifold underlying the overall distribution of the data, one can use that manifold as a "generalized predictor" in regression analyses.
This is useful in at least two ways: First, it makes it possible to establish which coordinates contribute significantly to the geometric structure (in the cell cycle case, one can establish or verify the annotation of genes as cycle regulated). Second, it makes it possible to adjust for the influence of the geometric structure, in order to more accurately measure other properties of the individuals. For example, in studies of gene expression in single cells, adjusting for the cell cycle allows a more accurate estimation of other characteristic (e.g., finding cell subpopulations). In our second example, we show how this can be used to adjust for population structure in genetic association studies, as a non-linear generalization of the approach based on principal components.
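The linear special case mentioned at the end, adjusting for population structure via the top principal components, can be sketched as follows; the function name and dimensions are illustrative only.

```python
import numpy as np

def pc_adjust(y, X, n_pcs=2):
    # Residualize a phenotype y on the top principal components of the
    # genotype matrix X: the linear analogue of adjusting for an
    # underlying (possibly non-linear) manifold structure.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = U[:, :n_pcs]                       # PC scores
    Z = np.column_stack([np.ones(len(y)), P])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef                    # structure-adjusted phenotype
```

In the manifold version, the learned low-dimensional coordinates would play the role of the PC scores `P`.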
|Bootstrap in some Non-standard Problems|
Bodhisattva Sen (U Mich)
The talk will consider some issues with the consistency of different bootstrap methods for constructing confidence intervals in two non-standard problems characterized by shape-restricted estimation. The study of the consistency of bootstrap methods in these problems is motivated by the problem of estimating the dark matter distribution in astronomy.
The Grenander estimator, the nonparametric maximum likelihood estimator of an unknown non-increasing density function f on [0, ∞), is a prototypical example of a class of shape constrained estimators that converge at rate cube-root n. We focus on this example and illustrate different approaches of constructing confidence intervals for f(t0), for 0 < t0 < ∞. It is claimed that the bootstrap estimate of the sampling distribution of the Grenander estimator, when generating bootstrap samples from the empirical distribution function (e.d.f.) or its least concave majorant (the maximum likelihood estimate), does not have any weak limit, conditional on the data, in probability.
The other problem arises in astronomy and is similar to Wicksell's corpuscle problem (1925, Biometrika). We observe (X1, X2), the first two co-ordinates of a three-dimensional spherically symmetric random vector (X1, X2, X3). Interest focuses on estimating F, the distribution function of X1^2 + X2^2 + X3^2. This gives rise to an inverse problem with missing data. We propose two estimators of F and derive their limit distributions. Although the normalized estimators of F converge to a normal distribution, the non-standard asymptotics involved, with the non-standard rate of convergence (n / log n)^(1/2), cast doubt on the consistency of bootstrap methods. We focus on bootstrapping from the e.d.f. of the data, and show that the estimates can be bootstrapped consistently. A comparison of the two examples sheds light on some of the reasons for the (in)consistency of bootstrap methods.
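The Grenander estimator itself (the left derivative of the least concave majorant of the e.d.f.) can be sketched via an upper convex hull of the e.d.f.'s vertices; the bootstrap analysis discussed in the talk is not reproduced here.

```python
import numpy as np

def grenander(x, t0):
    # Grenander estimator of a non-increasing density on [0, inf):
    # slope of the least concave majorant (LCM) of the e.d.f. at t0.
    xs = np.sort(x)
    n = len(xs)
    pts = [(0.0, 0.0)] + [(xs[i], (i + 1) / n) for i in range(n)]
    hull = []
    for p in pts:                # upper (concave) hull, left to right
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                hull.pop()       # drop vertices lying below the majorant
            else:
                break
        hull.append(p)
    # density estimate at t0 = slope of the LCM segment covering t0
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= t0 <= x2:
            return (y2 - y1) / (x2 - x1)
    return 0.0                   # beyond the largest observation
```

Because the LCM is concave, the resulting density estimate is automatically a non-increasing step function.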
|A Difference Based Method in Nonparametric Function Estimation|
Lie Wang (U Penn)
Variance function estimation and semiparametric regression are important problems in many contexts with a wide range of applications. In this talk I will present some new results on these two problems. A consistent theme is the use of a difference-based method. I will begin with a minimax analysis of variance function estimation in heteroscedastic nonparametric regression. The results indicate that, contrary to common practice, it is often not desirable to base the estimator of the variance function on the residuals from an optimal estimator of the mean. Instead it is desirable to use estimators of the mean with minimal bias. The results also correct the optimal rate claimed in Hall and Carroll (1989, JRSSB). I will then consider adaptive estimation of the variance function using a wavelet thresholding approach. A data-driven estimator is constructed by applying wavelet thresholding to the squared first-order differences of the observations. The variance function estimator is shown to be nearly optimally adaptive to the smoothness of both the mean and variance functions. Finally I will discuss a difference-based procedure for semiparametric partial linear models. The estimation procedure is optimal in the sense that the estimator of the linear component is asymptotically efficient and the estimator of the nonparametric component is minimax rate optimal. Some numerical results will also be discussed.
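The basic difference-based idea is easy to sketch for a constant noise variance: differencing annihilates a smooth mean, so the squared first-order differences estimate 2*sigma^2. (The talk's heteroscedastic version applies wavelet thresholding to those squared differences rather than averaging them.)

```python
import numpy as np

def diff_variance(y):
    # First-order difference estimator of a constant noise variance:
    # for a smooth mean, E[(y_{i+1} - y_i)^2] ~ 2 * sigma^2.
    d = np.diff(y)
    return np.mean(d ** 2) / 2.0
```

Note that no estimate of the mean function is ever formed, which is the point of the difference-based approach.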
|Subgroup Analysis via Recursive Partitioning|
Xiaogang Su (UCF)
Subgroup analysis is an integral part of comparative studies such as clinical trials. Its goal is to determine whether and how the effect of an investigational treatment varies across subpopulations. We propose an interaction tree (IT) procedure to help delineate the heterogeneity structure of the treatment effect. The proposed method automatically produces a number of objectively defined subgroups, in some of which the treatment effect may be found prominent while in others the treatment may have a negligible or even negative effect. We follow the standard CART (Breiman et al., 1984) methodology to construct trees. Important effect-modifiers for the treatment are also extracted via random forests of interaction trees. Both simulated experiments and an example assessing the pay gap between women and men are provided for illustration.
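A toy version of the split search behind such a tree, choosing the cut of one covariate that maximizes the standardized difference in treatment effects between the two child nodes, might look like the following; the names and the specific statistic are illustrative, not the authors' exact criterion.

```python
import numpy as np

def best_interaction_split(x, trt, y, min_node=10):
    # For each candidate cut of covariate x, compare the treatment effect
    # (treated-minus-control mean) in the two child nodes and keep the cut
    # with the largest squared standardized difference of effects.
    best = (None, -np.inf)
    for c in np.unique(x)[:-1]:
        effects, vars_ = [], []
        ok = True
        for node in (x <= c, x > c):
            a, b = y[node & (trt == 1)], y[node & (trt == 0)]
            if len(a) < min_node or len(b) < min_node:
                ok = False
                break
            effects.append(a.mean() - b.mean())
            vars_.append(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        if ok:
            stat = (effects[0] - effects[1]) ** 2 / (vars_[0] + vars_[1])
            if stat > best[1]:
                best = (c, stat)
    return best          # (cut point, interaction statistic)
```

Growing a full tree would apply this search recursively to each child node, CART-style.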
|A Review of Results on False Discovery Rate|
Sanat Sarkar (Temple)
Multiple testing has again become a flourishing area of research due to its increased relevance in modern statistical investigations. As some of the notions of error used in traditional multiple testing procedures turn out to be too conservative when testing a large number of hypotheses, which is common in these investigations, alternative and more appropriate measures of error have been developed. Among these, the false discovery rate (FDR) and those related to it have received the most attention. In this talk, I will review results on the FDR.
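The best-known FDR-controlling method in such a review is the Benjamini-Hochberg step-up procedure; a minimal sketch:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    # BH step-up rule: with sorted p-values p_(1) <= ... <= p_(m), find the
    # largest k with p_(k) <= k*q/m and reject the k corresponding
    # hypotheses; this controls the FDR at level q for independent (and
    # PRDS) test statistics.
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest passing index
        reject[order[: k + 1]] = True
    return reject
```

Note that it is a step-up rule: a p-value above its own threshold can still be rejected if a later one passes.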
|Nonparametric Estimation at a Semiparametric Rate|
Arthur Berg (UF)
At the core of this talk is the use of infinite-order kernels in a density estimation context with a specially tailored data-based bandwidth selection algorithm. In particular, we will focus on estimation of the polyspectrum in time series analysis and of the hazard function in survival analysis. Additionally, the improvement gained, in terms of deficiency (Hodges and Lehmann, 1970), by smoothing the empirical distribution function and the Kaplan-Meier estimator will also be detailed. The talk will be peppered with a familiar group representation related to the symmetries of the polyspectrum.
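The smoothed empirical distribution function mentioned above replaces the e.d.f.'s jumps with integrated kernels; here is a sketch using an ordinary Gaussian kernel (the infinite-order kernels and the bandwidth selector of the talk are not reproduced).

```python
import numpy as np
from math import erf, sqrt

def smoothed_edf(data, x, h):
    # Kernel-smoothed e.d.f.: F_h(x) = (1/n) * sum_i Phi((x - X_i) / h),
    # where Phi is the standard normal CDF. A Gaussian kernel is used here
    # purely for illustration.
    z = (x - np.asarray(data, dtype=float)) / h
    return float(np.mean([0.5 * (1.0 + erf(zi / sqrt(2.0))) for zi in z]))
```

As h shrinks to 0 this recovers the e.d.f. itself; the deficiency results quantify how much the smoothing helps.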
|Pseudo-Score Confidence Intervals for Categorical Data Analyses|
Alan Agresti (UF)
This talk surveys confidence intervals that result from inverting score or pseudo-score tests for parameters summarizing categorical data. Such methods perform well, usually better than inverting Wald or likelihood-ratio tests. For some models ordinary score inferences are impractical, such as when the likelihood function is not an explicit function of the model parameters. For such cases, we propose pseudo-score inference based on a Pearson-type chi-squared statistic that compares fitted values for a working model with fitted values of the model when a parameter of interest takes a fixed value. For multinomial models, this interval simplifies to the large-sample score interval when the model is saturated but otherwise can be much simpler to construct. Possible generalizations of the method include a quasi-likelihood approach for discrete data. For small samples, 'exact' methods are inferentially conservative, but inverting a score test using the mid-P value provides a sensible compromise. Finally, we briefly review a different pseudo-score approach that approximates the score interval for proportions and their differences with independent or dependent samples by adding pseudo data before forming simple Wald confidence intervals.
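For a single binomial proportion, the score (Wilson) interval and the add-pseudo-data approximation to it mentioned at the end can be sketched as follows.

```python
from math import sqrt

def wilson_interval(x, n, z=1.96):
    # Score (Wilson) interval: invert the score test for a binomial
    # proportion; the center is a shrunken version of x/n.
    phat = x / n
    center = (phat + z**2 / (2 * n)) / (1 + z**2 / n)
    half = (z / (1 + z**2 / n)) * sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2))
    return center - half, center + half

def add_pseudo_data_interval(x, n, z=1.96):
    # "Add pseudo data" approximation: add z^2/2 successes and z^2/2
    # failures, then form the ordinary Wald interval around the adjusted
    # proportion.
    nt = n + z**2
    pt = (x + z**2 / 2) / nt
    half = z * sqrt(pt * (1 - pt) / nt)
    return pt - half, pt + half
```

The two intervals share the same midpoint; the pseudo-data version is slightly wider but requires only a Wald-type calculation.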
|Spring 2007||Fall 2006||Spring 2006||Fall 2005|
|Spring 2005||Fall 2004||Spring 2004||Fall 2003|
|Spring 2003||Fall 2002||Spring 2002||Fall 2001|
|Spring 2001||Fall 2000||Spring 2000||Fall 1999|