Seminar Schedule
Seminars are held on Thursdays from 4:00 p.m. to 5:00 p.m. in Griffin-Floyd 100 unless otherwise noted.
Refreshments are available before the seminars from 3:30 p.m. to 4:00 p.m. in Griffin-Floyd Hall 230.
Fall 2008
Date  Speaker  Comments  

Sep 18  Robert Dorazio (USGS & UF)  
Sep 25  Minjung Kyung (UF)  
Oct 2  Guido Consonni (University of Pavia)  Tests Based on Intrinsic Priors for the Equality of two Correlated Proportions  
Oct 9  Montserrat Fuentes (NCSU)  
Oct 16  Aixin Tan (UF)  Practical CLT for MCMC Estimators  
Oct 28 (Tue)  Partha Lahiri (UMD)  
Nov 3 (Mon)  Nan Laird (Harvard)  Challis Lecture 
Nov 4 (Tue)  Nan Laird (Harvard)  Testing Gene-Environment Interactions with Nuclear Family Data (Challis Lecture, Emerson Hall Classroom) 
Nov 20  Robert Easterling (Sandia)  Marquardt Speakers Program 
Abstracts
Estimation of Manatee Abundance from Aerial Surveys Using Dual Observers and Removal Sampling 
Robert Dorazio (USGS & UF) Predictions of manatee abundance as a function of habitat characteristics are needed for making conservation decisions for this endangered species. The spatial distribution of manatees in southwest Florida is known to vary across the landscape with access to fresh water for drinking, food availability, warm water refuge, and other factors related to habitat. The spatial distribution of abundance also changes seasonally with changes in water temperature and the threat of mortality from cold stress. Consequently, aerial surveys were developed to estimate abundance of manatees in spatially referenced sample units using a combination of sampling protocols. Groups of manatees were detected using dual observers, and the number of manatees in each group was determined by repeated circling to yield a sequence of "removal" counts. Thus, both kinds of counts (i.e., those of groups and those of manatees within groups) were spatially referenced. A hierarchical modeling framework is developed to estimate maps of manatee abundance while accounting for the imperfect detectability of groups and of individuals within groups. A critical component of these models is the functional dependence between the probability of detecting a group and the group's size, which is unknown but estimable.

Estimation in Dirichlet Random Effects Models 
Minjung Kyung (UF) We develop a new Gibbs sampler for a linear mixed model with a Dirichlet process random effect term, which is easily extended to a generalized linear mixed model with a probit link function. Our Gibbs sampler exploits the properties of the multinomial and Dirichlet distributions, and is shown to be an improvement, in terms of operator norm and efficiency, over other commonly used MCMC algorithms. We also investigate methods for the estimation of the concentration parameter of the Dirichlet process, finding that maximum likelihood may not be desirable, but a posterior mode is a reasonable approach. Examples are given to show how these models perform on real data. Our results complement both the theoretical basis of the Dirichlet process nonparametric prior and the computational work that has been done to date.

Tests Based on Intrinsic Priors for the Equality of two Correlated Proportions 
Guido Consonni (University of Pavia) Correlated proportions arise in longitudinal (panel) studies. A typical example is the "opinion swing" problem: "Has the proportion of people favoring a politician changed after his recent speech to the nation on TV?" Since the same group of individuals is interviewed before and after the speech, the two proportions are correlated. A natural null hypothesis to be tested is whether the corresponding population proportions are equal. A standard Bayesian approach to this problem has already been considered in the literature, based on a Dirichlet prior for the cell probabilities of the underlying two-by-two table under the alternative hypothesis, together with an induced prior under the null. In the absence of specific prior information, a diffuse (e.g. uniform) distribution may be used. We claim that this approach is not satisfactory, since in a testing problem one should make sure that the prior under the alternative is adequately centered around the region specified by the null, in order to obtain a fairer comparison between the two hypotheses, especially when the data are in reasonable agreement with the null. Following an intrinsic prior methodology, we develop two strategies for the construction of a collection of objective priors increasingly peaked around the null. We provide a simple interpretation of their structure in terms of weighted imaginary sample scenarios. We illustrate our method by means of three examples, carrying out sensitivity analysis and providing comparison with existing results. This is joint work with Luca La Rocca, University of Modena and Reggio Emilia, Italy. 

Neighborhood and Environmental Factors Associated with Physical Activity During Pregnancy 
Montserrat Fuentes (NCSU) Physical activity has well-documented health benefits for cardiovascular fitness and weight control. For pregnant women, the American College of Obstetricians and Gynecologists currently recommends 30 minutes of moderate exercise on most, if not all, days; however, very few pregnant women achieve this level of activity. Epidemiologists, policy makers, and city planners are interested in whether characteristics of the physical environment in which women live and work have influence on physical activity levels during pregnancy and in the postpartum period. In this paper we study the associations between physical activity and several factors including personal characteristics, meteorological/air quality variables, and neighborhood characteristics in pregnant women in four counties of North Carolina. We simultaneously analyze six types of physical activity and investigate cross-dependencies between these activity types. Exploratory analysis suggests that the associations are different in different regions. Therefore we use a multivariate regression model with spatially varying regression coefficients. This model includes a regression parameter for each covariate at each spatial location. For our data with many predictors, some form of dimension reduction is clearly needed. We introduce a spatial Bayesian variable selection procedure to identify subsets of important variables. Our stochastic search algorithm determines the probabilities that each covariate's effect is null, non-null but constant across space, and spatially varying. This is joint work with Brian Reich (NCSU) and Amy Herring (Biostatistics, UNC, Chapel Hill).

Practical CLT for MCMC Estimators 
Aixin Tan (UF) In Bayesian inference, it is often of interest to inspect posteriors associated with several different priors. Consider, for example, the Bayesian one-way random effects model. Two different (improper) priors are frequently recommended: the standard diffuse prior and the reference prior. Call the resulting posteriors π and π*. Posterior expectations with respect to either of these posteriors are intractable. If the standard diffuse prior is adopted, there is a simple block Gibbs sampler that can be employed to explore π. As usual, empirical averages can be used to estimate intractable posterior expectations. We have developed a regeneration-based CLT that yields simple, asymptotically valid standard errors for the estimators. The posterior π* is more complicated. Thus, instead of constructing and studying a different MCMC algorithm for π*, we propose using importance sampling estimators based on the block Gibbs sampler for π. We have extended the existing regeneration theory to handle these importance sampling estimators and this yields simple, asymptotically valid standard errors for the importance sampling estimators. Successful application of our regeneration method rests on the assumption that the block Gibbs sampler for π converges to its stationary distribution at a geometric rate. We have proven that, unless the data set is extremely small and unbalanced, the block Gibbs Markov chain enjoys the desired property.

On parametric bootstrap methods in multilevel models with applications in small area estimation and related problems 
Partha Lahiri (UMD) There is a growing demand to assess socioeconomic and health status at both national and subnational levels from complex survey data. However, data availability at the subnational (small area) level from a survey is limited by cost and thus analysts must make the best possible use of all available information. Over the last three decades, multilevel models have been frequently used in a wide range of small area applications. Such models offer great flexibility in combining information from various sources and thus are well suited for solving most small area estimation problems. To estimate small area characteristics, empirical best prediction (EBP) methods are routinely used. One major challenge for this approach is the estimation of an accurate mean squared prediction error (MSPE) of EBP that captures all sources of variation. Different parametric bootstrap methods have been proposed in the literature to solve this problem. However, the basic requirements of second-order unbiasedness and nonnegativity of the MSPE estimator of an empirical best predictor (EBP) have led to different complex analytical adjustments to the naïve parametric bootstrap technique for small area estimation. In this talk, we show a way to recover the basic simplicity in the parametric bootstrap method, i.e. replacement of laborious analytical calculations by computer-oriented simple techniques, without sacrificing the basic requirements in an MSPE estimator. The method works for a general class of multilevel models and different techniques of parameter estimation. The talk is based on my joint work with Professor S. Chatterjee, University of Minnesota. 

Missing Data Problems in Public Health 
Nan Laird (Harvard) Missing data arise commonly in many studies in Public Health and Medicine. Here I will review some cases where statistical methodology has contributed to our ability to overcome limitations in technology and sampling and produce good inferences with imperfect data. We will discuss examples from estimation of radon levels from diffusion battery data, longitudinal cohort studies, and testing for genetic effects with family data. 

Testing Gene-Environment Interactions with Nuclear Family Data 
Nan Laird (Harvard) The widespread availability of genetic markers for samples of reasonable size has intensified interest in testing for gene-environment interactions with complex diseases. Both traditional case-control and family-based designs are used in genetic association studies, the latter having the advantage of eliminating problems due to population substructure, as well as sensitivity to modeling the genetic effect when testing for genetic effects alone. Here we address the issue of extending the family design to test gene-environment interactions. Robustness to population substructure can be maintained, but robustness to model specification cannot. We also discuss joint tests of gene-environment interaction which are generally more powerful, as well as completely robust to the genetic model and population substructure. 

Passion-Driven Statistics 
Robert Easterling (Sandia) Archie Bunker once told his son-in-law, “Don’t give me no stastistics (sic), Meathead! I want facts!” What he was saying was: We statisticians get our kicks from stastistics (i.e., the technical aspects of statistical data analysis), while our clients and collaborators are turned on by the facts (the subject-matter insights provided by data). We create Archie’s impression of a difference between statistics and facts early on by lifeless, sometimes clueless, textbook examples that seem aimed only at teaching formula plug-in; there are no apparent or interesting facts either driving the investigation or revealed by the analysis. If we want people to be passionate (and intelligent) about the use of statistical methods in their work, we need, from the beginning, to connect their enthusiasm for their chosen fields to an appreciation of statistical methods that will help them learn more about their world. The importance of recognizing subject-matter passion also pertains to our collaborative and consulting work in business and industry. Examples will be given along with an overview of the problem of statistical validation of computer models of physical processes and phenomena. 
Past Seminars
Spring 2008  Fall 2007  
Spring 2007  Fall 2006  Spring 2006  Fall 2005 
Spring 2005  Fall 2004  Spring 2004  Fall 2003 
Spring 2003  Fall 2002  Spring 2002  Fall 2001 
Spring 2001  Fall 2000  Spring 2000  Fall 1999 