| Honest Exploration of Intractable Probability Distributions Via Markov
Chain Monte Carlo |
|
Dr. Jim Hobert (Sep. 15)
Two important questions that must be answered whenever a Markov chain
Monte Carlo algorithm is used are (Q1) What is an appropriate burn-in?
and (Q2) How long should the sampling continue after burn-in? One
method of developing rigorous answers to these questions involves
establishing drift and minorization conditions, which together imply
that the underlying Markov chain is geometrically ergodic. In this
talk, I will explain what drift and minorization are as well as how
and why these conditions can be used to form rigorous answers to (Q1)
and (Q2).
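For reference, one common way of writing the two conditions is sketched below. The notation (transition kernel P, state space X, function V, constants lambda, b, d, epsilon, and probability measure Q) is assumed here for illustration and is not taken from the talk itself.

```latex
% Drift condition: for some V : \mathsf{X} \to [0,\infty), \lambda \in [0,1), b < \infty,
(PV)(x) = \int_{\mathsf{X}} V(y)\, P(x, dy) \;\le\; \lambda V(x) + b
\qquad \text{for all } x \in \mathsf{X}.

% Minorization condition on the set C = \{x : V(x) \le d\}, for suitably large d:
% for some \varepsilon > 0 and probability measure Q,
P(x, A) \;\ge\; \varepsilon\, Q(A)
\qquad \text{for all } x \in C \text{ and all measurable } A \subseteq \mathsf{X}.
```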
|
| Robustness of Estimators of Location to Distortion |
|
Demetris Athienitis (Sep. 22)
Robustness measures and procedures have traditionally been
developed and implemented through the contamination model, a model that
partially samples from a contaminant distribution. Hampel (1968)
developed a heuristic tool, the influence function, that represents the
rate of change in a statistic when an infinitesimal amount of
contamination is added. Multiple robustness properties have been derived using
this tool. We present a distortion model that takes a symmetric
probability density function and skews it in a specific direction thereby
distorting every realization of the distribution. The model used is based
upon the Fechner class of distributions. A plethora of statistical
procedures assume distributional symmetry and we set out to derive various
robustness properties for the distortion model, of which symmetry is only
a special case, by introducing the distortion sensitivity, an analog to
Hampel's influence function.
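As a rough illustration of the influence-function idea under the classical contamination view (not the distortion sensitivity proposed in the talk), the sketch below computes a finite-sample sensitivity curve: the scaled change in an estimate when one extra observation x is added. The function name and the toy sample are hypothetical.

```python
import statistics

def sensitivity_curve(estimator, sample, x):
    """Finite-sample analog of the influence function:
    n * (T(sample + [x]) - T(sample)), where n is the enlarged sample size."""
    n = len(sample) + 1
    return n * (estimator(list(sample) + [x]) - estimator(sample))

# Toy symmetric sample (an assumption for illustration only).
base = [-2.0, -1.0, 0.0, 1.0, 2.0]

for x in (10.0, 1000.0):
    sc_mean = sensitivity_curve(statistics.mean, base, x)
    sc_median = sensitivity_curve(statistics.median, base, x)
    print(f"x={x}: mean SC={sc_mean:.3f}, median SC={sc_median:.3f}")
```

In this toy setting the curve for the mean grows without bound as x moves away from the sample, while the curve for the median levels off, which is the usual sense in which the median is the more robust location estimator.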
|
| Semiparametric Bayesian Inference for Phage Display Experiments |
|
Dr. Luis Leon (Sep. 29)
We address inference for a human phage display experiment with three
stages. The data are tripeptide counts by tissue and stage. The
primary aim of the experiment is to identify ligands that bind with
high affinity to a given tissue. We formalize the research question
as inference about the monotonicity of mean counts over stages. The
inference goal is then to identify a list of peptide-tissue pairs
with significant increase over stages. We develop a semi-parametric
model as a mixture of Poisson distributions with a Dirichlet process
prior on the mixing measure. The posterior distribution under this
model allows the desired inference about the monotonicity of mean
counts. However, the desired inference summary as a list of
peptide-tissue pairs with significant increase involves a massive
multiplicity problem. We consider two alternative approaches to
address this multiplicity issue. First we propose an approach based
on the control of the posterior expected false discovery rate. We
notice that the implied solution ignores the relative size of the
increase. This motivates a second approach based on a utility
function that includes explicit weights for the size of the
increase.
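The first approach can be illustrated with a minimal sketch, assuming each peptide-tissue pair already has a posterior probability of an increase over stages. The function name, the probabilities, and the threshold below are hypothetical, and the rule shown is one standard way of bounding the posterior expected false discovery rate, not necessarily the exact rule used in the talk.

```python
def select_by_posterior_fdr(post_probs, alpha):
    """Return indices of the largest list whose posterior expected FDR,
    the average of (1 - posterior probability) over the list, is <= alpha."""
    # Rank pairs from most to least probable increase.
    order = sorted(range(len(post_probs)),
                   key=lambda i: post_probs[i], reverse=True)
    selected = []
    error_sum = 0.0
    for rank, i in enumerate(order, start=1):
        error_sum += 1.0 - post_probs[i]
        # The running average is nondecreasing, so stop at the first violation.
        if error_sum / rank > alpha:
            break
        selected = order[:rank]
    return selected

# Hypothetical posterior probabilities of an increase for five pairs.
probs = [0.99, 0.60, 0.95, 0.20, 0.90]
reported = select_by_posterior_fdr(probs, alpha=0.10)
print(reported)
```

The rule depends only on the posterior probabilities, so a tiny but near-certain increase ranks ahead of a large but less certain one. That is the limitation that motivates the utility-based alternative.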
|
| Introduction to Just Another Gibbs Sampler (JAGS) |
|
Rebecca Steorts (Oct. 13)
Bayesian inference often requires evaluating integrals that are intractable, so numerical methods such as Markov chain Monte Carlo are used to approximate the posterior distributions of interest. When models are complex, specialized software is often needed. BUGS (Bayesian Inference Using Gibbs Sampling) is an example of such software that can be extremely useful in these situations. I will briefly explain the three BUGS programs available (WinBUGS, OpenBUGS, JAGS). I will then give a tutorial on how to use JAGS (Just Another Gibbs Sampler) and explain how to analyze output from JAGS in R.
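To give a sense of what such software automates, here is a minimal hand-written Gibbs sampler in Python. It is not JAGS or BUGS code. The model (normal data with a normal prior on the mean and a gamma prior on the precision), the prior settings, and the simulated data are all assumptions chosen for illustration.

```python
import random
import statistics

def gibbs_normal(y, n_iter=2000, burn_in=500, seed=0,
                 m0=0.0, p0=1e-3, a=0.01, b=0.01):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau) with priors
    mu ~ Normal(m0, 1/p0) and tau ~ Gamma(shape=a, rate=b)."""
    rng = random.Random(seed)
    n, total = len(y), sum(y)
    mu, tau = statistics.mean(y), 1.0
    mu_draws, tau_draws = [], []
    for it in range(n_iter):
        # mu | tau, y is normal with precision p0 + n * tau.
        prec = p0 + n * tau
        mu = rng.gauss((p0 * m0 + tau * total) / prec, prec ** -0.5)
        # tau | mu, y is gamma with shape a + n/2 and rate b + SS/2.
        ss = sum((yi - mu) ** 2 for yi in y)
        # gammavariate takes a scale parameter, so pass 1 / rate.
        tau = rng.gammavariate(a + n / 2.0, 1.0 / (b + ss / 2.0))
        if it >= burn_in:
            mu_draws.append(mu)
            tau_draws.append(tau)
    return mu_draws, tau_draws

# Simulated data (hypothetical): 50 draws with mean 3 and standard deviation 2.
data_rng = random.Random(1)
y = [data_rng.gauss(3.0, 2.0) for _ in range(50)]
mu_draws, tau_draws = gibbs_normal(y)
print("posterior mean of mu:", statistics.mean(mu_draws))
```

JAGS constructs samplers of this kind from a model description, so the full conditional distributions do not have to be derived and coded by hand.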
|
| Bayesian Semiparametric Analysis of Case-Control Studies with
Longitudinally Varying Exposures. |
|
Dhiman Bhadra (Oct. 20)
Case-control studies mark the single most important contribution that
statisticians have made in the domain of Public Health and Epidemiology.
The fundamental principle of these studies is to compare a group of
subjects having a particular disease (cases) to a group of disease free
subjects (controls) with respect to some potential risk factors. In a
typical case-control study design, the exposure information is collected
only once for the cases and controls. However, some recent medical studies
have indicated that a longitudinal approach of incorporating the entire
exposure history, when available, may lead to better estimates of odds
ratios. In this work, we have conducted an analysis of a case-control
study when longitudinal exposure information is available for both the
cases and controls. This has enabled us to analyze how the present
disease status of a subject is being influenced by past exposure
conditions conditional on the current ones. We have used semiparametric
regression procedures in modeling the exposure profiles. Analysis has been
carried out in a hierarchical Bayesian framework using MCMC procedures.
Based on our analysis, we can conclude that past exposure observations do
contribute significantly to our understanding of the present status of an
individual.
|
| Optimal Crawling of Websites |
|
Vik Gopal (Nov. 3)
Search engines need to maintain an index of websites that is as up-to-date as possible. Otherwise, search queries would be answered inaccurately. Search engines use web crawlers to visit websites in order to check if they have changed. If they have, then the new version is indexed. This raises the question, "At what time point should I return to a particular website?" Keeping in mind that there is also a cost to visiting websites too frequently, we try to answer this question in both frequentist and Bayesian settings.
|
| Generalized Additive Models and their Implementation in R |
|
Kenny Lopiano (Nov. 17)
Additive models and generalized additive models (GAMs) are important tools for modeling a response variable Y as a function of several predictor variables X_1, X_2, ..., X_n without restricting the model to be linear in the parameters. With this in mind, we will provide a brief overview of the existing methods associated with GAMs. We also provide an introduction to implementing such models using the R package mgcv.
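For reference, the usual form of a GAM can be written as follows. The link function g, intercept beta_0, and smooth functions f_j are standard notation assumed here rather than taken from the talk; an additive model is the special case where g is the identity.

```latex
g\bigl(\mathrm{E}[\,Y \mid X_1, \dots, X_n\,]\bigr)
  = \beta_0 + \sum_{j=1}^{n} f_j(X_j),
\qquad f_j \text{ smooth and estimated from the data.}
```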
|
| Comparing Spatial Predicted Surfaces Using 2D and 3D Plots |
|
Nate Holt (Nov. 24)
In spatial modeling the predicted surface is often visualized using a 2D heat map. An example will show that 3D surface plots may reveal discrepancies between the predicted surfaces of competing spatial models that cannot be detected with 2D heat maps.
I will also talk about fully funded summer school opportunities in North Carolina through SAMSI, the Statistics and Applied Mathematical Sciences Institute.
|