Alan Agresti, University of Florida
Dealing with Discreteness: Improved Confidence Intervals for
Proportions, Differences of Proportions, and Odds Ratios
The standard large-sample confidence intervals for proportions and their
differences used in introductory statistics courses have poor
performance: the actual confidence level can be much lower
than the nominal level. "Exact" intervals have limited use because
their discreteness makes them very conservative. However,
simple adjustments of the large-sample intervals based on adding two
successes and two failures have surprisingly good performance even for
small samples. To illustrate, for n1 = n2 = 10, a nominal 95%
confidence interval for p1 - p2 has actual coverage probability below
.93 for 88% of (p1, p2) pairs in the unit square with the standard
interval but in only 1% with the adjusted interval; the mean distance
between the nominal and actual coverage probabilities is .06 for the
standard interval but .01 for the adjusted one. In teaching with
these adjusted confidence intervals, one can bypass awkward sample
size guidelines and use the same formulas for small and large samples.
Similar adjustments (and related Bayesian methods) work well in other
discrete problems, such as confidence intervals for Poisson means and
for odds ratios.
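
The adjustment described above can be sketched in a few lines of Python.
This is a minimal illustration, not code from the talk: the function names
are hypothetical, and for the two-sample case it assumes the two added
successes and two added failures are split evenly, one of each per sample.
The coverage function computes the exact coverage probability at a given
(p1, p2) by summing binomial probabilities over all possible outcomes.

```python
from math import comb, sqrt

Z95 = 1.959964  # 0.975 quantile of the standard normal (nominal 95% level)

def wald_diff(x1, n1, x2, n2, z=Z95):
    """Standard large-sample interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - z * se, d + z * se

def adjusted_diff(x1, n1, x2, n2, z=Z95):
    """Same formula after adding one success and one failure to each
    sample (assumption: two of each in total, split evenly)."""
    return wald_diff(x1 + 1, n1 + 2, x2 + 1, n2 + 2, z)

def adjusted_prop(x, n, z=Z95):
    """Single proportion: add two successes and two failures."""
    p = (x + 2) / (n + 4)
    se = sqrt(p * (1 - p) / (n + 4))
    return p - z * se, p + z * se

def coverage(interval, n1, n2, p1, p2):
    """Exact coverage probability of `interval` at (p1, p2): total
    probability of the outcomes whose interval contains p1 - p2."""
    total = 0.0
    for x1 in range(n1 + 1):
        prob1 = comb(n1, x1) * p1**x1 * (1 - p1) ** (n1 - x1)
        for x2 in range(n2 + 1):
            lo, hi = interval(x1, n1, x2, n2)
            if lo <= p1 - p2 <= hi:
                prob2 = comb(n2, x2) * p2**x2 * (1 - p2) ** (n2 - x2)
                total += prob1 * prob2
    return total

if __name__ == "__main__":
    # Compare the two intervals at one illustrative point, n1 = n2 = 10.
    for name, f in [("standard", wald_diff), ("adjusted", adjusted_diff)]:
        print(name, round(coverage(f, 10, 10, 0.2, 0.4), 4))
```

Evaluating coverage over a grid of (p1, p2) values in the unit square is
one way to explore summaries like those quoted above. Note that with zero
successes in both samples the standard interval collapses to the single
point 0, whereas the adjusted interval keeps positive width.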