Alan Agresti, University of Florida

Dealing with Discreteness: Improved Confidence Intervals for Proportions, Differences of Proportions, and Odds Ratios

The standard large-sample confidence intervals for proportions and their differences taught in introductory statistics courses perform poorly: the actual coverage probability can be much lower than the nominal level. "Exact" intervals have limited use because discreteness makes them very conservative. However, simple adjustments of the large-sample intervals, based on adding two successes and two failures, perform surprisingly well even for small samples. To illustrate, for n1 = n2 = 10, a nominal 95% confidence interval for p1 - p2 has actual coverage probability below .93 for 88% of (p1, p2) pairs in the unit square with the standard interval but for only 1% with the adjusted interval; the mean distance between the nominal and actual coverage probabilities is .06 for the standard interval but .01 for the adjusted one. In teaching with these adjusted confidence intervals, one can bypass awkward sample size guidelines and use the same formulas for small and large samples. Similar adjustments (and related Bayesian methods) work well in other discrete problems, such as confidence intervals for Poisson means and for odds ratios.
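
As a rough illustration of the adjustment described above, the following Python sketch computes the standard (Wald) and adjusted intervals and evaluates exact coverage by summing binomial probabilities. The function names are hypothetical, chosen only for this example. For the difference of proportions, the sketch assumes the two added successes and two added failures are split evenly, one success and one failure per sample; the abstract does not spell out that allocation.

```python
from math import comb, sqrt

Z95 = 1.959963984540054  # 0.975 quantile of the standard normal


def wald_interval(x, n, z=Z95):
    """Standard large-sample interval for one proportion."""
    p = x / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


def plus_four_interval(x, n, z=Z95):
    """Wald formula applied after adding two successes and two failures."""
    return wald_interval(x + 2, n + 4, z)


def wald_diff_interval(x1, n1, x2, n2, z=Z95):
    """Standard large-sample interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    half = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2 - half, p1 - p2 + half


def adjusted_diff_interval(x1, n1, x2, n2, z=Z95):
    """Wald formula after adding one success and one failure to each
    sample (assumed allocation: two successes, two failures in total)."""
    return wald_diff_interval(x1 + 1, n1 + 2, x2 + 1, n2 + 2, z)


def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def coverage_diff(interval, n1, n2, p1, p2):
    """Exact coverage probability of an interval method at (p1, p2):
    total probability of the outcomes whose interval contains p1 - p2."""
    total = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            lo, hi = interval(x1, n1, x2, n2)
            if lo <= p1 - p2 <= hi:
                total += binom_pmf(x1, n1, p1) * binom_pmf(x2, n2, p2)
    return total


if __name__ == "__main__":
    for name, method in [("standard", wald_diff_interval),
                         ("adjusted", adjusted_diff_interval)]:
        cov = coverage_diff(method, 10, 10, 0.1, 0.05)
        print(f"{name}: coverage at (0.1, 0.05) = {cov:.3f}")
```

Evaluating `coverage_diff` over a grid of (p1, p2) values in the unit square would give a comparison of the kind quoted above. Note that with zero successes in both samples the standard interval collapses to the single point 0, whereas the adjusted interval keeps a positive width.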