Confidence intervals for the mean

The mean of a sample is only close to the true expected value μ; even with a very large sample size, exact equality with the expected value cannot be guaranteed. It is therefore essential to quantify the error between the estimate and the true value of the quantity.

In theory, large deviations can still occur for a large sample size, but for a large random sample the probability of this is very low. The basic idea of confidence intervals is to provide a procedure for calculating an interval, the confidence interval, that contains the value to be estimated with a prescribed high probability, the confidence level.

In practice, one wants confidence intervals that are as narrow as possible (high precision) at a confidence level that is as high as possible (high reliability). For a fixed sample size, however, raising the confidence level widens the confidence interval. The only way to raise the confidence level without widening the interval is to increase the sample size.
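The trade-off can be made concrete with a small sketch. It assumes the known-variance interval under a normal distribution, whose half-width is the normal quantile times σ/√n; the function name half_width and the chosen values of α and n are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def half_width(alpha: float, sigma: float, n: int) -> float:
    """Half-width of the known-variance interval: z_{1-alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    return z * sigma / sqrt(n)


# Smaller alpha (higher confidence level) widens the interval;
# a larger sample size n narrows it again.
for alpha in (0.10, 0.05, 0.01):
    for n in (25, 100, 400):
        print(f"alpha={alpha:.2f}  n={n:3d}  half-width={half_width(alpha, 1.0, n):.3f}")
```

Because the half-width shrinks only with √n, halving it at a fixed level requires four times as many observations.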

Confidence intervals under a normal distribution assumption are especially popular. Here the feature is assumed to follow a normal distribution whose expected value is to be estimated. The variance can either be known or be estimated from the sample as well. In both cases the resulting formulas for the confidence interval are relatively simple. The normal case matters beyond normally distributed data because μ is estimated by the sample mean, and by the central limit theorem the sample mean is approximately normally distributed under other distributional assumptions too. The calculation rules derived for the normal distribution can therefore also be used under other distributional assumptions. In that case they yield confidence intervals that have approximately the prescribed confidence level for large sample sizes n.
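For reference, the two standard textbook intervals under the normal distribution assumption can be written as follows. The notation follows the interactive figure further down (m for the sample mean, σ0² for the sample variance, δ for the half-width). The n-1 divisor in the sample variance and the use of the Student t quantile are assumptions of this sketch, since the text does not spell the formulas out.

```latex
% Sample mean
m = \frac{1}{n} \sum_{i=1}^{n} x_i

% Variance \sigma^2 known: standard normal quantile z
\delta = z_{1-\alpha/2} \, \frac{\sigma}{\sqrt{n}}

% Variance estimated from the sample: Student t quantile with n-1 degrees of freedom
\sigma_0^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - m)^2,
\qquad
\delta = t_{n-1;\,1-\alpha/2} \, \frac{\sigma_0}{\sqrt{n}}

% In both cases the interval is
[\, m - \delta ;\; m + \delta \,]
```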

Exact explanation of the confidence level

Let the random variables X1,...,Xn describe the observed feature in the experiment. Because the same experiment is repeated and the results do not affect each other, these random variables are independent and identically distributed. From them a lower limit U(X1,...,Xn) and an upper limit O(X1,...,Xn) are calculated such that U(X1,...,Xn)≤μ≤O(X1,...,Xn) holds with a probability of at least 1-α. The interval [U(X1,...,Xn),O(X1,...,Xn)] is then called a confidence interval for μ at the confidence level 1-α.

One way to satisfy the condition for a confidence interval at level 1-α is to choose U(X1,...,Xn) and O(X1,...,Xn) such that the events μ<U(X1,...,Xn) and μ>O(X1,...,Xn) each occur with a probability of no more than α/2. This yields symmetrical confidence intervals.
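Why this choice is sufficient can be spelled out in one line. Writing U and O for the two limits, the interval misses μ exactly when one of the two events occurs, and since U≤O the two events cannot occur together, so their probabilities add:

```latex
P(U \le \mu \le O)
  = 1 - P(\mu < U) - P(\mu > O)
  \ge 1 - \frac{\alpha}{2} - \frac{\alpha}{2}
  = 1 - \alpha
```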

In practice, the calculation rules that define the functions U and O are of course applied not to random variables but to concrete realizations (x1,...,xn). The confidence level cannot be interpreted as a probability for a single interval calculated this way, because once the sample has been drawn neither the values x1,...,xn nor the value μ to be estimated is random. But if the method for calculating the confidence interval is applied repeatedly to newly drawn samples, then in the long run the proportion of samples for which the calculated confidence interval contains μ will be at least 1-α.
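The long-run reading can be checked with a small simulation. The sketch below assumes normally distributed data with known variance and uses the normal-quantile interval. The function name coverage, the seed and the parameter values are illustrative choices, not part of the original text.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, alpha, repetitions, seed=0):
    """Fraction of repeated samples whose interval [m - delta, m + delta] contains mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    delta = z * sigma / sqrt(n)  # half-width; fixed because sigma is known
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = fmean(sample)
        if m - delta <= mu <= m + delta:
            hits += 1
    return hits / repetitions


# The printed proportion should lie close to 1 - alpha = 0.95.
print(coverage(mu=0.0, sigma=1.0, n=20, alpha=0.05, repetitions=10_000))
```

Each individual interval either contains μ or does not; only the proportion over many repetitions approaches 1-α.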

Function of the interactive figure

A normally distributed random sample is generated using the parameters below. The associated confidence interval for the mean is calculated and displayed. The parameter α sets the confidence level 1-α and thereby regulates the width of the interval. If the expected value μ lies within the confidence interval (that is, if the estimation was correct), the interval is shown in blue, otherwise in red. The type of interval can be chosen from the selection box: in one case the variance is known and enters the calculation; in the other the variance is estimated from the sample.
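What the figure does can be approximated offline with the following sketch. It is not the figure's actual implementation: the function confidence_interval and the parameter values are hypothetical. For the estimated-variance case it uses the normal quantile as a large-sample approximation, because the exact Student t quantile is not available in the Python standard library.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, variance


def confidence_interval(sample, alpha, known_variance=None):
    """Return (m - delta, m + delta) for the mean of `sample`.

    If `known_variance` is given it is used directly; otherwise the
    variance is estimated from the sample (n - 1 divisor).
    """
    n = len(sample)
    m = fmean(sample)
    # Assumption: the normal quantile is used in both cases. With an
    # estimated variance this is only a large-sample approximation; the
    # exact interval uses the Student t quantile with n - 1 degrees of freedom.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    var = known_variance if known_variance is not None else variance(sample)
    delta = z * sqrt(var / n)
    return m - delta, m + delta


# Hypothetical values standing in for the figure's input fields.
mu, sigma2, n, alpha = 5.0, 4.0, 200, 0.05
rng = random.Random(1)
sample = [rng.gauss(mu, sqrt(sigma2)) for _ in range(n)]

for label, known in (("variance known", sigma2), ("variance estimated", None)):
    lower, upper = confidence_interval(sample, alpha, known)
    colour = "blue" if lower <= mu <= upper else "red"  # same colouring rule as the figure
    print(f"{label}: [{lower:.3f}; {upper:.3f}] -> {colour}")
```

For small n the approximation in the estimated-variance branch gives intervals that are somewhat too narrow, which is why the t quantile is normally used there.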

Generating a sample
Expected value μ=
Variance σ²=
Number of random values n=
 
Type of the confidence interval
 
Confidence interval for level 1-α
Level α=
Mean of the sample m=
Variance of the sample σ0²=
δ=
Confidence interval [m-δ;m+δ]=