Confidence intervals for the mean

The mean of a sample is only close to the actual expected value μ; even with a very large sample size, exact equality with the expected value cannot be guaranteed. It is therefore an essential task to quantify the error between the estimate and the true value of the quantity.

In theory, large deviations can still occur for large sample sizes, but the probability of this is very low for a large random sample. The basic idea of confidence intervals is to provide a procedure for calculating an interval, the confidence interval, which contains the value to be estimated with a very high probability; this probability is the confidence level.

In practice, one looks for confidence intervals that are as narrow as possible (high precision) at a very high confidence level (high reliability). For a fixed sample size, however, increasing the confidence level widens the confidence interval. The only way to increase the confidence level without widening the interval is to increase the sample size.
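The trade-off can be made concrete with a small Python sketch. It assumes the simplest case, a normally distributed feature with known standard deviation σ, where the half-width of the interval is z·σ/√n and z is the (1-α/2) quantile of the standard normal distribution. The helper name half_width and all numbers are illustrative assumptions, not taken from the text.

```python
from math import sqrt
from statistics import NormalDist


def half_width(alpha: float, sigma: float, n: int) -> float:
    """Half-width of the two-sided interval for known sigma (illustrative helper)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # (1 - alpha/2) quantile of N(0, 1)
    return z * sigma / sqrt(n)


# Fixed n: a higher confidence level (smaller alpha) widens the interval.
w95 = half_width(alpha=0.05, sigma=1.0, n=100)
w99 = half_width(alpha=0.01, sigma=1.0, n=100)

# A higher level at (roughly) the same width needs a larger sample.
w99_large = half_width(alpha=0.01, sigma=1.0, n=173)

print(round(w95, 3), round(w99, 3), round(w99_large, 3))
```

In this sketch, moving from the 95% to the 99% level at n = 100 widens the interval, and about 173 observations are needed to bring the 99% interval back to the width of the 95% one.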

Confidence intervals under the assumption of a normal distribution are especially popular: here it is assumed that the feature follows a normal distribution whose expected value is to be estimated. The variance can either be known or be estimated from the sample as well. In both cases, relatively simple formulas for the confidence intervals arise. The importance of the normal-distribution case is that μ is estimated by the sample mean, and by the central limit theorem the sample mean is approximately normally distributed under other distributional assumptions as well. Therefore, the calculation rules for the normal distribution can also be used under different distributional assumptions. In this case they provide confidence intervals that approximately attain the predetermined confidence level for large sample sizes n.
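For reference, the two standard intervals can be written as follows. The notation is partly an assumption: m denotes the sample mean and σ₀² the sample variance (taken here with divisor n-1), z_{1-α/2} is the quantile of the standard normal distribution, and t_{n-1;1-α/2} is the quantile of Student's t distribution with n-1 degrees of freedom.

```latex
% Known variance \sigma^2:
\left[\, m - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}},\;\;
         m + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}} \,\right]

% Variance estimated from the sample:
\left[\, m - t_{n-1;\,1-\alpha/2}\,\frac{\sigma_0}{\sqrt{n}},\;\;
         m + t_{n-1;\,1-\alpha/2}\,\frac{\sigma_0}{\sqrt{n}} \,\right],
\qquad
\sigma_0^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - m)^2
```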

Exact explanation of the confidence level

Let the random variables X₁,...,X_n describe the observed feature in the experiment. Because the same experiment is repeated and the results do not affect each other, these random variables are independent and identically distributed. Based on them, a lower limit U(X₁,...,X_n) and an upper limit O(X₁,...,X_n) are calculated such that U(X₁,...,X_n)≤μ≤O(X₁,...,X_n) occurs with a probability of at least 1-α. The interval [U(X₁,...,X_n),O(X₁,...,X_n)] is then called a confidence interval for μ at the confidence level 1-α.

One way to satisfy the condition for a confidence interval at level 1-α is to choose U(X₁,...,X_n) and O(X₁,...,X_n) such that the events μ<U(X₁,...,X_n) and μ>O(X₁,...,X_n) each occur with a probability of at most α/2. This yields symmetrical confidence intervals.

In practice, the calculation rules that define the functions U and O are of course not applied to random variables but to concrete realizations (x₁,...,x_n). A probability interpretation of the confidence level for a single confidence interval calculated in this way is not possible, because after the sample has been drawn neither the values x₁,...,x_n nor the value μ to be estimated is random. But if the method for calculating the confidence interval is applied repeatedly to newly drawn samples, then in the long run the proportion of samples for which the calculated confidence interval contains μ will be at least 1-α.
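This long-run reading can be checked with a short simulation. The sketch below assumes normally distributed data with known standard deviation and uses the interval m ± z·σ/√n; the function name coverage, the seed, and the parameter values are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu: float, sigma: float, n: int, alpha: float,
             repetitions: int, seed: int = 0) -> float:
    """Fraction of simulated samples whose interval contains mu (known sigma)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    delta = z * sigma / sqrt(n)  # half-width is the same for every sample
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = fmean(sample)
        if m - delta <= mu <= m + delta:
            hits += 1
    return hits / repetitions


rate = coverage(mu=2.0, sigma=3.0, n=25, alpha=0.05, repetitions=4000)
print(f"observed coverage: {rate:.3f}")
```

Each individual interval either contains μ or does not; only the proportion over many repetitions should settle near 1-α, here about 0.95.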

Function of the interactive figure

A normally distributed random sample is generated using the parameters below. The associated confidence interval for the mean is calculated and displayed. The parameter α determines the confidence level 1-α and thereby controls the width of the interval. If the expected value μ lies within the confidence interval (that is, if the estimation was correct), the interval is shown in blue, otherwise in red. The type of interval can be chosen from the selection box: in one case the variance is known and is used in the calculation; in the other, the variance is estimated from the sample.
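What the figure does for one sample can be approximated in a few lines of Python. This is a sketch of the known-variance option only, not the page's actual implementation; the function name figure_step, the returned keys, and the example parameters are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, variance


def figure_step(mu, sigma2, n, alpha, seed=None):
    """Draw one normal sample and build the known-variance interval [m-delta; m+delta]."""
    rng = random.Random(seed)
    sample = [rng.gauss(mu, sqrt(sigma2)) for _ in range(n)]
    m = fmean(sample)          # mean of the sample
    s2 = variance(sample)      # sample variance (divisor n-1), reported only
    delta = NormalDist().inv_cdf(1 - alpha / 2) * sqrt(sigma2 / n)
    lower, upper = m - delta, m + delta
    # Blue if the interval covers the true expected value, red otherwise.
    colour = "blue" if lower <= mu <= upper else "red"
    return {"mean": m, "sample_variance": s2, "delta": delta,
            "lower": lower, "upper": upper, "colour": colour}


result = figure_step(mu=0.0, sigma2=1.0, n=20, alpha=0.05, seed=1)
print(result)
```

For the estimated-variance option the same steps would apply with the sample standard deviation in place of σ and a Student t quantile in place of the normal one; the Python standard library does not provide a t quantile, so that case is left out of this sketch.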

Generating a sample
Expected value μ=
Variance σ²=
Number of random values n=


Type of the confidence interval
Variance:

Confidence interval for level α
Level α=
Mean of the sample m=
Variance of the sample σ₀²=

δ=
*Confidence interval [m-δ;m+δ]=*

