By: Umarporn Charusombat & Andy Sabalowsky
All of the intervals described below are commonly used to determine whether sets of data are consistent with previous or present sets of data. For instance, when the newest data are compared to background concentrations of a specific compound in the region of interest, is it still consistent to say that the site is not contaminated? The intervals are determined by statistical methods which can be one-tailed or two-tailed, and which can be parametric or non-parametric.
This web page focuses only on parametric methods, and specifically on the assumption that the data are normally distributed or lognormally distributed (meaning the data are normally distributed after the logarithms of the data are taken). These methods are mostly just extensions of hypothesis testing, so if you are not familiar with hypothesis testing, it is a good idea to review that topic first.
A confidence interval is a range of values which spans from the lower confidence limit to the upper confidence limit. We expect this range to encompass the population parameter of interest, such as the population mean, with a degree of certainty which we specify up front. The level of confidence we expect from the determined confidence interval is 1 - alpha, where we pick alpha to be an acceptable risk of being wrong. For instance, if we are willing to take a 5% chance of being wrong (alpha = 0.05), we expect that the confidence interval which we calculate will have a 95% chance of actually containing the population mean between the lower and upper bounds, or confidence limits. To illustrate, with a 95% level of confidence (an alpha of 0.05), suppose one collects 100 different sets of samples, each consisting of 10 values. Each sample set will have its own mean (Xbar) and standard deviation (s), and thus its own confidence interval described by the equations below. Out of these 100 sample sets, we expect about 95 to have confidence intervals which actually contain the population mean (µ), while about 5 will have a range which the population mean falls outside of.
For cases where the population standard deviation (sigma) is known (a rarity), the two-tailed confidence interval for the population mean (µ) can be determined as follows:
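$$\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
where $\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}$ = lower confidence limit and $\bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}$ = upper confidence limit.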
If the sample size (n) is greater than or equal to thirty, the sample standard deviation (s) can be substituted for sigma in the equations above, and the z-values are still appropriate. This is because, for large sample sizes, s is generally very close to sigma, and the central limit theorem holds (Walpole and Myers, 1993).
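The repeated-sampling meaning of a confidence level can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population mean (6.0), sigma (2.8), sample size (10), number of trials, and random seed are all made-up values, and the function names are hypothetical. It builds a known-sigma z-interval for each simulated sample set and counts how often the interval contains the true mean.

```python
import math
import random
from statistics import NormalDist, mean


def z_confidence_interval(sample, sigma, alpha=0.05):
    """Two-tailed confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z * sigma / math.sqrt(len(sample))
    xbar = mean(sample)
    return xbar - half_width, xbar + half_width


def coverage(trials=2000, n=10, mu=6.0, sigma=2.8, alpha=0.05, seed=1):
    """Fraction of simulated sample sets whose interval contains mu.

    All default values are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_confidence_interval(sample, sigma, alpha)
        if lower <= mu <= upper:
            hits += 1
    return hits / trials


print(f"observed coverage: {coverage():.3f}")
```

With alpha = 0.05 the observed fraction should land close to 0.95, though it will vary somewhat with the seed and the number of trials.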
If the variance of the population is unknown and the sample size (n) is less than 30 (and often when the sample size is greater than 30), the confidence interval can be determined using the t-distribution as in the following equation, where t = t-value with n - 1 degrees of freedom for a 1-sample test or n_{1} + n_{2} - 2 degrees of freedom for a 2-sample test:
$$\bar{X} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{X} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
The benefit of using the t-distribution instead of the z-distribution is that the data are only assumed to have come from a normally distributed population for the t-distribution, whereas the use of z-values requires that the data themselves are normally distributed.
There are 1-tailed intervals as well, and they are actually more common in compliance-type situations, where we are interested in substantiating that the site of interest is not contaminated (or that µ is less than or equal to a specified limit). It is actually more appropriate to use tolerance or prediction intervals for compliance monitoring (to be discussed below), but nevertheless, the one-tailed confidence intervals will generally take the form:
$$\mu \le \bar{X} + t_{\alpha}\,\frac{s}{\sqrt{n}} \quad \text{or} \quad \mu \ge \bar{X} - t_{\alpha}\,\frac{s}{\sqrt{n}}$$

Let's look at a simple problem to demonstrate the meaning of a confidence interval...
Suppose we are in an orchard which has twenty apple trees, and we place ten buckets underneath each tree in order to catch falling apples. Let's consider what happens to one tree. From the ten buckets, we find a sample mean (Xbar: total number of apples/ten buckets) of 6 apples per bucket, with a standard deviation (s) of 2.8 apples. If we want to be 95% confident that the mean number of apples per bucket in the orchard is within a range based upon our one tree, assuming the number of apples collected in each bucket is normally distributed, we would use a two-tailed t-value corresponding to our alpha, 0.05 (or 1 - 0.95), and our degrees of freedom, 9 (or n - 1). The two-tailed t-value for alpha = 0.05 and 9 degrees of freedom is 2.262.
Thus, we get a confidence interval of:
6 ± (2.262)(2.8/10^{1/2}) = 6 ± 2.00
So based upon our sample and calculations, we are 95% confident that the mean number of apples per bucket for all the buckets in the orchard is between 4 and 8. Suppose we used the same level of confidence for all calculations and calculated a confidence interval for each tree in the orchard. Suppose also that we determined the actual population mean number of apples per bucket (the total number of apples collected divided by the total number of buckets, 20 x 10) to be 6.0. Given the degree of confidence which we used to determine a confidence interval for each tree (with the same number of buckets per tree and the same alpha value), we expect about 95%, or 19 of our twenty samples (trees), to include our population mean (6 apples per bucket), and about 5%, or 1 out of our twenty samples, to specify a confidence interval which fails to include the value of our population mean.
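The apple calculation can be reproduced in a few lines. This is a minimal sketch: the function name is hypothetical, and the t-value is passed in from a t-table (2.262, as quoted in the example) rather than computed, so that only the Python standard library is needed.

```python
import math


def t_confidence_interval(xbar, s, n, t_value):
    """Two-tailed confidence interval for the mean: xbar +/- t * s / sqrt(n).

    t_value is the tabulated two-tailed t-value for n - 1 degrees of freedom.
    """
    half_width = t_value * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Values from the orchard example: 10 buckets, mean 6, s = 2.8, t = 2.262
lower, upper = t_confidence_interval(xbar=6.0, s=2.8, n=10, t_value=2.262)
print(f"{lower:.2f} to {upper:.2f}")  # 4.00 to 8.00
```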
Unlike the confidence interval, which estimates the range in which a population parameter falls, the tolerance interval estimates the range which should contain a certain percentage of the individual measurements in the population. Because tolerance intervals are based upon only a sample of the entire population, we cannot be 100% confident that the interval will contain the specified proportion. Thus there are two different proportions associated with the tolerance interval: a degree of confidence, and a percent coverage. For instance, we may be 95% confident that 90% of the population will fall within the range specified by the tolerance interval. The tolerance interval takes the general form:
$$\bar{X} \pm K s$$
where s = standard deviation and K = the factor to adjust the width of the interval, which can be found in tables such as the one provided below or calculated. Recall that we may be interested in a 1-tailed interval, which would simply have a "+" or a "-" in the above equation, and we would use 1-tailed values instead of 2-tailed values for K.
$\bar{X} + Ks$ (or $\bar{X} - Ks$) = 1-tailed tolerance interval; $\bar{X} \pm Ks$ = 2-tailed tolerance interval. Here, K is chosen for a percent coverage of 1 - alpha and a level of confidence of 1 - gamma; note that 1 - alpha represents the percent coverage, and not the level of confidence.
Tolerance factors K (the tabulated values correspond to 95% confidence and 95% coverage; "-" marks a value not listed)
n          Two Sided K   One Sided K
3          9.916         7.655
4          6.370         5.145
5          5.079         4.202
6          4.414         3.707
7          4.007         3.399
8          3.732         3.188
9          3.532         3.031
10         3.379         2.911
11         3.259         2.815
12         3.162         2.736
13         3.081         2.670
14         3.012         2.614
15         2.954         2.566
16         2.903         2.536
17         2.858         2.486
18         2.819         2.453
19         2.784         2.423
20         2.752         2.396
21         -             2.371
22         -             2.350
23         -             2.329
24         -             2.309
25         2.631         2.292
30         2.549         2.220
35         2.490         2.166
40         2.445         2.126
45         2.408         2.092
50         2.379         2.065
55         2.354         2.036
60         2.333         2.017
65         2.315         2.000
70         2.299         1.986
75         2.285         1.927
80         2.272         -
85         2.261         -
90         2.251         -
95         2.241         -
100        2.233         1.924
125        -             1.891
150        2.175         1.868
175        -             1.850
200        2.143         1.836
225        -             1.824
250        2.121         1.814
275        -             1.806
300        2.106         1.799
400        2.084         1.777
500        2.070         1.763
600        2.060         1.752
700        2.052         1.744
800        2.046         1.737
900        2.040         1.732
1000       2.036         1.727
infinity   1.960         1.645
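When a table is not at hand, a one-sided K can be approximated in closed form. The sketch below uses a widely quoted normal-based approximation (often attributed to Natrella); it is an assumption that this is suitable here, it is not necessarily the equation this page originally displayed, and it is noticeably less accurate for very small n. The function name and its arguments are hypothetical.

```python
import math
from statistics import NormalDist


def one_sided_k_approx(n, coverage, confidence):
    """Approximate one-sided tolerance factor K (normal-based approximation).

    coverage   : proportion of the population to be covered (1 - alpha)
    confidence : level of confidence (1 - gamma)
    This is an approximation, not the exact tabulated factor.
    """
    z_p = NormalDist().inv_cdf(coverage)
    z_g = NormalDist().inv_cdf(confidence)
    a = 1 - z_g ** 2 / (2 * (n - 1))
    b = z_p ** 2 - z_g ** 2 / n
    if a <= 0:
        raise ValueError("sample size too small for this approximation")
    return (z_p + math.sqrt(z_p ** 2 - a * b)) / a


# Compare with tabulated one-sided values for 95% coverage, 95% confidence
for n, tabulated in [(10, 2.911), (20, 2.396), (100, 1.924)]:
    approx = one_sided_k_approx(n, coverage=0.95, confidence=0.95)
    print(f"n={n:4d}  approx K={approx:.3f}  table K={tabulated:.3f}")
```

The approximation tracks the table more closely as n grows, so for small samples the tabulated factors are the safer choice.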
Tolerance Interval Example:
Let's look at a simple problem to demonstrate the meaning of a tolerance interval...
Suppose we climb up onto a platform and drop a handful of marbles (of which there are twenty) and measure how far from a specified point on the ground each one lands. As it turns out, the average distance (Xbar) from the center is 8.5 inches, with a standard deviation of 3.7 inches. Suppose now we want to get a bucket of a size which will catch 99% of the marbles ever dropped from our special marble-dropping platform, with a confidence level of 95%. This means we have a gamma equal to 1 - 0.95, or 0.05, and an alpha of 1 - 0.99, or 0.01. Be careful here! Described as such (notation used by Walpole and Myers, 1993), alpha relates to the proportion of the population covered (the coverage is 1 - alpha), NOT to the degree of confidence, as in the confidence interval calculations. Always make sure you are clear about what the variable names mean when you obtain constants from tables!
Since we are concerned with catching all marbles that land within a radius less than or equal to the radius of our bucket, and are not simply interested in figuring out the range which contains 99% of the radii, we will use a one-tailed, instead of two-tailed, tolerance interval. Thus, for a sample size (n) of 20, gamma of 0.05, and an alpha of 0.01, the one-tailed K-value is 3.295, and we get a tolerance limit of:
8.5 + (3.295)(3.7) = 8.5 + 12.2 = 20.7
This is our upper tolerance limit: we are 95% confident that 99% of the marbles will fall within 20.7 inches of the center. This means that, if we drop 100 marbles from our platform 100 times, we expect to catch at least 99 marbles about 95 of those times, and fewer than 99 marbles about 5 of those times, if we use a bucket with a radius of 20.7 inches. That's an awfully big bucket, but think about how confidently we can catch many marbles!
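The marble calculation reduces to a single line of arithmetic once K is known. The following sketch simply restates it in code; the function name is hypothetical, and K = 3.295 is taken from the example rather than computed.

```python
def upper_tolerance_limit(xbar, s, k):
    """One-tailed (upper) tolerance limit: xbar + K * s."""
    return xbar + k * s


# Values from the marble example: mean 8.5 in, s = 3.7 in, one-tailed K = 3.295
radius = upper_tolerance_limit(xbar=8.5, s=3.7, k=3.295)
print(f"bucket radius: {radius:.1f} inches")  # bucket radius: 20.7 inches
```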
While confidence and tolerance intervals estimate present population characteristics, the prediction interval estimates what future values will be, based upon present or past background samples taken. As few as one future value can be estimated, and as few as four background values can be used to determine prediction limits (the minimum recommended in order to determine a standard deviation). The United States EPA recommends using 8 or more samples for constructing prediction intervals (EPA/530-R-93-003). The prediction interval attempts to determine what future values will be with a degree of confidence, just as in the confidence and tolerance intervals. For example, we may attempt to predict that the next set of samples will fall within a determined range, with 99% confidence. To calculate prediction limits, we first must know a sample mean and standard deviation, based upon background data of sample size n. Once we decide how many sampling periods (k) and how many samples will be collected per sampling period (m), we can determine the prediction interval by using the same generic equation:
$$\bar{X} \pm K s$$
where this time K is determined by the equation below (shown in its 2-tailed form). Recall, again, that we may be interested in a 1-tailed interval, which would simply have a "+" or a "-" in the above equation, and we would use 1-tailed values instead of 2-tailed values for K.
$$K = t_{(n-1,\;1-\alpha/(2k))}\,\sqrt{\frac{1}{m} + \frac{1}{n}}$$

At Andy's chicken farm there are many chickens of many colors. Once a day he goes to the field and catches some to sell at the market. One day, in his spare time, Andy determined that he had caught an average of 25 red chickens each week, with a standard deviation of 5 red chickens per week, since he started his business 121 weeks ago. If the false positive rate (alpha) is 0.10, approximately how many red chickens will he catch in each of the next 2 weeks?
Use the following equation for a two-sided prediction interval:
$$\bar{X} \pm t_{(n-1,\;1-\alpha/(2k))}\,\sqrt{\frac{1}{m} + \frac{1}{n}}\;s$$
where t(n-1, 1-alpha/(2k)) = t(121-1, 0.975). The number of future periods is 2 weeks, so k = 2, and with alpha = 0.10 (90% confidence) we have 1 - 0.10/(2 x 2) = 0.975. At the 0.975 level, the t(120, 0.975) value is 1.980. Although Andy catches his chickens 7 times a week, each future value being predicted is a single weekly total, so m = 1. Now we can solve the equation as:
25 ± 1.980*(1/121 + 1/1)^{1/2}*5 = 25 ± 9.9, which gives 15.1 and 34.9
That means we are 90% confident that, in each of the next two weeks, Andy will catch between 15.1 and 34.9 red chickens. Or to think of it another way, because alpha is split across the two future weeks, each individual week has roughly a 95% chance of falling within this range and roughly a 5% chance of falling either below 15.1 or above 34.9 red chickens.
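The chicken example can be checked numerically. This is a minimal sketch in which the function name is hypothetical and the t-value (1.980 for 120 degrees of freedom at the 0.975 level) is supplied from a table, as in the text, rather than computed.

```python
import math


def prediction_interval(xbar, s, n, t_value, m=1):
    """Two-sided prediction interval: xbar +/- t * s * sqrt(1/m + 1/n).

    n       : number of background observations
    m       : number of future samples averaged per period (1 = single value)
    t_value : tabulated t for n - 1 degrees of freedom at level 1 - alpha/(2k)
    """
    half_width = t_value * s * math.sqrt(1 / m + 1 / n)
    return xbar - half_width, xbar + half_width


# Values from the example: 121 weeks of data, mean 25, s = 5, t(120, 0.975) = 1.980
lower, upper = prediction_interval(xbar=25, s=5, n=121, t_value=1.980, m=1)
print(f"{lower:.1f} to {upper:.1f} red chickens per week")  # 15.1 to 34.9
```

Because n is large here, the 1/n term contributes very little; the width is driven almost entirely by the week-to-week standard deviation.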
When are these Intervals Appropriate?
Typically, since confidence intervals are based upon sample standard deviations, confidence interval calculations require sample sizes of four or more, as recommended by the EPA (EPA/530-R-93-003). Fewer data points result in wider confidence intervals; thus, larger sample sizes are preferred, since a narrow interval is more useful. Remember, confidence intervals only apply to parameters, and not to individual measurements. Thus, confidence intervals are only useful in estimating what a population parameter, such as the mean, should be; they do not tell us anything about the range of the individual values in the population.
Tolerance intervals are more applicable in areas such as compliance monitoring, because they tell us what the individual values should be. If the upper limit of a tolerance interval which is calculated from a sample set is higher than the set standard, then there is a high probability (1 - gamma) that more than a proportion alpha of the measurements are above the standard, and thus that the site is in violation. As few as three data points can be used to generate a tolerance interval, but the EPA recommends having at least eight points for the interval to have any usefulness (EPA/530-R-93-003).
As the name suggests, the prediction interval is useful in determining what future values should be, based upon present or past data. Prediction intervals are especially powerful because they can predict what a future compliance point should be less than before it is even collected, as opposed to having to wait until the data are collected in order to determine the tolerance interval and then compare it to standards. Another advantage is that as few as one future sample (k=1) can be used in determining the prediction interval, rather than a sample size of 8 or more for confidence or tolerance intervals. Thus, in areas such as groundwater monitoring, where a long period of time must pass and few data points can be collected, prediction intervals are especially useful.
Applications
Confidence intervals are mostly used in general statistical analyses to tell us the range in which the population mean will fall. They cannot be used in detection monitoring or in comparing individual measurements to health or environmental standards, because the confidence interval cannot give the highest concentration (the value we are often most concerned with), but only the average concentration of a population. Confidence intervals are appropriate, however, for compliance monitoring in groundwater where downgradient samples are being compared to set standards. In other words, is the sample mean greater than or equal to the standard?
The tolerance interval gives us an idea of what range each individual measurement should fall within. Thus, it is especially useful in compliance monitoring when one is concerned with Maximum Contaminant Levels (MCLs). The tolerance interval already takes into account the fact that some values will be high. So if a few values exceed the MCL standard, a site may still not be in violation (because the calculated tolerance interval may still be lower than the MCL). But if too many values are above the MCL, the calculated tolerance interval will extend beyond the acceptable standard.
Prediction intervals tend to be applied in detection monitoring in two main ways. They can be used either to compare compliance wells with background wells, or they can be used for intrawell comparisons of monitoring wells. When comparing compliance wells to a background well, if the compliance wells draw from the same, uncontaminated water source, the upper prediction limit should be greater than or equal to the data collected from the compliance wells. Compliance data greater than the upper prediction limit are indicative of contamination. For intrawell comparisons, a range of values is determined within which future values collected from that same well should fall. Any data collected in the future which do not fall within that specified range are an indication that a once uncontaminated water supply is now contaminated.
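The background-versus-compliance comparison described above can be sketched as follows. Everything numeric here is an assumption made for illustration: the eight background concentrations and two compliance values are invented, the function name is hypothetical, and the one-tailed t-value of 1.895 (7 degrees of freedom, 0.95 level, one future comparison) is read from a standard t-table rather than computed.

```python
import math
from statistics import mean, stdev


def upper_prediction_limit(background, t_value, m=1):
    """One-tailed upper prediction limit: xbar + t * s * sqrt(1/m + 1/n)."""
    n = len(background)
    xbar = mean(background)
    s = stdev(background)  # sample standard deviation (n - 1 denominator)
    return xbar + t_value * s * math.sqrt(1 / m + 1 / n)


# Hypothetical background-well concentrations (arbitrary units)
background = [4.1, 3.8, 4.5, 4.0, 3.6, 4.3, 3.9, 4.2]
# Hypothetical compliance-well measurements to screen
compliance = [4.4, 5.6]

upl = upper_prediction_limit(background, t_value=1.895)
flags = [value > upl for value in compliance]

print(f"upper prediction limit: {upl:.2f}")
for value, exceeded in zip(compliance, flags):
    status = "exceeds limit" if exceeded else "within limit"
    print(f"  {value}: {status}")
```

If several future comparisons are planned, the t-value would need to be taken at a correspondingly adjusted level, as in the prediction interval equation.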
References:
Walpole, Ronald E. & Raymond H. Myers, Probability and Statistics for Engineers and Scientists, 5th Ed., Prentice Hall, Inc., Englewood Cliffs, NJ, 1993.
Environmental Protection Agency, Statistical Training Course for Ground-Water Monitoring Data Analysis, EPA/530-R-93-003, Office of Solid Waste, Washington, DC, 1992.
Send comments or suggestions to:
Student Authors: Umarporn Charusombat & Andy Sabalowsky
Faculty Advisor: Daniel Gallagher, dang@vt.edu
Copyright © 1997 Daniel Gallagher
Last Modified: 09-10-1997