
This lab will introduce you to calculating confidence intervals using a set of data provided. As discussed in lecture, confidence intervals are often calculated either to assess how representative a sample is of a greater population or to compare datasets. The standard layout of a confidence interval calculation is:

Mean value ± critical value * standard error

Remember, though, that you need to decide which parameters to use for each part of the formula. As discussed in class, this depends on whether the population standard deviation is known, the size of your sample, and whether the data are proportions (%) or not.

Demonstration Data: Atmospheric CO2 has been strongly linked with rising global temperatures. This linkage has become increasingly apparent over the last few decades. In spite of efforts to curb the production of greenhouse gases, such as the Kyoto Protocol, CO2 seems to be on the rise. To assess how much it has increased, you will be working with CO2 concentrations (in parts per million [ppm]) measured by the National Oceanic and Atmospheric Administration (NOAA) since 1959. These data were collected in Hawaii. Unfortunately, the sensors used to measure the data are expensive, and there are challenges with deploying measurement equipment in remote locations, especially where research facilities lack funding.

We can use confidence intervals for the available data to assess how representative the data are of a greater area (i.e. the region of the south Pacific Ocean). This lab will walk you through how to calculate 95% confidence intervals. In other words, we will be 95% confident that the true mean for the entire south Pacific region lies within the confidence intervals that you calculate. If interested, the source of the data can be found here: ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_annmean_mlo.txt

You will then apply these steps to the data from Lab 1 to evaluate how representative the temperature data in Inuvik are of the entire region.

Finding Confidence Intervals with R
Data
Suppose we’ve collected a random sample of 10 recently graduated students and asked them
what their annual salary is. Imagine that this is the data we see:
> x
 [1] 44617  7066 17594  2726  1178 18898  5033 37151  4514  4000
Goal: Estimate the mean salary of all recently graduated students. Find a 90% and a 95%
confidence interval for the mean.
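To follow along, the sample can be entered as a numeric vector. This is only a small sketch: the name `x` matches the transcript above, while `n` and `xbar` are helper names introduced here for convenience.

```r
# Salary sample from the transcript above (10 recent graduates)
x <- c(44617, 7066, 17594, 2726, 1178, 18898, 5033, 37151, 4514, 4000)

n <- length(x)     # sample size
xbar <- mean(x)    # sample average, the centre of each interval computed later

n
xbar
```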
Setting I: Assume that incomes are normally distributed with unknown mean and SD = \$15,000.
A (1 – alpha)100% CI is
Xbar +- z(alpha/2) * sigma/sqrt(n)
We know n = 10, and are given sigma = 15000.
a) 90% CI.
This means alpha = .10. We can get z(alpha/2) = z(0.05) from R:
> qnorm(.95)
[1] 1.644854
OR
> qnorm(.05)
[1] -1.644854
And the sample average is just:
> mean(x)
[1] 14277.7
So our margin of error is
> me <- 1.644*(15000/sqrt(10))
> me
[1] 7798.177
The lower and upper bounds are:
> mean(x) - me
[1] 6479.523
> mean(x) + me
[1] 22075.88
So our 90% CI is (\$6480, \$22076).
b) For a 95% CI, alpha = .05. All of the steps are the same, except we replace z(.05) with
z(.025):
> me <- qnorm(.975)*(15000/sqrt(10))
> me
[1] 9296.925
> mean(x) - me
[1] 4980.775
> mean(x) + me
[1] 23574.63
The new interval, (\$4981, \$23575), is wider, but we are more confident that it contains the true
mean.
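The known-SD steps above can be collected into one function. The following is a minimal sketch, not part of the original handout: `z_ci` and its arguments `sigma` and `conf` are hypothetical names, and the function uses the unrounded critical value from `qnorm`, so its 90% bounds can differ by a few dollars from a calculation that rounds z by hand.

```r
# Salary sample from the transcript above
x <- c(44617, 7066, 17594, 2726, 1178, 18898, 5033, 37151, 4514, 4000)

# Hypothetical helper: z-based CI for the mean when sigma is known
z_ci <- function(x, sigma, conf = 0.95) {
  alpha <- 1 - conf
  # margin of error = z(alpha/2) * sigma / sqrt(n)
  me <- qnorm(1 - alpha / 2) * sigma / sqrt(length(x))
  c(lower = mean(x) - me, upper = mean(x) + me)
}

ci90 <- z_ci(x, sigma = 15000, conf = 0.90)
ci95 <- z_ci(x, sigma = 15000, conf = 0.95)
ci90
ci95
```

Raising `conf` increases the critical value, which is why the 95% interval comes out wider than the 90% one.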
Setting II: Same problem, only now we do not know the value for the SD. Therefore, we must
estimate it from the data:
> sd(x)
[1] 15345.95
Now a (1-alpha)100% CI looks like
Xbar +- t(alpha/2, df) * s/sqrt(n)
We just calculated s = 15345 and n = 10 still. Xbar is still 14277.
1. 90% CI
Alpha = .10.
All we need is the t-value. Because the degrees of freedom are n-1 = 10-1 = 9:
> qt(.95,9)
[1] 1.833113
> me <- qt(.95,9)*(sd(x)/sqrt(10))
> me
[1] 8895.76
> mean(x) - me
[1] 5381.94
> mean(x) + me
[1] 23173.46
So the 90% CI is (5382, 23173). Note that this is wider than the last 90% CI.
2. 95% CI. Now alpha = .05.
> me <- qt(.975,9)*(sd(x)/sqrt(10))
> me
[1] 10977.83
> mean(x) - me
[1] 3299.868
> mean(x) + me
[1] 25255.53
So the 95% CI is (3300, 25256).
Note that the lower end is getting dangerously close to 0! This is also the widest interval
yet.
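The t-based calculation can be sketched as a function and cross-checked against R's built-in `t.test`, whose `conf.int` component holds the same interval. The helper name `t_ci` and its `conf` argument are hypothetical, introduced only for this illustration.

```r
# Salary sample from the transcript above
x <- c(44617, 7066, 17594, 2726, 1178, 18898, 5033, 37151, 4514, 4000)

# Hypothetical helper: t-based CI for the mean when sigma is unknown
t_ci <- function(x, conf = 0.95) {
  n <- length(x)
  # margin of error = t(alpha/2, n - 1) * s / sqrt(n)
  me <- qt(1 - (1 - conf) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(lower = mean(x) - me, upper = mean(x) + me)
}

t90 <- t_ci(x, conf = 0.90)
t95 <- t_ci(x, conf = 0.95)
t90
t95

# Cross-check with the built-in one-sample t-test
builtin90 <- t.test(x, conf.level = 0.90)$conf.int
builtin95 <- t.test(x, conf.level = 0.95)$conf.int
builtin90
builtin95
```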
Setting III:
Now we no longer assume the data are normal. A look at the histogram and the qqnorm
plot of x shows that this wasn’t such a great assumption to begin with.
So at best, the confidence intervals from above are approximate. The approximation, however,
might not be very good.
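The two diagnostic plots mentioned above can be drawn with base R graphics. This is a sketch only; the plot title and axis label are illustrative choices, not taken from the handout.

```r
# Salary sample from the transcript above
x <- c(44617, 7066, 17594, 2726, 1178, 18898, 5033, 37151, 4514, 4000)

# Histogram: a long right tail argues against a normal model
hist(x, main = "Histogram of salaries", xlab = "Annual salary")

# Normal Q-Q plot: points that bend away from the reference line
# suggest the data are not normally distributed
qqnorm(x)
qqline(x)
```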
A bootstrap interval might be helpful. Here are the steps involved.
1. From our sample of size 10, draw a new sample, WITH replacement, of size 10.
2. Calculate the sample average, called the bootstrap estimate.
3. Store it.
4. Repeat steps 1-3 many times. (We’ll do 1000).
5. For a 90% CI, we will use the 5% sample quantile as the lower bound, and the 95% sample
quantile as the upper bound. (Because alpha = 10%, so alpha/2 = 5%. So chop off that top and
bottom 5% of the observations.)
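Here’s the R-code:
> bstrap <- c()
> for (i in 1:1000){
+ # First take the sample
+ bsample <- sample(x,10,replace=TRUE)
+ # Now calculate the bootstrap estimate
+ bestimate <- mean(bsample)
+ bstrap <- c(bstrap,bestimate)}
> #lower bound (7414 in this run, to the nearest dollar)
> quantile(bstrap,.05)
> #upper bound
> quantile(bstrap,.95)
95%
21906.49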
We use the same output to get the 95% confidence interval:
> #lower bound for 95% CI is the 2.5% quantile:
> quantile(bstrap,.025)
2.5%
6357.615
> quantile(bstrap,.975)
97.5%
23736.75
So the 90% CI is (7414,21906)
and the 95% is (6358,23737).
Note: this method of using the sample quantiles to find the
bootstrap confidence interval is called the Percentile Method.
There are other methods that might be more suitable for some
situations.
This code could be made much more streamlined:
> bstrap <- c()
> for (i in 1:1000){
+ bstrap <- c(bstrap, mean(sample(x,10,replace=TRUE)))}
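As a further sketch of the percentile method, the loop can be replaced with `replicate`, and a seed can be set so the run is repeatable. The helper `boot_ci`, its arguments `conf` and `B`, and the seed value are assumptions made for this example. Because resampling is random, the bounds will not match the numbers printed above exactly.

```r
# Salary sample from the transcript above
x <- c(44617, 7066, 17594, 2726, 1178, 18898, 5033, 37151, 4514, 4000)

set.seed(1)  # arbitrary seed, chosen only so the run can be repeated

# Hypothetical helper: percentile bootstrap CI for the mean
boot_ci <- function(x, conf = 0.90, B = 1000) {
  alpha <- 1 - conf
  # B bootstrap estimates: resample with replacement, then average
  bstrap <- replicate(B, mean(sample(x, length(x), replace = TRUE)))
  # cut off alpha/2 of the estimates in each tail
  quantile(bstrap, c(alpha / 2, 1 - alpha / 2))
}

b90 <- boot_ci(x, conf = 0.90)
b95 <- boot_ci(x, conf = 0.95)
b90
b95
```

Note that each call here draws a fresh set of resamples, whereas the transcript above reuses a single `bstrap` vector for both the 90% and the 95% interval.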