
A confidence interval is a range of values that is likely to contain a population parameter with a certain level of confidence.

It is calculated using the following general formula:

**Confidence Interval** = (point estimate) ± (critical value) × (standard error)

This formula creates an interval with a lower bound and an upper bound, which likely contains a population parameter with a certain level of confidence:

**Confidence Interval** = [lower bound, upper bound]
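As a rough sketch of how the general formula translates to R, the helper below takes a point estimate, a critical value, and a standard error and returns the two bounds. The function name `ci_bounds` and its argument names are hypothetical (they do not come from any package), and the inputs in the call are illustrative only.

```r
# Hypothetical helper: builds an interval from the general formula
# (point estimate) +/- (critical value) * (standard error)
ci_bounds <- function(estimate, critical_value, std_error) {
  margin <- critical_value * std_error
  c(lower = estimate - margin, upper = estimate + margin)
}

# Illustrative inputs only (not data from this tutorial):
# qnorm(0.975) is the z-critical value for a 95% confidence level
ci_bounds(estimate = 50, critical_value = qnorm(0.975), std_error = 2)
```

Each of the four intervals in this tutorial follows this pattern; only the point estimate, the critical value, and the standard error change.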

This tutorial explains how to calculate the following confidence intervals in R:

**1.** Confidence Interval for a Mean

**2.** Confidence Interval for a Difference in Means

**3.** Confidence Interval for a Proportion

**4.** Confidence Interval for a Difference in Proportions

Let's jump in!

**Example 1: Confidence Interval for a Mean**

We use the following formula to calculate a confidence interval for a mean:

**Confidence Interval** = $\bar{x} \pm t_{n-1,\,1-\alpha/2} \cdot \dfrac{s}{\sqrt{n}}$

where:

- **$\bar{x}$:** sample mean
- **t:** the t-critical value
- **s:** sample standard deviation
- **n:** sample size

**Example:** Suppose we collect a random sample of turtles with the following information:

- Sample size **n = 25**
- Sample mean weight **$\bar{x}$ = 300**
- Sample standard deviation **s = 18.5**

The following code shows how to calculate a 95% confidence interval for the true population mean weight of turtles:

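    #input sample size, sample mean, and sample standard deviation
    n <- 25
    xbar <- 300
    s <- 18.5

    #calculate margin of error
    margin <- qt(0.975, df=n-1)*s/sqrt(n)

    #calculate lower and upper bounds of confidence interval
    low <- xbar - margin
    high <- xbar + margin
    c(low, high)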

The 95% confidence interval for the true population mean weight of turtles is **[292.36, 307.64]**.
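When the raw measurements are available rather than only summary statistics, base R's `t.test()` returns this kind of interval directly. The sketch below uses simulated weights, generated with `rnorm()` purely for illustration. They are not the turtle sample above, and the variable names are assumptions. It checks that the manual formula and `t.test()` agree.

```r
# Simulated weights for illustration only; these are NOT the turtle data above
set.seed(1)
weights <- rnorm(25, mean = 300, sd = 18.5)

# Manual interval using the formula from this example
n <- length(weights)
margin <- qt(0.975, df = n - 1) * sd(weights) / sqrt(n)
manual_ci <- c(mean(weights) - margin, mean(weights) + margin)

# Interval reported by base R's one-sample t-test
builtin_ci <- t.test(weights, conf.level = 0.95)$conf.int

manual_ci
builtin_ci
```

The two results should match, since `t.test()` builds its one-sample interval from the same t-based formula.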

**Example 2: Confidence Interval for a Difference in Means**

We use the following formula to calculate a confidence interval for a difference in population means:

**Confidence interval** = $(\bar{x}_1 - \bar{x}_2) \pm t \cdot \sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$

where:

- $\bar{x}_1$, $\bar{x}_2$: sample 1 mean, sample 2 mean
- $t$: the t-critical value based on the confidence level and $(n_1 + n_2 - 2)$ degrees of freedom
- $s_p^2$: pooled variance, calculated as $\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
- $n_1$, $n_2$: sample 1 size, sample 2 size

**Example:** Suppose we want to estimate the difference in mean weight between two different species of turtles, so we go out and gather a random sample of 15 turtles from each population. Here is the summary data for each sample:

**Sample 1:**

- $\bar{x}_1$ = 310
- $s_1$ = 18.5
- $n_1$ = 15

**Sample 2:**

- $\bar{x}_2$ = 300
- $s_2$ = 16.4
- $n_2$ = 15

The following code shows how to calculate a 95% confidence interval for the true difference in population means:

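    #input sample size, sample mean, and sample standard deviation
    n1 <- 15
    xbar1 <- 310
    s1 <- 18.5

    n2 <- 15
    xbar2 <- 300
    s2 <- 16.4

    #calculate pooled variance
    sp <- ((n1-1)*s1^2 + (n2-1)*s2^2) / (n1+n2-2)

    #calculate margin of error (t-critical value uses n1+n2-2 degrees of freedom)
    margin <- qt(0.975, df=n1+n2-2)*sqrt(sp/n1 + sp/n2)

    #calculate lower and upper bounds of confidence interval
    low <- (xbar1-xbar2) - margin
    high <- (xbar1-xbar2) + margin
    c(low, high)

The 95% confidence interval for the true difference in population means is **[-3.08, 23.08]**.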
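With raw data in hand, the pooled-variance interval can also be obtained from base R's `t.test()` by setting `var.equal = TRUE` (the default, `var.equal = FALSE`, gives the Welch interval instead, which does not pool the variances). The sketch below uses simulated weights purely for illustration. They are not the two samples summarized above, and the variable names are assumptions. It compares the manual formula, with $n_1 + n_2 - 2$ degrees of freedom, against `t.test()`.

```r
# Simulated weights for illustration only; NOT the samples summarized above
set.seed(1)
species1 <- rnorm(15, mean = 310, sd = 18.5)
species2 <- rnorm(15, mean = 300, sd = 16.4)

n1 <- length(species1)
n2 <- length(species2)

# Pooled variance, weighting each sample variance by its degrees of freedom
sp <- ((n1 - 1) * var(species1) + (n2 - 1) * var(species2)) / (n1 + n2 - 2)

# Manual interval with n1 + n2 - 2 degrees of freedom
margin <- qt(0.975, df = n1 + n2 - 2) * sqrt(sp / n1 + sp / n2)
diff_means <- mean(species1) - mean(species2)
manual_ci <- c(diff_means - margin, diff_means + margin)

# Interval reported by base R's pooled two-sample t-test
builtin_ci <- t.test(species1, species2, var.equal = TRUE,
                     conf.level = 0.95)$conf.int

manual_ci
builtin_ci
```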

**Example 3: Confidence Interval for a Proportion**

We use the following formula to calculate a confidence interval for a proportion:

**Confidence Interval** = $p \pm z \cdot \sqrt{\dfrac{p(1-p)}{n}}$

where:

- **p:** sample proportion
- **z:** the chosen z-value
- **n:** sample size

**Example:** Suppose we want to estimate the proportion of residents in a county that are in favor of a certain law. We select a random sample of 100 residents and ask them about their stance on the law. Here are the results:

- Sample size **n = 100**
- Proportion in favor of law **p = 0.56**

The following code shows how to calculate a 95% confidence interval for the true proportion of residents in the entire county who are in favor of the law:

    #input sample size and sample proportion
    n <- 100
    p <- 0.56

    #calculate margin of error
    margin <- qnorm(0.975)*sqrt(p*(1-p)/n)

    #calculate lower and upper bounds of confidence interval
    low <- p - margin
    high <- p + margin
    c(low, high)

The 95% confidence interval for the true proportion of residents in the entire county who are in favor of the law is **[0.463, 0.657]**.
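To see how the chosen z-value drives the width of the interval, the sketch below wraps the formula from this example in a small function and evaluates it at three confidence levels. The function name `prop_ci` and its arguments are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: normal-approximation interval for a proportion,
# p +/- z * sqrt(p * (1 - p) / n)
prop_ci <- function(p, n, conf_level = 0.95) {
  # Two-sided z-critical value for the requested confidence level
  z <- qnorm(1 - (1 - conf_level) / 2)
  margin <- z * sqrt(p * (1 - p) / n)
  c(lower = p - margin, upper = p + margin)
}

# Intervals for the survey example (p = 0.56, n = 100) at three levels
sapply(c("90%" = 0.90, "95%" = 0.95, "99%" = 0.99),
       function(level) prop_ci(p = 0.56, n = 100, conf_level = level))
```

A higher confidence level uses a larger z-value, so the interval gets wider while the point estimate stays the same.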

**Example 4: Confidence Interval for a Difference in Proportions**

We use the following formula to calculate a confidence interval for a difference in proportions:

**Confidence interval** = $(p_1 - p_2) \pm z \cdot \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$

where:

- $p_1$, $p_2$: sample 1 proportion, sample 2 proportion
- $z$: the z-critical value based on the confidence level
- $n_1$, $n_2$: sample 1 size, sample 2 size

**Example:** Suppose we want to estimate the difference in the proportion of residents who support a certain law in county A compared to the proportion who support the law in county B. Here is the summary data for each sample:

**Sample 1:**

- $n_1$ = 100
- $p_1$ = 0.62 (i.e. 62 out of 100 residents support the law)

**Sample 2:**

- $n_2$ = 100
- $p_2$ = 0.46 (i.e. 46 out of 100 residents support the law)

The following code shows how to calculate a 95% confidence interval for the true difference in proportion of residents who support the law between the counties:

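    #input sample sizes and sample proportions
    n1 <- 100
    p1 <- 0.62

    n2 <- 100
    p2 <- 0.46

    #calculate margin of error
    margin <- qnorm(0.975)*sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)

    #calculate lower and upper bounds of confidence interval
    low <- (p1-p2) - margin
    high <- (p1-p2) + margin
    c(low, high)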

The 95% confidence interval for the true difference in proportion of residents who support the law between the counties is **[0.024, 0.296]**.
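For readers who want to check the arithmetic by hand, the calculation can be worked through step by step. This uses the rounded z-critical value 1.96 for a 95% confidence level, so the last digits may differ slightly from R's output.

$$
\begin{aligned}
\text{standard error} &= \sqrt{\frac{0.62(1-0.62)}{100} + \frac{0.46(1-0.46)}{100}} = \sqrt{0.002356 + 0.002484} = \sqrt{0.00484} \approx 0.0696 \\
\text{margin of error} &\approx 1.96 \times 0.0696 \approx 0.136 \\
\text{confidence interval} &\approx (0.62 - 0.46) \pm 0.136 = [0.024,\ 0.296]
\end{aligned}
$$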
