Basic Inference, Vitamins and Tooth Growth

Author: Arnie Larson

Date: 12/21/2015

Overview

The goal of this exercise is to explore and perform basic inference on the ToothGrowth data set available in R. To investigate the effect of the treatment factors on the tooth growth response, two-sample t-tests and 95% confidence intervals are used to determine which differences are significant.

The Data

The response is the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs, 10 for each combination of three dose levels of Vitamin C (0.5, 1, and 2 mg/day) and two delivery methods (orange juice, coded OJ, or ascorbic acid, coded VC). A boxplot of the response against the two treatment factors shows that the response for a given treatment (dosage + delivery method) is roughly symmetric about its median.

library(ggplot2)
data("ToothGrowth")
ToothGrowth$dose<-as.factor(ToothGrowth$dose)
q<-ggplot(ToothGrowth,aes(x=dose, y=len, fill=supp)) +
    labs(title="ToothGrowth Response vs. Dose ") +
    xlab("Dose (mg/day)") + ylab("Growth Length (mm)")
q+geom_boxplot()

From the plot above, it appears likely that there are statistically different effects for the different treatments. The combination of dosage and delivery method generates 6 independent treatments, each having a sample size of 10. The effect of a treatment is measured by taking the mean of the response. Since we don’t know the underlying variance, comparing sample means relies on confidence intervals using the t-statistic.

First I subset the response by treatment to make doing t-tests simpler.

vc05<-subset(ToothGrowth, supp=="VC" & dose=="0.5",len)$len
oj05<-subset(ToothGrowth, supp=="OJ" & dose=="0.5",len)$len
vc10<-subset(ToothGrowth, supp=="VC" & dose=="1",len)$len
oj10<-subset(ToothGrowth, supp=="OJ" & dose=="1",len)$len
vc20<-subset(ToothGrowth, supp=="VC" & dose=="2",len)$len
oj20<-subset(ToothGrowth, supp=="OJ" & dose=="2",len)$len
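Before running any tests, it can help to confirm the group sizes and see the sample means and standard deviations side by side. The following is a minimal sketch of that check; the names `tg` and `treatment_summary` are illustrative and not part of the original analysis, and a fresh copy of the data is used so the factor conversion above is left alone.

```r
# Fresh copy of the built-in data set (dose stays numeric in this copy).
tg <- datasets::ToothGrowth

# Sample size, mean, and standard deviation of len for each treatment.
n_by_trt    <- aggregate(len ~ supp + dose, data = tg, FUN = length)
mean_by_trt <- aggregate(len ~ supp + dose, data = tg, FUN = mean)
sd_by_trt   <- aggregate(len ~ supp + dose, data = tg, FUN = sd)

# The three aggregate() calls return rows in the same order,
# so the columns can be combined directly.
treatment_summary <- data.frame(
  supp = mean_by_trt$supp,
  dose = mean_by_trt$dose,
  n    = n_by_trt$len,
  mean = mean_by_trt$len,
  sd   = round(sd_by_trt$len, 2)
)
print(treatment_summary)
```

Each of the six rows should report n = 10, matching the sample size stated above.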

Some Inference Questions

Is the effect of using Orange Juice vs. Ascorbic Acid (VC) at a dose of 0.5 mg different? What is the 95% CI for this effect?

Taking the mean of the response estimates the effect. For this case the relative effect is:

mean(oj05)-mean(vc05)

## [1] 5.25

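This difference is significant at the 95% confidence level if the 95% CI for the effect excludes 0; here, that means the interval lies entirely above 0. If we assume that the response variances are the same, then the 95% confidence interval uses the pooled variance and the t quantile with 20 - 2 = 18 degrees of freedom. With 10 observations in each group, the pooled variance is the average of the two sample variances, and the factor sqrt(1/10 + 1/10) reduces to sqrt(1/5). The resulting interval excludes 0, so the effect is statistically significant.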

mean(oj05)-mean(vc05) +c(-1,1)*qt(.975,20-2)*sqrt((sd(vc05)^2+sd(oj05)^2)/2)*sqrt(1/5)

## [1] 1.770262 8.729738
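The hand calculation above hard-codes the sample sizes. As a sketch of the same pooled-variance interval for arbitrary group sizes, the hypothetical helper `pooled_ci` below (not part of the original analysis) weights each sample variance by its degrees of freedom. The two groups are re-derived from the data so the block runs on its own.

```r
# Pooled-variance confidence interval for mean(x) - mean(y).
# Assumes both groups share one underlying variance.
pooled_ci <- function(x, y, level = 0.95) {
  nx <- length(x)
  ny <- length(y)
  dof <- nx + ny - 2
  # Pooled variance: sample variances weighted by their degrees of freedom.
  sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / dof
  se <- sqrt(sp2 * (1 / nx + 1 / ny))
  t_crit <- qt(1 - (1 - level) / 2, dof)
  (mean(x) - mean(y)) + c(-1, 1) * t_crit * se
}

tg <- datasets::ToothGrowth
oj05 <- tg$len[tg$supp == "OJ" & tg$dose == 0.5]
vc05 <- tg$len[tg$supp == "VC" & tg$dose == 0.5]

# With equal group sizes this reduces to the formula used above.
pooled_ci(oj05, vc05)
```

Because both groups have 10 observations, the weighted pooled variance equals the simple average used in the hand calculation, so the two intervals should agree.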

Note that the t.test command with var.equal=TRUE gives the same interval.

t.test(oj05, vc05, var.equal=TRUE)

## 
##  Two Sample t-test
## 
## data:  oj05 and vc05
## t = 3.1697, df = 18, p-value = 0.005304
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.770262 8.729738
## sample estimates:
## mean of x mean of y 
##     13.23      7.98

For the sake of completeness, if we cannot assume that the variances are the same, then the standard error uses each group's own variance and the degrees of freedom come from the Welch-Satterthwaite approximation. This interval is a little wider than the previous one, but still shows a significant effect.

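# Welch-Satterthwaite approximation to the degrees of freedom.
# Named welch_df so it does not mask stats::df (the F density).
welch_df <- function(sdx, sdy, nx, ny) {
  num <- (sdx^2/nx + sdy^2/ny)^2
  den <- ((sdx^2/nx)^2)/(nx-1) + ((sdy^2/ny)^2)/(ny-1)
  num/den
}
mean(oj05)-mean(vc05) + c(-1,1)*qt(.975, welch_df(sd(oj05),sd(vc05),10,10)) *
    sqrt((sd(vc05)^2)/10 + (sd(oj05)^2)/10)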

## [1] 1.719057 8.780943

The same interval can be reproduced with the t.test command without the equal-variance assumption, which is also its default behavior.

t.test(oj05, vc05, var.equal=FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  oj05 and vc05
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean of x mean of y 
##     13.23      7.98

I will assume that the treatment groups share a common underlying population variance, and will use the pooled variance in my CI estimates from here on. (Note, I make no attempt to thoroughly investigate or justify this assumption.)
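For readers who want a quick, non-rigorous look at the equal-variance assumption, one option is to compare the sample standard deviations and run an F test with `var.test` from the stats package. This is only a sketch: the F test is sensitive to departures from normality and has little power with 10 observations per group, so it cannot justify the assumption by itself. The two groups are re-derived here so the block runs on its own.

```r
tg <- datasets::ToothGrowth
oj05 <- tg$len[tg$supp == "OJ" & tg$dose == 0.5]
vc05 <- tg$len[tg$supp == "VC" & tg$dose == 0.5]

# Sample standard deviations of the two groups.
c(sd_oj = sd(oj05), sd_vc = sd(vc05))

# F test of the null hypothesis that the variance ratio is 1.
vt <- var.test(oj05, vc05)
vt$estimate   # ratio of the sample variances
vt$p.value    # two-sided p-value
```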

Is the effect of using Orange Juice at 1.0 mg greater than that at 0.5 mg? What is the 95% CI for this effect?

The relative effect is:

mean(oj10)-mean(oj05)

## [1] 9.47

And this effect appears to be significant at the 95% level:

mean(oj10)-mean(oj05) +c(-1,1)*qt(.975,20-2)*sqrt((sd(oj10)^2+sd(oj05)^2)/2)*sqrt(1/5)

## [1]  5.529186 13.410814
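The question above asks whether one effect is greater than another, while the interval is two-sided. As a sketch of the directional version, `t.test` accepts `alternative = "greater"`, which tests whether the first mean exceeds the second and reports a one-sided confidence bound. The groups are re-derived here so the block runs on its own, and the name `one_sided` is illustrative.

```r
tg <- datasets::ToothGrowth
oj05 <- tg$len[tg$supp == "OJ" & tg$dose == 0.5]
oj10 <- tg$len[tg$supp == "OJ" & tg$dose == 1]

# One-sided pooled-variance test: is mean(oj10) greater than mean(oj05)?
one_sided <- t.test(oj10, oj05, alternative = "greater", var.equal = TRUE)

one_sided$p.value    # one-sided p-value
one_sided$conf.int   # lower confidence bound; the upper limit is Inf
```

The two-sided interval remains the more conservative summary, so the rest of the analysis keeps using it.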

Is the effect of using Orange Juice at 2.0 mg greater than that at 1.0 mg? What is the 95% CI of this effect?

The relative effect is:

mean(oj20)-mean(oj10)

## [1] 3.36

And this effect also appears to be significant at the 95% level:

mean(oj20)-mean(oj10) +c(-1,1)*qt(.975,20-2)*sqrt((sd(oj20)^2+sd(oj10)^2)/2)*sqrt(1/5)

## [1] 0.2194983 6.5005017

So we have found that each increase in the dosage of Vitamin C delivered via orange juice produces a statistically significant increase in tooth growth.

Is the effect of using Orange Juice at 1.0 mg greater than that from Ascorbic Acid also at 1.0 mg? What is the 95% CI of this effect?

The relative effect is:

mean(oj10)-mean(vc10)

## [1] 5.93

This effect also appears to be significant at the 95% level:

mean(oj10)-mean(vc10) +c(-1,1)*qt(.975,20-2)*sqrt((sd(oj10)^2+sd(vc10)^2)/2)*sqrt(1/5)

## [1] 2.840692 9.019308

Is the effect of using Orange Juice at 2.0 mg greater than that from Ascorbic Acid at 2.0 mg? What is the 95% CI of this effect?

The relative effect is:

mean(oj20)-mean(vc20)

## [1] -0.08

There is no statistically significant difference between these effects; the 95% CI contains 0.

mean(oj20)-mean(vc20) +c(-1,1)*qt(.975,20-2)*sqrt((sd(oj20)^2+sd(vc20)^2)/2)*sqrt(1/5)

## [1] -3.722999  3.562999

In other words, we fail to reject the null hypothesis that there is no difference in the effect.

We have found that at doses of 0.5 mg and 1.0 mg, there is a greater effect when the dose is delivered via Orange Juice than when it is delivered via Ascorbic Acid. At 2.0 mg we have not found any difference in the effect.
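The five comparisons above can also be collected into one table. The following sketch does this with `t.test(..., var.equal = TRUE)`, which matches the pooled-variance intervals computed by hand; the helper `grp` and the names `comparisons` and `results` are illustrative and not part of the original analysis.

```r
tg <- datasets::ToothGrowth

# Hypothetical helper: tooth lengths for one supplement/dose combination.
grp <- function(s, d) tg$len[tg$supp == s & tg$dose == d]

# Each entry is list(x, y); the reported difference is mean(x) - mean(y).
comparisons <- list(
  "OJ 0.5 vs VC 0.5" = list(grp("OJ", 0.5), grp("VC", 0.5)),
  "OJ 1.0 vs OJ 0.5" = list(grp("OJ", 1),   grp("OJ", 0.5)),
  "OJ 2.0 vs OJ 1.0" = list(grp("OJ", 2),   grp("OJ", 1)),
  "OJ 1.0 vs VC 1.0" = list(grp("OJ", 1),   grp("VC", 1)),
  "OJ 2.0 vs VC 2.0" = list(grp("OJ", 2),   grp("VC", 2))
)

# Run the pooled-variance t-test for each pair and keep the key numbers.
results <- do.call(rbind, lapply(names(comparisons), function(nm) {
  tt <- t.test(comparisons[[nm]][[1]], comparisons[[nm]][[2]], var.equal = TRUE)
  data.frame(
    comparison = nm,
    diff       = unname(tt$estimate[1] - tt$estimate[2]),
    lower      = tt$conf.int[1],
    upper      = tt$conf.int[2],
    p_value    = tt$p.value
  )
}))
print(results, digits = 4)
```

A row whose interval excludes 0 corresponds to a difference that is significant at the 95% level in the discussion above.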

Conclusion

In this analysis, the response of tooth growth in guinea pigs to various treatments was investigated, using the t-statistic to create 95% confidence intervals that summarize the effects. Significant differences in response were found between dosages and between delivery methods. For Vitamin C delivered by Orange Juice, tooth growth increased significantly with each increase across the three dosages tested. For a given dosage, the response from Orange Juice was greater than that from Ascorbic Acid at the two lower dosages, while no difference was observed at the highest dosage.