Jackknife Estimation: In statistics, the jackknife is a resampling technique especially useful for variance and bias estimation. The jackknife pre-dates other common resampling methods such as the bootstrap. The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset and calculating the estimate and then finding the average of these calculations. This is also known as Leave One Out Cross Validation (LOOCV).
?ChickWeight
summary(ChickWeight)
## weight Time Chick Diet
## Min. : 35.0 Min. : 0.00 13 : 12 1:220
## 1st Qu.: 63.0 1st Qu.: 4.00 9 : 12 2:120
## Median :103.0 Median :10.00 20 : 12 3:120
## Mean :121.8 Mean :10.72 10 : 12 4:118
## 3rd Qu.:163.8 3rd Qu.:16.00 17 : 12
## Max. :373.0 Max. :21.00 19 : 12
## (Other):506
We have unequal variances on the experimental “Diet” groups
We can create a linear model for the entire data set with the code
fit <- lm(weight ~ Time, data = ChickWeight)
fit
##
## Call:
## lm(formula = weight ~ Time, data = ChickWeight)
##
## Coefficients:
## (Intercept) Time
## 27.467 8.803
We could add Time|Chick to control for the intra-subject effect
We can get the R2 value with the code
summary(fit)$adj.r.squared
## [1] 0.7002197
These chicks sure get bigger over time as 70% of the variance in weight is explained by time
GOAL: Implement a jackknife procedure to produce a graphical display (such as a histogram or boxplot) of the R2 value for the model estimating weight as a function of time.
We need to create a loop that eliminates one row at the time, calculates the R2 value for each of the linear models, and stores each r-squre in a new variable that we can use to create a histogram.
x <- 1:578
RSquares <- numeric(length(x))
for (i in seq_along(x)){
Shorter <- ChickWeight[-i, ]
fit <- lm(weight ~ Time, data = Shorter)
RSquares[i] <- summary(fit)$adj.r.squared
}
RSquares
## [1] 0.6996341 0.6997133 0.6998163 0.6999655 0.7001675 0.7003075 0.7004460
## [8] 0.7004454 0.7002713 0.7000463 0.6996109 0.6995228 0.6995819 0.6996768
## [15] 0.6998061 0.6999865 0.7001379 0.7002361 0.7002613 0.7002475 0.7000661
## [22] 0.6997808 0.6994440 0.6993240 0.6996610 0.6995232 0.6997784 0.6999697
## [29] 0.7001379 0.7002589 0.7003269 0.7002475 0.7000537 0.6997808 0.6996302
## [36] 0.6995919 0.6996341 0.6996768 0.6997872 0.6999697 0.7001798 0.7003736
## [43] 0.7005115 0.7008275 0.7005580 0.7004637 0.7007233 0.7011500 0.6996078
## [50] 0.6995642 0.6997308 0.6999665 0.7001528 0.7002241 0.7002024 0.7000962
## [57] 0.6999194 0.6996629 0.6993163 0.6991997 0.6996078 0.6996768 0.6998163
## [64] 0.6999966 0.7001556 0.7002432 0.7002024 0.7001507 0.7001665 0.7003005
## [71] 0.7007233 0.7011500 0.6996078 0.6996768 0.6997964 0.6999822 0.7001350
## [78] 0.7002131 0.7002158 0.7001250 0.7001161 0.6999408 0.6998345 0.6997243
## [85] 0.6996341 0.6996948 0.6998382 0.6999822 0.7001379 0.7003075 0.7003882
## [92] 0.7006303 0.7008342 0.7011336 0.7023479 0.6996341 0.6997133 0.6998163
## [99] 0.6999721 0.7001363 0.7002810 0.7007544 0.7013148 0.7020889 0.7027171
## [106] 0.7038727 0.7046799 0.6996078 0.6995940 0.6997551 0.6999650 0.7001798
## [113] 0.7004570 0.7007778 0.7011813 0.7017362 0.7020941 0.7026285 0.7029155
## [120] 0.6996610 0.6997133 0.6998620 0.7000761 0.7002774 0.7003786 0.7004185
## [127] 0.7001430 0.6999098 0.6998212 0.7000330 0.7004093 0.6996078 0.6996768
## [134] 0.6997872 0.6999650 0.7001939 0.7003614 0.7002865 0.7002859 0.7000661
## [141] 0.6998072 0.6996911 0.6995228 0.6996078 0.6996593 0.6997624 0.6999665
## [148] 0.7002588 0.7007195 0.7012817 0.7021877 0.7032195 0.7038471 0.7044964
## [155] 0.7048293 0.6996078 0.6996768 0.6998498 0.7000303 0.7001775 0.7002687
## [162] 0.7003642 0.7002987 0.7002660 0.6999061 0.6993370 0.6990643 0.6996078
## [169] 0.6996768 0.6997872 0.6999655 0.7002281 0.7006976 0.7014150 0.7022787
## [176] 0.6996078 0.6996096 0.6997361 0.6999972 0.7003620 0.7011359 0.7019021
## [183] 0.6996341 0.6997133 0.6998382 0.6999865 0.7001399 0.7003496 0.7005847
## [190] 0.7009664 0.7012654 0.7015846 0.7019242 0.7018870 0.6995565 0.6994753
## [197] 0.6996610 0.6996593 0.6997784 0.6999650 0.7002588 0.7006348 0.7009551
## [204] 0.7014560 0.7015316 0.7017177 0.7013924 0.7011500 0.6996078 0.6996422
## [211] 0.6997702 0.6999700 0.7002588 0.7005954 0.7010962 0.7014199 0.7018648
## [218] 0.7023452 0.7029213 0.7033581 0.6995819 0.6996948 0.6998498 0.7000978
## [225] 0.7004708 0.7008213 0.7017135 0.7015321 0.7017361 0.7017612 0.7007872
## [232] 0.7005839 0.6996078 0.6997922 0.6998746 0.7000154 0.7001359 0.7002894
## [239] 0.7004161 0.7007499 0.7006901 0.7006444 0.7005754 0.7007192 0.6996610
## [246] 0.6997323 0.6998382 0.6999914 0.7001359 0.7002361 0.7002289 0.7002859
## [253] 0.7003508 0.7002253 0.7003680 0.7004093 0.6996341 0.6997323 0.6998061
## [260] 0.6999966 0.7002481 0.7006976 0.7013143 0.7021429 0.7031630 0.7044441
## [267] 0.7056243 0.7066031 0.6995819 0.6996768 0.6998498 0.7000226 0.7001841
## [274] 0.7002432 0.7002158 0.7000962 0.6999194 0.6996894 0.6993370 0.6990572
## [281] 0.6996341 0.6996593 0.6997964 0.6999966 0.7001415 0.7002133 0.7002010
## [288] 0.7001582 0.6999894 0.6996300 0.6992353 0.6990093 0.6995565 0.6996256
## [295] 0.6998061 0.6999914 0.7001347 0.7002525 0.7003269 0.7004831 0.7003719
## [302] 0.7002253 0.6999257 0.6998536 0.6995565 0.6996256 0.6998061 0.6999914
## [309] 0.7001391 0.7002133 0.7002122 0.7001080 0.6999048 0.6996230 0.6994034
## [316] 0.6990880 0.6995565 0.6996593 0.6998163 0.6999966 0.7001347 0.7002241
## [323] 0.7002038 0.7001371 0.6999009 0.6996811 0.6996358 0.6998345 0.6996341
## [330] 0.6996593 0.6998163 0.6999865 0.7001363 0.7002658 0.7003269 0.7005027
## [337] 0.7003935 0.7005519 0.7008393 0.7014804 0.6996341 0.6997518 0.6998498
## [344] 0.6999914 0.7001363 0.7002411 0.7002538 0.7002475 0.6999804 0.6996343
## [351] 0.6992367 0.6990155 0.6996078 0.6996768 0.6998877 0.7000564 0.7002247
## [358] 0.7002763 0.7003073 0.7001575 0.7001617 0.7002147 0.6999096 0.6997243
## [365] 0.6995565 0.6996948 0.6998620 0.7000154 0.7001513 0.7002137 0.7002003
## [372] 0.7001836 0.7002344 0.7007085 0.7008790 0.7016292 0.6996078 0.6996768
## [379] 0.6998620 0.7000867 0.7002247 0.7003214 0.7003642 0.7002233 0.7004325
## [386] 0.7012040 0.7011609 0.7010050 0.6996078 0.6997518 0.6998746 0.7001094
## [393] 0.7004358 0.7007061 0.7011621 0.7014582 0.7022817 0.7030721 0.7029437
## [400] 0.7026955 0.6995565 0.6996593 0.6998382 0.7000087 0.7001603 0.7002154
## [407] 0.7002122 0.7000981 0.6999239 0.6996589 0.6992777 0.6992427 0.6996078
## [414] 0.6996593 0.6997872 0.6999721 0.7001488 0.7004273 0.7004944 0.7007250
## [421] 0.7005835 0.7003799 0.7004014 0.7003011 0.6996078 0.6996768 0.6998382
## [428] 0.6999966 0.7001603 0.7002164 0.7002238 0.7001158 0.6999041 0.6996983
## [435] 0.6996559 0.6993816 0.6996341 0.6996948 0.6998382 0.7000226 0.7001350
## [442] 0.7002164 0.7002152 0.7001662 0.6999804 0.6996135 0.6992665 0.6991171
## [449] 0.6996078 0.6997922 0.6999012 0.7000303 0.7001775 0.7002254 0.7002625
## [456] 0.7001828 0.7000749 0.7001907 0.7000167 0.7002132 0.6996341 0.6997133
## [463] 0.6999012 0.7000867 0.7001913 0.7002432 0.7002705 0.7001204 0.6999425
## [470] 0.6998212 0.6996109 0.6995454 0.6996341 0.6996768 0.6998620 0.7000761
## [477] 0.7001913 0.7002550 0.7003177 0.7001250 0.6999612 0.6997174 0.6994619
## [484] 0.6992295 0.6996341 0.6997922 0.6999449 0.7002352 0.7005876 0.7006845
## [491] 0.7007130 0.7002465 0.6999194 0.6996701 0.6996109 0.6996404 0.6996341
## [498] 0.6997133 0.6998877 0.7000978 0.7001913 0.7002195 0.7002289 0.7002475
## [505] 0.7003508 0.7007085 0.6996078 0.6996948 0.6998382 0.7000226 0.7001603
## [512] 0.7002172 0.7002021 0.7002134 0.7003101 0.6999871 0.6996500 0.6997431
## [519] 0.6995819 0.6997323 0.6998498 0.7000564 0.7001775 0.7002254 0.7002090
## [526] 0.7001080 0.6999562 0.6996160 0.6992473 0.6990504 0.6996078 0.6997518
## [533] 0.6999012 0.7000303 0.7001713 0.7002380 0.7002246 0.7001048 0.6999989
## [540] 0.6998072 0.6994300 0.6995228 0.6995565 0.6996948 0.6998498 0.7000385
## [547] 0.7001989 0.7002488 0.7002625 0.7001077 0.7001779 0.7001671 0.7002548
## [554] 0.7002480 0.6995819 0.6997518 0.6998746 0.7000867 0.7002343 0.7002687
## [561] 0.7002479 0.7000981 0.6999048 0.6996391 0.6992410 0.6990569 0.6996078
## [568] 0.6997717 0.6999153 0.7000761 0.7002071 0.7002334 0.7002705 0.7001305
## [575] 0.6999691 0.6997174 0.6993933 0.6990506
hist(RSquares)
Why is it right skewed? Are some measures outliers and removing them makes the R square larger? Will these “outlier” measures be all from the same chick?
Dataset <- ChickWeight
Dataset$RSquares <- RSquares
library(ggplot2)
Let’s visualize what data points have a particular strong influence in the R square.
ggplot(Dataset, aes(Time, weight)) +
geom_point(aes(colour = RSquares)) +
geom_smooth(method="lm", formula=y~x)
As expected, removing dots that are farther from the line result in the R Square getting larger (clear dots farther from the line - dark dots closer to the line).
Are the chicks in a specific diet particularly likely to be outliers?
ggplot(Dataset, aes(Time, weight)) +
geom_point(aes(colour = RSquares)) +
geom_smooth(method="lm", formula=y~x) +
facet_grid(~Diet)
There is a chick in Diet 2 which is not gaining enough weight when compared to his/her peers. This chick might not look as such a big outlier in Diet 1, where there are other chicks not gaining much weight.
If you have questions, suggestions, and/or comments about the analysis, please contact Sara Incera at sara.incera@eku.edu