\part{Informal Uses of Resampling Methods}

\section{Bootstrap Assessments of Sampling Variability}

\begin{fmpage}{36pc}
\exhead{1}
The following takes a with-replacement sample of the rows of \texttt{Pima.tr2}:
<<>>=
rows <- sample(1:dim(Pima.tr2)[1], replace=TRUE)
densityplot(~ bmi, groups=type, data=Pima.tr2[rows, ],
            scales=list(relation="free"), xlab="Measure")
@ %
Repeat, but using \texttt{anymiss} as the grouping factor, and with different
panels for the two levels of \texttt{type}.  Repeat for several different
bootstrap samples.  Are there differences between levels of \texttt{anymiss}
that seem consistent over repeated bootstrap samples?
\end{fmpage}

\begin{fmpage}{36pc}
\exhead{2}
The following compares, using strip plots, several of the variables in the
data frame \texttt{Pima.tr2}, between rows that had one or more missing
values and rows that had no missing values:
<<>>=
missIND <- complete.cases(Pima.tr2)
Pima.tr2$anymiss <- c("miss","nomiss")[missIND+1]
library(lattice)
stripplot(anymiss ~ npreg + glu + bmi | type, data=Pima.tr2, outer=TRUE,
          scales=list(relation="free"), xlab="Measure")
@ %
The distribution for \texttt{bmi} gives the impression that its shape
differs between rows where one or more values were missing and rows where
no values were missing, at least for \texttt{type=="Yes"}.

The bootstrap methodology can be used to give a rough check of the
consistency of apparent differences under sampling variation.  The idea is
to treat the sample as representative of the population, and to take
repeated with-replacement (``bootstrap'') samples from it.  The following
compares the qq-plots between rows that had missing data
(\texttt{anymiss=="miss"}) and rows that were complete
(\texttt{anymiss=="nomiss"}), for a single bootstrap sample, separately for
the non-diabetics (\texttt{type=="No"}) and the diabetics
(\texttt{type=="Yes"}).
<<>>=
rownum <- 1:dim(Pima.tr2)[1]       # generate row numbers
chooserows <- sample(rownum, replace=TRUE)
qq(anymiss ~ bmi | type, data=Pima.tr2[chooserows, ],
   auto.key=list(columns=2))
@ %
Wrap these lines of code in a function.  Repeat the formation of the
bootstrap samples and the plots several times.  Does the shift in the
distribution seem consistent under repeated sampling?
\end{fmpage}

Judgements based on examination of graphs are inevitably subjective.  They
do, however, make it possible to compare the shapes of distributions.
Here, the difference in shape is of more note than any difference in mean
or median.

\vspace*{3pt}
\enlargethispage{24pt}

\begin{fmpage}{36pc}
\exhead{3}
In the data frame \texttt{nswdemo} (\textit{DAAGxtras} package), compare
the distribution of \texttt{re78} for those who received work training
(\texttt{trt==1}) with that for the controls (\texttt{trt==0}) who did not:
<<>>=
library(DAAGxtras)
densityplot(~ re78, groups=trt, data=nswdemo, from=0,
            auto.key=list(columns=2))
@ %
\end{fmpage}

\begin{fmpage}{36pc}
\exhead{3, continued}
The distributions are highly skewed.  A few very large values may unduly
affect the comparison.  A reasonable alternative is to compare values of
\texttt{log(re78+23)}.  The value 23 is chosen because it is half the
minimum non-zero value of \texttt{re78}.  Here is the density plot:
<<>>=
unique(sort(nswdemo$re78))[1:3]    # Examine the 3 smallest values
densityplot(~ log(re78+23), groups=trt, data=nswdemo,
            auto.key=list(columns=2))
@ %
Do the distributions for controls and treated have similar shapes?
\end{fmpage}

\begin{fmpage}{36pc}
\exhead{4}
Now examine the displacement, under repeated bootstrap sampling, of one
mean relative to the other.
Here is code for the calculation:
<<>>=
twoBoot <- function(n=999, df=nswdemo, ynam="re78", gp="trt"){
    fac <- df[, gp]
    if(!is.factor(fac)) fac <- factor(fac)
    if(length(levels(fac)) != 2) stop(paste(gp, "must have 2 levels"))
    y <- df[, ynam]
    d2 <- c(diff(tapply(y, fac, mean)), rep(0, n))
    for(i in 1:n){
        chooserows <- sample(1:length(y), replace=TRUE)
        faci <- fac[chooserows]; yi <- y[chooserows]
        d2[i+1] <- diff(tapply(yi, faci, mean))
    }
    d2
}
## Now call the function:
d2 <- twoBoot()
quantile(d2, c(.025, .975))        # 95% confidence interval
@ %
\end{fmpage}

Note that a confidence interval should not be interpreted as a probability
statement; it takes no account of prior probability.  Rather, 95\% of
intervals that are calculated in this way can be expected to contain the
true difference in means.

\enlargethispage{27pt}

\section{Use of the Permutation Distribution as a Standard}

\begin{fmpage}{36pc}
\exhead{5}
If the difference is entirely due to sampling variation, then permuting
the treatment labels will give another sample from the same null
distribution.  The permutation distribution is the distribution of
differences of means from repeated samples, obtained by permuting the
labels.  This offers a standard against which to compare the difference
between treated and controls.  Does the observed difference between
treated and controls seem ``extreme'', relative to this permutation
distribution?  Note that the difference between \texttt{trt==1} and
\texttt{trt==0} might go in either direction; hence the multiplication of
the tail probability by 2.  Here is code:
<<>>=
dnsw <- numeric(1000); y <- nswdemo$re78; treat <- nswdemo$trt
dnsw[1] <- mean(y[treat==1]) - mean(y[treat==0])
for(i in 2:1000){
    trti <- sample(treat)          # permute the treatment labels
    dnsw[i] <- mean(y[trti==1]) - mean(y[trti==0])
}
2*min(sum(dnsw <= dnsw[1]),
      sum(dnsw >= dnsw[1]))/length(dnsw)   # 2-sided comparison
@ %
Replace \texttt{re78} with \texttt{log(re78+23)} and repeat the
calculations.
\end{fmpage}
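For repeating the calculation with \texttt{log(re78+23)}, it may be
convenient to wrap the permutation comparison in a function.  The
following sketch does this; the function name \texttt{permDiff} and its
argument names are ours, not part of the exercises:
<<>>=
permDiff <- function(y, treat, n=999){
    ## d[1] holds the observed difference in means;
    ## d[2:(n+1)] hold differences from label-permuted samples
    d <- numeric(n+1)
    d[1] <- mean(y[treat==1]) - mean(y[treat==0])
    for(i in 2:(n+1)){
        trti <- sample(treat)      # permute the treatment labels
        d[i] <- mean(y[trti==1]) - mean(y[trti==0])
    }
    ## 2-sided comparison of the observed difference with the
    ## permutation distribution
    2*min(sum(d <= d[1]), sum(d >= d[1]))/length(d)
}
## permDiff(nswdemo$re78, nswdemo$trt)
## permDiff(log(nswdemo$re78+23), nswdemo$trt)
@ %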