Exercise 1: Minimize the weighted sum of two random variables
Using basic statistical properties of the variance, as well as single-variable calculus, derive (5.6). In other words, prove that $\alpha$ given by (5.6) does indeed minimize $\mathrm{Var}(\alpha X + (1-\alpha)Y)$.
Using properties of variance we have
$$\mathrm{Var}(\alpha X + (1-\alpha)Y) = \alpha^2\sigma_X^2 + (1-\alpha)^2\sigma_Y^2 + 2\alpha(1-\alpha)\sigma_{XY}$$
Taking the derivative with respect to $\alpha$ and setting it to zero gives
$$2\alpha\sigma_X^2 - 2(1-\alpha)\sigma_Y^2 + 2(1-2\alpha)\sigma_{XY} = 0$$
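Solving for $\alpha$, we find
$$\alpha = \frac{\sigma_Y^2 - \sigma_{XY}}{\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY}}$$
The second derivative with respect to $\alpha$ is $2\sigma_X^2 + 2\sigma_Y^2 - 4\sigma_{XY} = 2\,\mathrm{Var}(X - Y)$, which is positive whenever $X - Y$ is not constant, so this critical point is indeed a minimum.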
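As a numerical sanity check on the closed form for $\alpha$, the sketch below compares it against a brute-force grid search over the variance expression. The function names and the particular values of $\sigma_X^2$, $\sigma_Y^2$ and $\sigma_{XY}$ are assumptions chosen only for illustration.

```python
import numpy as np

def optimal_alpha(var_x, var_y, cov_xy):
    """Closed-form critical point of Var(alpha*X + (1-alpha)*Y)."""
    return (var_y - cov_xy) / (var_x + var_y - 2.0 * cov_xy)

def combined_variance(alpha, var_x, var_y, cov_xy):
    """Var(alpha*X + (1-alpha)*Y) written out with the variance identity."""
    return (alpha**2 * var_x
            + (1.0 - alpha)**2 * var_y
            + 2.0 * alpha * (1.0 - alpha) * cov_xy)

# Hypothetical variances and covariance (illustrative values only).
var_x, var_y, cov_xy = 1.0, 1.25, 0.5

alpha_star = optimal_alpha(var_x, var_y, cov_xy)

# Brute-force search over a fine grid of alpha values.
grid = np.linspace(-1.0, 2.0, 30001)
alpha_grid = grid[np.argmin(combined_variance(grid, var_x, var_y, cov_xy))]

print(alpha_star, alpha_grid)
```

If the derivation is right, the two printed values should agree up to the grid spacing.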
Exercise 2: Derive the probability an observation appears in a bootstrap sample
a.
What is the probability that the first bootstrap observation is not the jth observation from the original sample? Justify your answer.
$$P(\text{first bootstrap observation is not the } j\text{-th observation}) = 1 - P(\text{first bootstrap observation is the } j\text{-th observation}) = 1 - \frac{1}{n},$$
since the bootstrap observations are chosen uniformly at random from the $n$ original observations.
b.
What is the probability that the second bootstrap observation is not the jth observation from the original sample?
The probability is still $1 - \frac{1}{n}$, since the bootstrap observations are drawn with replacement.
c.
Let
$$A = \{\text{the } j\text{-th observation is not in the bootstrap sample}\}$$
$$A_k = \{\text{the } k\text{-th bootstrap observation is not the } j\text{-th observation}\}$$
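Then $A = \bigcap_{k=1}^{n} A_k$. Since the bootstrap observations are drawn with replacement and uniformly at random, the $A_k$ are independent and $P(A_k) = 1 - \frac{1}{n}$, hence
$$P(A) = \prod_{k=1}^{n} P(A_k) = \left(1 - \frac{1}{n}\right)^n$$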
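Multiplying the $n$ independent probabilities $P(A_k) = 1 - \frac{1}{n}$ gives $\left(1 - \frac{1}{n}\right)^n$ for the probability that the $j$-th observation is missing from the bootstrap sample. The following is a minimal simulation sketch of that value; the function names, the sample size `n = 50`, the number of trials and the seed are all assumptions made for illustration.

```python
import random

def prob_not_in_bootstrap(n):
    """Product of n independent probabilities, each equal to 1 - 1/n."""
    return (1.0 - 1.0 / n) ** n

def simulate_prob_not_in_bootstrap(n, j=0, trials=10000, seed=0):
    """Estimate P(observation j is absent) by drawing bootstrap samples.

    Each bootstrap sample is n indices drawn uniformly with replacement
    from range(n); we count how often index j never appears.
    """
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        sample = [rng.randrange(n) for _ in range(n)]
        if j not in sample:
            misses += 1
    return misses / trials

n = 50  # hypothetical sample size
exact = prob_not_in_bootstrap(n)
estimate = simulate_prob_not_in_bootstrap(n)

print(exact, estimate)
```

The simulated frequency should be close to the exact value, with the gap shrinking as the number of trials grows.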
Exercise 4: Estimate the standard deviation of a predicted response
Suppose that, given $(X, Y)$, we predict $\hat{Y}$. This is an estimator [^0]. To estimate its standard error using data $(x_1, y_1), \dots, (x_n, y_n)$, use the “plug-in” estimator
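$$\widehat{\mathrm{se}}(\hat{Y}) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - \bar{\hat{y}}\right)^2}$$
where $\hat{y}_i$ is the predicted value for $x_i$ and $\bar{\hat{y}}$ is the mean predicted value.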
In other words, use the sample standard deviation of the predicted values.
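The plug-in estimate described here can be sketched in a few lines. The data, the simple linear fit via `np.polyfit`, and the helper name `plug_in_se` are assumptions made for illustration; any fitted model that produces predicted values could stand in for the linear fit.

```python
import numpy as np

def plug_in_se(predictions):
    """Square root of the mean squared deviation of the predictions
    from their own mean (divisor n, not n - 1)."""
    predictions = np.asarray(predictions, dtype=float)
    mean_prediction = predictions.mean()
    return np.sqrt(np.mean((predictions - mean_prediction) ** 2))

# Hypothetical data: a noisy linear relationship.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=50)

# Fit a straight line and compute the predicted value for each x_i.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

se_hat = plug_in_se(y_hat)
print(se_hat)
```

With the divisor $n$, this matches `np.std(y_hat)` under NumPy's default `ddof=0`.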
[^0]: An estimator is a statistic (a function of the data) used to estimate a population quantity; it is a random variable that corresponds to the statistical learning method we use and depends on the observed data.