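If $x \leq \xi$ then $(x - \xi)^3_+ = 0$. So if we take $a_1 = \beta_0$, $b_1 = \beta_1$, $c_1 = \beta_2$, $d_1 = \beta_3$, then $f(x) = f_1(x)$ for all $x \leq \xi$.
If $x > \xi$ then $(x - \xi)^3_+ = (x - \xi)^3$. Then expanding $(x - \xi)^3$ and collecting powers of $x$,
we see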
\begin{align}
a_2 &= \beta_0 - \beta_4 \xi^3 \\
b_2 &= \beta_1 + 3\beta_4\xi^2 \\
c_2 &= \beta_2 - 3\beta_4\xi \\
d_2 &= \beta_3 + \beta_4
\end{align}
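To make the collection of terms explicit, here is a short worked expansion. It uses only the definition of $f$ for $x > \xi$ and the binomial expansion of $(x - \xi)^3$:

\begin{align}
\beta_4 (x - \xi)^3 &= -\beta_4\xi^3 + 3\beta_4\xi^2 x - 3\beta_4\xi x^2 + \beta_4 x^3 \\
f(x) &= \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 (x - \xi)^3 \\
&= (\beta_0 - \beta_4\xi^3) + (\beta_1 + 3\beta_4\xi^2) x + (\beta_2 - 3\beta_4\xi) x^2 + (\beta_3 + \beta_4) x^3
\end{align}

Reading off the coefficients of $1$, $x$, $x^2$ and $x^3$ gives the four values listed above.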
\begin{align}
f_2(\xi) &= (\beta_0 - \beta_4 \xi^3) + (\beta_1 + 3\beta_4\xi^2)\xi + (\beta_2 - 3\beta_4\xi)\xi^2 + (\beta_3 + \beta_4)\xi^3 \\
&= \beta_0 - \beta_4\xi^3 + \beta_1\xi + 3\beta_4\xi^3 + \beta_2\xi^2 - 3\beta_4\xi^3 + \beta_3\xi^3 + \beta_4\xi^3 \\
&= \beta_0 + \beta_1\xi + \beta_2\xi^2 + \beta_3\xi^3 \\
&= f_1(\xi)
\end{align}
\begin{align}
f'_2(\xi) &= (\beta_1 + 3\beta_4\xi^2) + 2(\beta_2 - 3\beta_4\xi)\xi + 3(\beta_3 + \beta_4)\xi^2 \\
&= \beta_1 + 3\beta_4\xi^2 + 2\beta_2\xi - 6\beta_4\xi^2 + 3\beta_3\xi^2 + 3\beta_4\xi^2 \\
&= \beta_1 + 2\beta_2\xi + 3\beta_3\xi^2 \\
&= f'_1(\xi)
\end{align}
\begin{align}
f''_2(\xi) &= 2(\beta_2 - 3\beta_4\xi) + 6(\beta_3 + \beta_4)\xi \\
&= 2\beta_2 - 6\beta_4\xi + 6\beta_3\xi + 6\beta_4\xi \\
&= 2\beta_2 + 6\beta_3\xi \\
&= f''_1(\xi)
\end{align}
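The three identities can also be checked numerically. The following is a minimal sketch in plain Python; the coefficient values, the knot location and the helper name `poly` are arbitrary choices made for illustration and are not part of the exercise.

```python
import math

# Hypothetical coefficients beta_0..beta_4 and knot xi, chosen only for illustration.
b0, b1, b2, b3, b4 = 1.0, -2.0, 0.5, 0.3, 1.7
xi = 0.8

def f(x):
    """Cubic spline with a single knot at xi, in truncated power form."""
    return b0 + b1 * x + b2 * x**2 + b3 * x**3 + b4 * max(x - xi, 0.0) ** 3

# Coefficients (constant term first) of the two cubic pieces derived above.
f1 = (b0, b1, b2, b3)
f2 = (b0 - b4 * xi**3, b1 + 3 * b4 * xi**2, b2 - 3 * b4 * xi, b3 + b4)

def poly(coefs, x, k=0):
    """Evaluate the k-th derivative of a polynomial at x."""
    return sum(
        math.perm(n, k) * c * x ** (n - k)
        for n, c in enumerate(coefs)
        if n >= k
    )

# The spline agrees with f1 to the left of the knot and with f2 to the right.
left = [xi - 0.1 * i for i in range(21)]
right = [xi + 0.1 * i for i in range(1, 21)]
agree_left = all(abs(f(x) - poly(f1, x)) < 1e-9 for x in left)
agree_right = all(abs(f(x) - poly(f2, x)) < 1e-9 for x in right)

# Gaps in the value, first and second derivative at the knot (all should be ~0).
gaps = [abs(poly(f1, xi, k) - poly(f2, xi, k)) for k in range(3)]

# The third derivative is allowed to jump; the jump equals 6 * beta_4.
third_jump = poly(f2, xi, 3) - poly(f1, xi, 3)

print(agree_left, agree_right, gaps, third_jump)
```

Only the third derivative is discontinuous at the knot, which is why a cubic spline is said to be continuous up to its second derivative.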
We’re just going to describe the solutions instead of drawing them.
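If $\lambda = \infty$ and $m = 0$, then the penalty term must vanish, which forces $\int [g(x)]^2 \, dx = 0$ and hence $g = 0$. So $\hat{g}$ is the zero function.
In this case ($\lambda = \infty$, $m = 1$), the integral is minimized by $g' = 0$, so $\hat{g}$ is a constant function. Since the constant that minimizes the RSS is the sample mean, $\hat{g}(x) = \bar{y}$.
With $m = 2$ this is the smoothing spline criterion discussed in the chapter. For $\lambda = \infty$ the penalty forces $g'' = 0$, so $\hat{g}$ is the ordinary least squares line.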
Now the integral penalty forces $g''' = 0$, so $\hat{g}$ is necessarily a polynomial of degree $\leqslant 2$, namely the least squares quadratic fit. It need not have degree exactly 2: one can imagine datasets where a line already minimizes the RSS term, in which case the quadratic coefficient is zero. But every line is also a polynomial of degree $\leqslant 2$, so the quadratic fit never has larger RSS than the linear fit, and for typical data $\hat{g}$ will be a genuine quadratic.
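The $\lambda = \infty$ cases for $m = 1, 2, 3$ can be illustrated with ordinary polynomial least squares, since the penalty restricts $g$ to polynomials of degree at most $m - 1$. This is a minimal sketch assuming NumPy is available; the data are simulated and the helper name `limiting_fit` is made up for this example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical data, simulated only for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

def limiting_fit(m):
    """Least squares polynomial of degree m - 1, i.e. the lambda = infinity
    solution when the penalty is on the m-th derivative (m >= 1)."""
    coefs = P.polyfit(x, y, deg=m - 1)   # coefficients, constant term first
    fitted = P.polyval(x, coefs)
    return coefs, float(np.sum((y - fitted) ** 2))

fits = {m: limiting_fit(m) for m in (1, 2, 3)}
rss = [fits[m][1] for m in (1, 2, 3)]

for m in (1, 2, 3):
    print(f"m = {m}: degree <= {m - 1}, coefficients = {fits[m][0]}, RSS = {fits[m][1]:.4f}")
```

For $m = 1$ the single fitted coefficient coincides with the sample mean of `y`, and the RSS values are nonincreasing in $m$ because the candidate sets are nested.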
In this case ($\lambda = 0$), the penalty term vanishes, so the value of $m$ is irrelevant. Now we are minimizing the RSS over all functions $g$, so $\hat{g}$ will be any function which has zero RSS (i.e. which has $g(x_i) = y_i$ for all $i$). For example, the interpolation spline, or a step (piecewise constant) function which passes through the $y_i$, will have zero RSS. Such a function isn't unique.
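The non-uniqueness is easy to see on a toy example. The sketch below assumes NumPy and uses four made-up data points with distinct $x_i$; it builds a piecewise linear interpolant and a step function (both named here only for illustration), each with zero RSS, and shows that they disagree between the data points.

```python
import numpy as np

# Hypothetical training data with distinct x values.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

def g_linear(t):
    """Piecewise linear interpolant through the training points."""
    return np.interp(t, x, y)

def g_step(t):
    """Piecewise constant function: takes the y value of the nearest
    training point at or to the left of t."""
    idx = np.searchsorted(x, t, side="right") - 1
    return y[np.clip(idx, 0, len(y) - 1)]

rss_linear = float(np.sum((y - g_linear(x)) ** 2))
rss_step = float(np.sum((y - g_step(x)) ** 2))

# Both functions fit the training data exactly but differ in between.
midpoint = 0.5
print(rss_linear, rss_step, g_linear(midpoint), g_step(midpoint))
```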
skip
skip
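As $\lambda \to \infty$, the integral must approach 0. Then $\hat{g}_1$ approaches a polynomial of degree at most 2, and $\hat{g}_2$ approaches a polynomial of degree at most 3. Since every polynomial of degree at most 2 is also a polynomial of degree at most 3, we expect $\hat{g}_2$ to have the smaller training RSS.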
Since , we expect
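A small simulation can make the $\lambda \to \infty$ comparison concrete. This is only a sketch under stated assumptions: NumPy is available, the limiting fits are taken to be the least squares quadratic and cubic, and the data come from a made-up cubic-plus-noise model. The training ordering always holds because the two models are nested; the test ordering depends on the true relationship, so the printed test values describe this simulated setup only.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical data-generating model, chosen only for illustration.
rng = np.random.default_rng(1)

def sample(n):
    x = rng.uniform(-1.0, 1.0, size=n)
    y = 1.0 - x + 2.0 * x**3 + rng.normal(scale=0.3, size=n)
    return x, y

x_train, y_train = sample(40)
x_test, y_test = sample(200)

def rss(coefs, x, y):
    """Residual sum of squares of a polynomial fit on the data (x, y)."""
    return float(np.sum((y - P.polyval(x, coefs)) ** 2))

g1 = P.polyfit(x_train, y_train, deg=2)  # limit under the third-derivative penalty
g2 = P.polyfit(x_train, y_train, deg=3)  # limit under the fourth-derivative penalty

train_rss = (rss(g1, x_train, y_train), rss(g2, x_train, y_train))
test_rss = (rss(g1, x_test, y_test), rss(g2, x_test, y_test))

print("training RSS (g1, g2):", train_rss)
print("test RSS     (g1, g2):", test_rss)
```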
For $\lambda = 0$, the roughness penalty vanishes and both $\hat{g}_1$ and $\hat{g}_2$ can be any function with zero RSS (see Exercise 2 e. above). Such functions aren't uniquely defined, so in the absence of a rule for choosing $\hat{g}_1$ and $\hat{g}_2$ in this case, we can't answer the question.