Maths question on mean squared error being derived into bias and variance

0

I am reading a book and have difficulty understanding the math on the bias-variance tradeoff. Below is the section that I am having trouble with:

Given a set of training samples $x_1, x_2, ..., x_n$ and their targets $y_1, y_2, ..., y_n$, we want to find a regression function, $\hat{y}(x)$, which estimates the true relation $y(x)$ as correctly as possible. We measure the error of estimation, how good (or bad) the regression model is by mean squared error ($MSE$): Formula

I can derive the mean squared error using partial derivatives and the concept of slope. I also understand that the goal is to minimize $MSE$, i.e. the total error, and I know the basic statistics of expected values.

Yet I have been stuck for a week trying to find the relevant math and statistical concepts behind this formula.

The question is, what are the relevant math and statistical concepts behind this formula?

For example, how

$MSE = E[(y-\hat{y} )^2]$

becomes:

$= E[(y-E[\hat{y} ] + E[\hat{y}] - \hat{y} )^2]$

Thank you! I can see that the expression is unchanged after adding and subtracting $E[\hat{y}]$. The formula is then expanded according to $(a+b)^2 = a^2 + 2ab + b^2$, where

$2ab = E[2(y - E[\hat{y}])(E[\hat{y}] - \hat{y})]$

Why does $2ab$ become

$2(y - E[\hat{y}])(E[\hat{y}] - E[\hat{y}])$?

Carch

Posted 2018-07-28T05:39:58.563

Reputation: 13

Because adding and subtracting $E[\hat y]$ does not affect the result. – Emre – 2018-07-28T18:12:10.487

Thank you. However, I am still having trouble understanding the following maths. I can see that the expression is unchanged after adding and subtracting $E[\hat{y}]$. The formula is then expanded according to $(a+b)^2 = a^2 + 2ab + b^2$, where $2ab = E[2(y - E[\hat{y}])(E[\hat{y}] - \hat{y})]$. Why does $2ab$ become $2(y - E[\hat{y}])(E[\hat{y}] - E[\hat{y}])$? – Carch – 2018-07-29T04:24:09.013

Answers

1

Because $y - E[\hat{y}]$ is a constant. Anything that is constant is left unchanged by the expectation $E$, which is why you can "move" it outside.

For example:

If $y$ is a constant or is known, $E$ doesn't affect it, so you can "move" it outside the expectation and write just $y$.

The only thing that is unknown is the estimator $\hat{y}$, which is why you have $E[\hat{y}]$ and not just $\hat{y}$.
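Written out as a sketch, assuming $y$ is a fixed, known value at a given $x$ and only $\hat{y}$ is random, and using $c = y - E[\hat{y}]$ as a shorthand (my notation, not the book's) for the constant factor, the cross term goes like this:

$E[2(y - E[\hat{y}])(E[\hat{y}] - \hat{y})] = 2c \, E[E[\hat{y}] - \hat{y}]$

$= 2c \, (E[E[\hat{y}]] - E[\hat{y}])$

$= 2c \, (E[\hat{y}] - E[\hat{y}]) = 0$

The second line uses linearity of expectation, and the third uses the fact that $E[\hat{y}]$ is itself a constant, so $E[E[\hat{y}]] = E[\hat{y}]$. With the cross term gone, what remains is $MSE = (y - E[\hat{y}])^2 + E[(E[\hat{y}] - \hat{y})^2]$, i.e. squared bias plus variance.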

That's what's happening here.
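If it helps, the same thing can be checked numerically. This is only a minimal sketch under my own assumptions: the target `y_true` is a fixed number, the estimator's outputs are simulated as Gaussian draws (the mean 2.5 and spread 0.8 are made up for illustration), and `decompose_mse` is a hypothetical helper name. The sample mean stands in for $E[\hat{y}]$.

```python
import random


def decompose_mse(y_true, y_hat_samples):
    """Return (mse, bias_sq, variance, cross) for a fixed target y_true."""
    n = len(y_hat_samples)
    mean_hat = sum(y_hat_samples) / n  # sample stand-in for E[y_hat]

    # MSE: average squared difference between the fixed target and each estimate
    mse = sum((y_true - yh) ** 2 for yh in y_hat_samples) / n

    # (y - E[y_hat])^2 is a constant, so no averaging is needed
    bias_sq = (y_true - mean_hat) ** 2

    # E[(E[y_hat] - y_hat)^2]: spread of the estimates around their own mean
    variance = sum((mean_hat - yh) ** 2 for yh in y_hat_samples) / n

    # Cross term E[2 (y - E[y_hat]) (E[y_hat] - y_hat)], which should vanish
    cross = sum(2 * (y_true - mean_hat) * (mean_hat - yh)
                for yh in y_hat_samples) / n
    return mse, bias_sq, variance, cross


random.seed(0)  # fixed seed so the run is repeatable
y_true = 3.0    # assumed fixed, known target
y_hat_samples = [random.gauss(2.5, 0.8) for _ in range(10_000)]

mse, bias_sq, variance, cross = decompose_mse(y_true, y_hat_samples)
print("MSE:              ", mse)
print("bias^2 + variance:", bias_sq + variance)
print("cross term:       ", cross)
```

The cross term comes out as zero up to floating-point rounding, and the MSE matches bias squared plus variance, because the constant factor `(y_true - mean_hat)` multiplies deviations that average to zero.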

I found this on Cross Validated, which might be clearer.

RLave

Posted 2018-07-28T05:39:58.563

Reputation: 258