I understand the main principle of bagging and boosting for classification and regression trees. My questions concern hyperparameter optimization, especially the depth of the trees.
First question: why are we supposed to use weak learners (shallow trees, high bias) for boosting, whereas we are told to use deep trees (high variance) for bagging? Honestly, I'm not sure about the second claim; I heard it once but have never seen any documentation for it.
Second question: why and how can it happen that grid searches give better results for gradient boosting with deeper trees than with weak learners (and, conversely, better results with shallow trees than with deep trees for random forests)?
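To make the second question concrete, here is a minimal sketch of the kind of grid search I mean, assuming scikit-learn and a synthetic dataset (the grid values and dataset parameters are purely illustrative):

```python
# Illustrative only: tune max_depth separately for gradient boosting
# and for a random forest on the same synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Candidate depths, from stumps (weak learners) to fairly deep trees.
param_grid = {"max_depth": [1, 2, 4, 8]}

for name, model in [
    ("gradient boosting", GradientBoostingClassifier(random_state=0)),
    ("random forest", RandomForestClassifier(random_state=0)),
]:
    search = GridSearchCV(model, param_grid, cv=3).fit(X, y)
    print(name, "best max_depth:", search.best_params_["max_depth"])
```

In searches like this, I sometimes see the boosting model select one of the larger depths and the forest a smaller one, which seems to contradict the weak-learner/deep-tree rule of thumb.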