I have an imbalanced dataset with 3 classes: 60% class 1, 38% class 2, and 2% class 3.
I don't want to generate more examples of class 3, and I cannot collect more of them.
The problem is that I need to choose between Random Forest and Extra Trees (this is homework) and explain why I chose one of them.
So I chose the Random Forest classifier, but I am not sure whether my assumptions are right or not.
I chose it for two reasons. First, the splits in Extra Trees are random, so the probability that a split picks out some examples of class 3 is low. Second (and this is the real question), I think Random Forest has higher variance than Extra Trees, and that higher variance could be useful when the dataset is imbalanced.
So are these two assumptions, especially the last one, correct? Was I right to choose Random Forest over Extra Trees?
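
One way to test these assumptions empirically, rather than by reasoning alone, is to compare the per-class F1 score of both models under stratified cross-validation. The following is only a minimal sketch: it assumes scikit-learn, and the data is a synthetic stand-in from `make_classification` with roughly the same 60% / 38% / 2% class proportions, not my real dataset. The sample size, feature counts, and number of trees are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in (assumption): 3 classes at roughly 60% / 38% / 2%.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    weights=[0.60, 0.38, 0.02],
    random_state=0,
)

# Stratified folds keep the rare class present in every train/test split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

per_class_f1 = {}
for name, model in models.items():
    # Out-of-fold predictions for every sample.
    y_pred = cross_val_predict(model, X, y, cv=cv)
    # average=None returns one F1 value per class (class 3 is the last entry).
    per_class_f1[name] = f1_score(y, y_pred, average=None, zero_division=0)
    print(name, per_class_f1[name].round(3))
```

Looking at the F1 of the rare class separately matters here, because overall accuracy is dominated by classes 1 and 2 and can look good even if class 3 is never predicted.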