I'm using an experimental design to test the robustness of different classification methods, and I'm looking for the correct name for this kind of design.
I create several subsets of the full dataset by removing some samples. Each subset is created independently of the others. I then run each classification method on every subset. Finally, for each sample, I estimate the accuracy of a method as the fraction of subsets containing that sample in which its classification agrees with the classification on the full dataset. For example:
Classification-full     1  2  3  2  1    1  2
Classification-subset1  1  2  -  2  3    1  -
Classification-subset2  -  2  3  -  1    1  2
...
Accuracy                1  1  1  1  0.5  1  1

(A dash marks a sample that was removed from that subset.)
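In case it makes the procedure clearer, here is a minimal Python sketch of how I compute the accuracy row. The function name `per_sample_agreement` and the use of `None` for a removed sample are just illustrative choices, and the positions of the removed samples are an assumption, picked so that the labels line up with the accuracy values in the example.

```python
# Labels from the example; None marks a sample that was cut from a subset.
# The placement of the None entries is an assumption chosen to be
# consistent with the accuracy row of the example.
full = [1, 2, 3, 2, 1, 1, 2]
subsets = [
    [1, 2, None, 2, 3, 1, None],   # Classification-subset1
    [None, 2, 3, None, 1, 1, 2],   # Classification-subset2
]


def per_sample_agreement(full, subsets):
    """For each sample, return the fraction of subsets containing it
    whose label matches the label obtained on the full dataset."""
    scores = []
    for i, ref in enumerate(full):
        # Keep only the subsets in which sample i was actually present.
        labels = [s[i] for s in subsets if s[i] is not None]
        if labels:
            scores.append(sum(lab == ref for lab in labels) / len(labels))
        else:
            scores.append(None)  # sample never appeared in any subset
    return scores


accuracy = per_sample_agreement(full, subsets)
print(accuracy)  # [1.0, 1.0, 1.0, 1.0, 0.5, 1.0, 1.0]
```

Sample 5 gets 0.5 because it appears in both subsets but agrees with the full-data label in only one of them.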
Is there an established name for this methodology? I thought it might fall under bootstrapping, but I'm not sure.