I am trying to learn data modeling by working on a dataset from a Kaggle competition. Since the competition closed two years ago, I am asking my question here. The competition uses AUC-ROC as the evaluation metric. It is a classification problem with 5 labels, which I am modeling as 5 independent binary classification problems. The data is highly imbalanced across labels; in one case, the imbalance is 333:1. While researching how to interpret the AUC-ROC metric, I found this and this. Both articles essentially say that AUC-ROC is not a good metric for an imbalanced dataset. So why would the competition use this metric to evaluate models? Is it even a reasonable metric in this context? If so, why?
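To make the concern concrete, here is a minimal sketch of what I understand the articles to be saying. The scores are made up for illustration (they are not the competition data), and the helper names `roc_auc` and `precision_at` are my own. AUC-ROC is computed as the fraction of (positive, negative) pairs that the scores rank correctly, and the negatives are simply repeated to reach a 333:1 ratio.

```python
# Toy scores, invented for illustration only (not the competition data).
POS = [0.9, 0.8, 0.7, 0.4]            # scores given to positive examples
NEG_BALANCED = [0.6, 0.5, 0.3, 0.2]   # scores given to negative examples
NEG_IMBALANCED = NEG_BALANCED * 333   # same score distribution, 333:1 ratio


def roc_auc(pos_scores, neg_scores):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney U formulation).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def precision_at(pos_scores, neg_scores, threshold):
    """Precision when every score >= threshold is predicted positive."""
    tp = sum(s >= threshold for s in pos_scores)
    fp = sum(s >= threshold for s in neg_scores)
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    for name, neg in [("1:1", NEG_BALANCED), ("333:1", NEG_IMBALANCED)]:
        auc = roc_auc(POS, neg)
        prec = precision_at(POS, neg, threshold=0.45)
        print(f"{name:>6}  AUC-ROC={auc:.3f}  precision@0.45={prec:.4f}")
```

In this toy setup the AUC-ROC is 0.875 at both class ratios, while precision at the same threshold falls from 0.6 to below 0.005. That seems to be the behaviour the articles warn about: the metric reflects how well positives are ranked above negatives, but not how many false positives the minority class is buried under.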