Multiclass classification on an imbalanced dataset: accuracy, micro F1, or macro F1?



I have a multiclass classification problem in which each instance is assigned to exactly one class. My dataset is highly imbalanced. I know that accuracy is not a good metric in this case, because a model can simply predict the most frequent class and still get a good score. I understood that micro F1 is better than macro F1 for multiclass problems, but it turns out that for single-label multiclass classification the micro F1 score is identical to the accuracy score. So the whole idea of using an alternative metric (micro F1) instead of accuracy has gone full circle.
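The equality is easy to verify with scikit-learn. A minimal sketch on made-up data (the labels below are purely illustrative):

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy imbalanced 3-class problem; each sample has exactly one label.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 0, 1, 0, 2]

acc = accuracy_score(y_true, y_pred)                # fraction of exact matches
micro = f1_score(y_true, y_pred, average="micro")   # global TP/FP/FN counts
macro = f1_score(y_true, y_pred, average="macro")   # unweighted mean of per-class F1

print(acc, micro, macro)  # acc == micro; macro weights each class equally
```

In the single-label case every error is simultaneously a false positive for one class and a false negative for another, so micro precision, micro recall, micro F1, and accuracy all collapse to the same number. Macro F1 does not, because the minority classes count as much as the majority class.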

Should I be using macro F1 instead?


Posted 2019-05-11T22:40:28.447

Reputation: 151

Without hesitation, especially in cases where you are interested in the positive class (which is usually also the minority class). – user_007 – 2020-02-18T17:10:56.210



There are two metrics, not so widely known in the data science community, that work well for imbalanced data and can be used with multi-class data: Cohen's kappa and the Matthews correlation coefficient (MCC).

Cohen's kappa is a statistic that was designed to measure inter-annotator agreement, but it can also be used to measure agreement between the ground truth and a prediction. There are a number of explanations online, e.g. on Wikipedia or here, and it is implemented in scikit-learn.
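A minimal sketch of why kappa is robust to imbalance (the labels are made up for illustration): a classifier that always predicts the majority class gets a decent accuracy but a kappa of zero, because kappa subtracts the agreement expected by chance.

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0] * 10  # degenerate model: always predict the majority class

print(accuracy_score(y_true, y_pred))      # 0.6 — looks passable
print(cohen_kappa_score(y_true, y_pred))   # 0.0 — no agreement beyond chance
```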

MCC was initially designed for binary classification but was later generalized to multi-class data. There are also multiple online sources for MCC, e.g. Wikipedia and here, and it is implemented in scikit-learn.
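MCC behaves the same way on the degenerate majority-class predictor (again, a sketch with made-up labels): scikit-learn's `matthews_corrcoef` returns 0 when the prediction carries no information about the true class.

```python
from sklearn.metrics import matthews_corrcoef

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0] * 10  # always predict the majority class

print(matthews_corrcoef(y_true, y_pred))  # 0.0 — uninformative predictor
```

Like kappa, MCC ranges up to 1 for perfect prediction, so either metric gives you a single score that is not inflated by class imbalance.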

Hope this helps.

David Makovoz

Posted 2019-05-11T22:40:28.447

Reputation: 351