Is the Gini coefficient a good metric for measuring predictive model performance on highly imbalanced data?


I am evaluating a credit risk model that predicts the likelihood of customers defaulting on their mortgage accounts. The model is a logistic regression estimator and was built by another team. They use the Gini metric to measure the performance of the model, and they achieved 87%. Upon evaluation, I found that the recall was 51%, whilst the error rate of the non-rare-event class (do not default) was 0.9%. Am I correct in thinking that the Gini is a misleading metric in this case, because it does not show the extremely poor predictive performance on the rare-event class? I have questioned them about this and recommended that they also use precision/recall metrics, confusion matrices and a precision-recall trade-off graph, but they quickly dismissed me.

Any advice would be much appreciated.


Posted 2017-06-15T20:15:12.750

Reputation: 33



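The Gini coefficient can also be expressed in terms of the area under the ROC curve (AUC): G = 2*AUC - 1. The ROC curve, on the other hand, can be affected by class imbalance through the false positive rate FP/(FP+TN): when negatives vastly outnumber positives, the false positive rate stays small even if the false positives are numerous compared with the true positives. If the number of negatives is a lot larger, this could be a potential issue.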

In short, the Gini coefficient has much the same pros and cons as the ROC AUC metric.
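As a rough illustration of the relationship G = 2*AUC - 1, here is a minimal pure-Python sketch. It computes the AUC as the fraction of (positive, negative) pairs in which the positive gets the higher score, counting ties as half. The labels and scores are made-up toy values, and the function names `roc_auc` and `gini` are only illustrative, not taken from any library.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from the AUC: G = 2*AUC - 1."""
    return 2.0 * roc_auc(labels, scores) - 1.0


# Toy data (hypothetical): 1 = default, 0 = no default; score = predicted risk.
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]

# 20 of the 21 positive/negative pairs are ordered correctly,
# so AUC = 20/21 and Gini = 19/21.
print(f"AUC  = {roc_auc(labels, scores):.4f}")
print(f"Gini = {gini(labels, scores):.4f}")
```

The pairwise loop is quadratic, which is fine for a toy example; on real data a rank-based computation or a library routine would be the usual choice.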


Posted 2017-06-15T20:15:12.750

Reputation: 5 477

Nonsense. One of the main motivations of AUC is to work well for data with skewed class priors and misclassification costs. – cs0815 – 2021-02-11T22:15:13.910

What exactly is nonsense? The relationship between the Gini coefficient and the ROC AUC is a mathematical fact. If the imbalance is severely skewed towards the negative class, there are scenarios where the ROC AUC might be overly optimistic. I'm simply pointing that out as a "potential issue", not saying it always has to be. – oW_ – 2021-02-12T00:40:12.337

The AUC, based on the ROC curve, is chosen when there are skewed class priors and misclassification costs. Thus I do not agree with this part: "influenced by class imbalance" (sorry, I should not have used the word nonsense). – cs0815 – 2021-02-12T08:27:30.203


To my understanding, the Gini coefficient should not be a bad metric for imbalanced classification, because it is directly related to AUC, which works just fine. Maybe the other team meant Gini impurity rather than the Gini coefficient. Check the AUC of the predictions yourself. Also, the area under the precision-recall (PR) curve is a better metric for imbalanced classification than ROC AUC, so you may want to look at that too.
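To see why precision-based metrics react to imbalance while a ROC operating point does not, here is a small sketch using the recall (51%) and negative-class error rate (0.9%) quoted in the question. The class counts are assumptions, since the question does not give the default rate, and `rates_to_precision` is a hypothetical helper name.

```python
def rates_to_precision(recall, fpr, n_pos, n_neg):
    """Precision implied by a fixed ROC point (recall, FPR) and given class counts."""
    tp = recall * n_pos   # true positives caught at this operating point
    fp = fpr * n_neg      # negatives wrongly flagged at this operating point
    return tp / (tp + fp)


# Operating point from the question; class counts below are assumed.
recall, fpr = 0.51, 0.009

for n_pos, n_neg in [(100, 10_000), (1_000, 10_000), (5_000, 10_000)]:
    precision = rates_to_precision(recall, fpr, n_pos, n_neg)
    print(f"positives={n_pos:>5}, negatives={n_neg}: precision = {precision:.3f}")
```

The recall and false positive rate are identical in all three rows, so the point on the ROC curve does not move, yet the implied precision falls from about 0.97 to about 0.36 as the positives become rarer.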

Dhruv Mahajan

Posted 2017-06-15T20:15:12.750

Reputation: 338


Credit models do not do a great job of predicting individual defaults, and the error rates are usually high. That is, a fairly high proportion of dubious borrowers do not default. One can always reduce this proportion by making the cutoff more generous, so that only the worst borrowers are left in the "bad" pool; but the necessary tradeoff is that more borrowers must be put into the "good" pool, so more defaults occur in the "good" pool.

The Gini (or the roughly equivalent AUC) is a reasonable tool for assessing the performance of the model across the whole range of credit cutoffs, but in practice this is not usually what we want. We really want to make our lending business profitable, which means we have to consider how much profit we make from good mortgages and how much we lose from defaults. The best model is the one that gives the best tradeoff between these. This has nothing to do with our success at predicting individual defaults, which is why the Gini is not really useful.

Because the costs and profit numbers are specific to each lender, it is quite possible that Model A will work better than Model B for one lender, while Model B will work better than Model A for another lender. There is no model that is best for every lender.
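A toy sketch of this point, under loudly stated assumptions: every applicant, score, profit and loss figure below is invented for illustration, and `best_profit` is a hypothetical helper. Each lender approves everyone whose predicted default risk falls below a cutoff and picks the cutoff that maximizes its own profit.

```python
def best_profit(labels, scores, profit_good, loss_default):
    """Best achievable profit over all score cutoffs.

    Applicants are approved in order of increasing predicted risk; approving
    nobody (profit 0) is allowed. Assumes all scores are distinct.
    """
    ranked = sorted(zip(scores, labels))  # lowest predicted risk first
    best = running = 0
    for _, defaulted in ranked:
        running += -loss_default if defaulted == 1 else profit_good
        best = max(best, running)
    return best


# Hypothetical portfolio: applicants 0-6 repay (0), applicants 7-9 default (1).
labels = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
# Model A ranks five good borrowers safest, then mixes in two defaulters early.
scores_a = [0.01, 0.02, 0.03, 0.04, 0.05, 0.08, 0.09, 0.06, 0.07, 0.10]
# Model B lets one defaulter in early but ranks the other two last.
scores_b = [0.01, 0.02, 0.03, 0.05, 0.06, 0.07, 0.08, 0.04, 0.09, 0.10]

# Lender 1: a default is very costly relative to the profit on a good loan.
print("Lender 1:", best_profit(labels, scores_a, 1, 20), best_profit(labels, scores_b, 1, 20))
# Lender 2: a default is only moderately costly.
print("Lender 2:", best_profit(labels, scores_a, 2, 3), best_profit(labels, scores_b, 2, 3))
```

With these made-up numbers, Lender 1 earns 5 with Model A versus 3 with Model B, while Lender 2 earns 10 with Model A versus 11 with Model B, so the preferred model flips with the lender's economics.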


Posted 2017-06-15T20:15:12.750

Reputation: 21