Is it wrong to use average='weighted' when there are only 2 classes?


In the book 'Text Analytics with Python', the author provides

In the .py file, he does:

metrics.precision_score(true_labels, predicted_labels, average='weighted')

I have two questions regarding it:

1- According to the documentation, average='weighted' should only be used when there are more than 2 classes, right? Why is he using average='weighted' with only 2 classes?

2- Why do I get different results when I run the following?

print('Recall:', metrics.recall_score(test_sentiments, predicted_sentiments, pos_label='positive'))
print('Recall:', metrics.recall_score(test_sentiments, predicted_sentiments, average='weighted'))

(I only have 2 classes in the data.)

PS: I think that by using average='weighted' he is getting the wrong result, because the code doesn't know which one is the positive class. Here is a link to the code of


Posted 2020-07-23T12:32:51.350




As you already know, a precision score (or recall, or F-score) is computed for a single class, and the function's pos_label argument says which class.

Now I'm going to guess that when pos_label is not provided and average is provided instead, the function probably calculates the metric for every class and then returns the average of these values.
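
One way to check this guess is to ask for the per-class values directly. The sketch below uses made-up toy labels (not the data from the question) and assumes scikit-learn is installed. With average=None, recall_score returns one recall per class, which can be compared with the values obtained through pos_label.

```python
from sklearn import metrics

# Toy labels, invented for illustration: 6 true positives, 4 true negatives.
y_true = ['positive'] * 6 + ['negative'] * 4
y_pred = ['positive'] * 5 + ['negative'] * 3 + ['positive'] * 2

# Recall of a single class, chosen with pos_label
# (average='binary' is the default, so it is not passed here).
recall_pos = metrics.recall_score(y_true, y_pred, pos_label='positive')
recall_neg = metrics.recall_score(y_true, y_pred, pos_label='negative')

# average=None returns one recall per class, in the order given by `labels`.
per_class = metrics.recall_score(
    y_true, y_pred, average=None, labels=['negative', 'positive']
)

print('positive:', recall_pos)
print('negative:', recall_neg)
print('per class (negative, positive):', per_class)
```

If the guess is right, the two per-class entries match the two pos_label results, and any averaged score is just some combination of them.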

A weighted average can be calculated with any number of classes, and since no predefined weights are provided, we can reasonably assume that the function uses the proportion of each class as its weight. So the result of the call with average='weighted' is probably the weighted average (by proportion of instances) of the two recall values obtained by setting pos_label to each of the two classes in turn.
If I'm not mistaken, this is equivalent to the micro-average recall.
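
To see why weighting by class proportion ends up at the micro-average, here is a small by-hand computation in plain Python. The helper recall_for and the toy labels are made up for illustration; this is a sketch of the arithmetic, not scikit-learn's actual implementation.

```python
from collections import Counter


def recall_for(label, y_true, y_pred):
    """Recall of one class: correct predictions of `label` / true instances of `label`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    support = sum(1 for t in y_true if t == label)
    return hits / support


# Toy labels, invented for illustration: 6 true positives, 4 true negatives.
y_true = ['positive'] * 6 + ['negative'] * 4
y_pred = ['positive'] * 5 + ['negative'] * 3 + ['positive'] * 2

support = Counter(y_true)  # number of true instances per class
n = len(y_true)

# Weighted average: each class recall is weighted by its share of the data.
weighted_recall = sum(
    support[c] / n * recall_for(c, y_true, y_pred) for c in support
)

# Micro-average recall for single-label data: all correct predictions / all instances.
micro_recall = sum(t == p for t, p in zip(y_true, y_pred)) / n

print(weighted_recall, micro_recall)  # both are 0.7, up to floating-point rounding
```

Each class weight (support / n) cancels the denominator of that class's recall, so the sum reduces to the total number of correct predictions divided by n. For single-label data that is also the accuracy, which is why the weighted result differs from the recall of the positive class alone.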


Posted 2020-07-23T12:32:51.350
