## How to set class weights for imbalanced classes in Keras?

45

20

I know that Keras accepts a class_weight dictionary as a parameter when fitting, but I couldn't find any example. Would somebody be so kind as to provide one?

By the way, in this case, is the appropriate practice simply to weight up the minority class in proportion to its underrepresentation?

Is there a newer method for this in Keras? Why does the dictionary have three classes with weights 0: 1.0, 1: 50.0, 2: 2.0? Shouldn't it be 2: 1.0 as well? Chuck 2017-09-09T14:00:46.490

41

If you are talking about the regular case, where your network produces only one output, then your assumption is correct. In order to force your algorithm to treat every instance of class 1 as 50 instances of class 0 you have to:

1. Define a dictionary with your labels and their associated weights

class_weight = {0: 1.,
                1: 50.,
                2: 2.}

2. Feed the dictionary as a parameter:

# Keras 2 renamed nb_epoch to epochs; on Keras 1.x use nb_epoch=5 instead
model.fit(X_train, Y_train, epochs=5, batch_size=32, class_weight=class_weight)
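To illustrate what "treat every instance of class 1 as 50 instances of class 0" means, here is a minimal sketch in plain Python. It assumes the class weight simply scales each sample's cross-entropy loss; the helper `weighted_loss_sum` and the example probabilities are made up for illustration, and how Keras averages or normalizes the loss internally may differ.

```python
import math

def weighted_loss_sum(labels, probs, class_weight):
    """Sum of per-sample cross-entropy losses, each scaled by its class weight.

    probs[i] is the predicted probability assigned to the true class labels[i].
    """
    total = 0.0
    for y, p in zip(labels, probs):
        total += class_weight[y] * -math.log(p)
    return total

class_weight = {0: 1.0, 1: 50.0, 2: 2.0}
unit_weight = {0: 1.0, 1: 1.0, 2: 1.0}

# One (hypothetical) sample per class
labels = [0, 1, 2]
probs = [0.9, 0.2, 0.6]
weighted = weighted_loss_sum(labels, probs, class_weight)

# Same data, unweighted, with the class-1 sample repeated 50 times
# and the class-2 sample repeated twice
dup_labels = [0] + [1] * 50 + [2] * 2
dup_probs = [0.9] + [0.2] * 50 + [0.6] * 2
duplicated = weighted_loss_sum(dup_labels, dup_probs, unit_weight)

print(weighted, duplicated)  # the two sums agree up to floating-point rounding
```

Under this assumption, weighting a class by 50 contributes the same total loss as repeating each of its samples 50 times.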


Also have a look at https://github.com/fchollet/keras/issues/3653 if you're working with 3D data.

herve 2017-04-26T09:12:48.837

For me it gives an error: dict doesn't have a shape attribute. Flávio Filho 2017-05-23T00:11:08.297

I believe Keras could be changing the way this works; this is for the version of August 2016. I will verify for you in a week. layser 2017-05-25T14:12:47.580

Does this work for one-hot-encoded labels? megashigger 2018-01-08T19:49:49.183

38

You could simply use the class_weight utilities from sklearn:

1. Let's import the module first

import numpy as np
from sklearn.utils import class_weight

2. In order to calculate the class weight do the following

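# Keyword arguments are required by recent scikit-learn releases;
# the result is an array with one weight per entry of classes
class_weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y_train), y=y_train)
# Keras expects a dict; enumerate assumes integer labels 0..n_classes-1
class_weight_dict = dict(enumerate(class_weights))

3. Finally, pass it to the model fitting

model.fit(X_train, y_train, class_weight=class_weight_dict)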

For me, class_weight.compute_class_weight produces an array; I needed to convert it to a dict for it to work with Keras. More specifically, after step 2, use class_weight_dict = dict(enumerate(class_weight)). C.Lee 2017-10-13T04:33:48.690

This doesn't work for me. For a three-class problem in Keras, y_train is a (300096, 3) numpy array, so the class_weight= line gives me TypeError: unhashable type: 'numpy.ndarray'. Lembik 2017-12-14T10:25:09.850
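Regarding one-hot labels: a possible workaround, sketched here under the assumption that each row of y_train is a one-hot vector, is to recover integer class indices with np.argmax before computing the weights. The array `y_onehot` is made-up example data, and the weights follow the "balanced" heuristic n_samples / (n_classes * count_per_class), computed directly with NumPy instead of calling sklearn.

```python
import numpy as np

# Hypothetical one-hot labels for a three-class problem, shape (n_samples, 3)
y_onehot = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
])

# Recover one-dimensional integer class indices from the one-hot rows
y_int = np.argmax(y_onehot, axis=1)

# "Balanced" heuristic: n_samples / (n_classes * count_per_class).
# This divides by zero if some class never occurs in y_int.
counts = np.bincount(y_int, minlength=y_onehot.shape[1])
weights = len(y_int) / (len(counts) * counts)

# Plain dict with Python ints as keys, the form passed as class_weight
class_weight_dict = {int(i): float(w) for i, w in enumerate(weights)}
print(class_weight_dict)
```

The integer labels in `y_int` could also be passed to compute_class_weight, while the model itself keeps training on the one-hot array.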

18

I use this kind of rule for class_weight:

import numpy as np
import math

# labels_dict : {ind_label: count_label}
# mu : parameter to tune

def create_class_weight(labels_dict, mu=0.15):
    # list(...) is needed on Python 3, where dict.values() returns a view
    total = np.sum(list(labels_dict.values()))
    keys = labels_dict.keys()
    class_weight = dict()

    for key in keys:
        # log-scaled inverse frequency, clipped from below at 1.0
        score = math.log(mu * total / float(labels_dict[key]))
        class_weight[key] = score if score > 1.0 else 1.0

    return class_weight

# random labels_dict
labels_dict = {0: 2813, 1: 78, 2: 2814, 3: 78, 4: 7914, 5: 248, 6: 7914, 7: 248}

create_class_weight(labels_dict)


math.log smooths the weights for very imbalanced classes. This returns:

{0: 1.0,
1: 3.749820767859636,
2: 1.0,
3: 3.749820767859636,
4: 1.0,
5: 2.5931008483842453,
6: 1.0,
7: 2.5931008483842453}


Why use log instead of just dividing the count of samples for a class by the total number of samples? I assume there is something I don't understand about what goes into the class_weight parameter of model.fit_generator(...). startoftext 2017-05-04T03:11:19.593

@startoftext That's how I did it, but I think you have it inverted. I used n_total_samples / n_class_samples for each class. colllin 2017-10-19T17:34:11.100

In your example, class 0 (2813 examples) and class 6 (7914 examples) both have weight exactly 1.0. Why is that? Class 6 is a few times bigger! You would want class 0 to be upscaled and class 6 downscaled to bring them to the same level. Vladislavs Dovgalecs 2018-01-16T20:55:01.223
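For comparison, the plain inverse-frequency rule mentioned in the comments (n_total_samples / n_class_samples) can be sketched as below. The function name `inverse_frequency_weights` is made up for illustration, and the counts are the ones from the answer. Unlike the clipped log rule, this gives class 0 a larger weight than class 6, at the cost of very large weights for the rarest classes.

```python
# Class counts taken from the answer above: {label: number of samples}
labels_dict = {0: 2813, 1: 78, 2: 2814, 3: 78, 4: 7914, 5: 248, 6: 7914, 7: 248}

def inverse_frequency_weights(labels_dict):
    """Weight each class by n_total_samples / n_class_samples."""
    total = sum(labels_dict.values())
    return {label: total / count for label, count in labels_dict.items()}

weights = inverse_frequency_weights(labels_dict)

for label in sorted(weights):
    print(label, round(weights[label], 2))
```

With these weights, every class contributes the same total weight (count times weight equals n_total_samples), which is the "same level" the last comment asks for.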

9

To weight all classes equally, you can now simply set class_weight to "auto" like so:

model.fit(X_train, Y_train, nb_epoch=5, batch_size=32, class_weight = 'auto')

I couldn't find any reference to class_weight='auto' in the Keras documentation or in the source code. Can you show us where you found this? Fábio Perez 2017-03-03T19:26:50.427

1

This answer is probably wrong. Check this issue: https://github.com/fchollet/keras/issues/5116

Fábio Perez 2017-03-03T19:29:26.923

Odd. I was using class_balanced='auto' at the time I posted the comment, but I can't find a reference to it now. Perhaps it has been changed, as Keras has been evolving rapidly. David Groppe 2017-03-16T15:12:13.207