Why does frequency encoding work?



Frequency encoding is a widely used technique in Kaggle competitions, and it often proves to be a very reasonable way of dealing with categorical features of high cardinality. I really don't understand why it works.
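To make the technique concrete, here is a minimal sketch of frequency encoding with pandas; the column name and toy data are hypothetical:

```python
import pandas as pd

# Toy data: "city" stands in for a high-cardinality categorical feature.
df = pd.DataFrame({"city": ["NY", "LA", "NY", "SF", "NY", "LA"]})

# Frequency encoding: replace each category with its relative frequency.
freq = df["city"].value_counts(normalize=True)
df["city_freq"] = df["city"].map(freq)

print(df)
# "NY" occurs 3 of 6 times, so its rows get 0.5; "SF" gets 1/6.
```

Raw counts (`normalize=False`) are also common; normalized frequencies just make the values comparable across datasets of different sizes.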

Does it work in very specific cases where frequencies are correlated with the target or is it more general? What's the rationale behind it?

David Masip

Posted 2019-11-25T15:36:36.253

Reputation: 5 101

maybe this answer

– Nikos M. – 2020-07-26T08:26:24.413

@NikosM. Can you copy that answer here? – Itamar Mushkin – 2020-07-30T07:14:11.277



Check this post.

In cases where the frequency is somehow related to the target variable, it helps the model understand and assign weight in direct or inverse proportion, depending on the nature of the data.

Check also this thread.

What's the rationale behind it?

High cardinality may lead to the curse of dimensionality and actually decrease the quality of your model.
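This dimensionality point can be illustrated with a hypothetical 1,000-level categorical column: one-hot encoding produces one column per level, while frequency encoding collapses everything into a single numeric column:

```python
import pandas as pd

# Hypothetical categorical column with 1,000 distinct levels.
df = pd.DataFrame({"cat": [f"level_{i % 1000}" for i in range(5000)]})

# One-hot encoding: one indicator column per level.
n_onehot_cols = pd.get_dummies(df["cat"]).shape[1]

# Frequency encoding: a single numeric column, regardless of cardinality.
df["cat_freq"] = df["cat"].map(df["cat"].value_counts(normalize=True))

print(n_onehot_cols)  # 1000 indicator columns
print(df.shape[1])    # 2 columns: the original and its frequency
```

Whether the single frequency column preserves enough signal is exactly the matter of debate; it only helps when frequency carries information about the target.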

Piotr Rarus

Posted 2019-11-25T15:36:36.253

Reputation: 721

The second thread talks about something different, I think; I understand it is talking about target encoding. And regarding the curse of dimensionality, target encoding also helps solve it, yet frequency encoding is sometimes preferred. Why is that? – David Masip – 2019-11-25T16:13:13.403

Frequency may convey some actual domain knowledge. Stating it explicitly may help and keeps dimensionality low. All of this is a matter of debate and I wouldn't blindly follow it. – Piotr Rarus – 2019-11-25T16:15:19.633