We know that one-hot encoding increases the dimensionality of a dataset, but label encoding doesn’t. How?

4 years ago
Machine Learning

One-hot encoding increases the dimensionality of a dataset because it creates a separate binary (0/1) variable for every class of the categorical variable.
Example: Suppose there is a variable ‘Color’ with three levels: Yellow, Purple, and Orange. One-hot encoding ‘Color’ creates three different variables: Color.Yellow, Color.Purple, and Color.Orange.
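In label encoding, each class of a variable is replaced by an integer code (0, 1, 2, and so on) inside the same single column, so no new variables are created. Because those integers imply an order between classes, label encoding is usually best suited to binary or ordinal variables.
This is why one-hot encoding increases the dimensionality of the data and label encoding does not.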
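
To make the difference concrete, here is a minimal sketch in plain Python using the ‘Color’ example above. The sample values and the names `colors`, `one_hot`, and `label` are made up for illustration; in practice a library such as pandas or scikit-learn would normally do the encoding.

```python
# Hypothetical sample data for the 'Color' variable
colors = ["Yellow", "Purple", "Orange", "Purple"]

# Distinct classes in sorted order: Orange, Purple, Yellow
classes = sorted(set(colors))

# One-hot encoding: one 0/1 column per class
one_hot = {
    f"Color.{c}": [1 if value == c else 0 for value in colors]
    for c in classes
}

# Label encoding: a single column of integer codes
code_of = {c: i for i, c in enumerate(classes)}
label = {"Color": [code_of[value] for value in colors]}

print(len(one_hot))  # number of columns after one-hot encoding
print(len(label))    # number of columns after label encoding
```

One-hot encoding ends up with as many columns as there are classes, while label encoding keeps the original single column and only changes the values stored in it.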

Sanisha Maharjan
Jan 11, 2022