Categorical Variables and Cardinality


One approach to handling categorical variables is one-hot encoding.
But there is an important thing to keep in mind when using it: cardinality.
Cardinality is the number of unique values in a column.

It's generally better to one-hot encode only columns with low cardinality, because one-hot encoding adds one new column per unique value and can greatly expand the size of a large dataset.
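To make the size effect concrete, here is a minimal sketch in plain Python. The toy rows, the column names `color` and `user_id`, and the helper functions `cardinality` and `one_hot` are all made up for illustration; in practice a dataframe library would usually do this work.

```python
# Toy dataset (hypothetical): each row is a dict of column name -> value.
rows = [
    {"color": "red", "user_id": "u1"},
    {"color": "blue", "user_id": "u2"},
    {"color": "red", "user_id": "u3"},
    {"color": "green", "user_id": "u4"},
]


def cardinality(rows, column):
    """Return the number of unique values found in `column`."""
    return len({row[column] for row in rows})


def one_hot(rows, column):
    """Replace `column` with one 0/1 indicator column per unique value."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column unchanged.
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


color_card = cardinality(rows, "color")    # 3 unique colors
user_card = cardinality(rows, "user_id")   # 4 unique ids, one per row
encoded_rows = one_hot(rows, "color")      # "color" becomes 3 indicator columns
```

One-hot encoding `color` adds three columns here, while encoding `user_id` would add one column per row, which is why high-cardinality columns are poor candidates.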

High-cardinality columns can either be dropped from the dataset or encoded with label encoding, which assigns each unique value an integer.
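The choice between the two treatments can be sketched as follows. This is only an illustration under assumptions: the threshold `MAX_CARDINALITY`, the toy columns, and the helpers `split_by_cardinality` and `label_encode` are hypothetical, and the right cutoff depends on the dataset.

```python
def split_by_cardinality(columns, max_cardinality):
    """Split column names into (low, high) cardinality lists."""
    low, high = [], []
    for name, values in columns.items():
        if len(set(values)) <= max_cardinality:
            low.append(name)
        else:
            high.append(name)
    return low, high


def label_encode(values):
    """Map each unique value to an integer code, assigned in sorted order."""
    mapping = {value: code for code, value in enumerate(sorted(set(values)))}
    return [mapping[value] for value in values], mapping


# Toy columns (hypothetical): column name -> list of values.
columns = {
    "color": ["red", "blue", "red", "green"],
    "city": ["Oslo", "Lima", "Cairo", "Pune"],
}

MAX_CARDINALITY = 3  # assumed cutoff, chosen only for this example

low_cols, high_cols = split_by_cardinality(columns, MAX_CARDINALITY)

# Low-cardinality columns would be one-hot encoded; high-cardinality
# columns are either dropped or label encoded, as done here for "city".
city_codes, city_mapping = label_encode(columns["city"])
```

Note that the integer codes imply an ordering between categories that may not exist in the data, so whether label encoding is appropriate depends on the model being used.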



Mohammed Galalen
Software Engineer | Machine / Deep Learning Enthusiast