Text to Matrix: Feature Encoding

Algorithms can't read 'Blue'. Learn to translate human concepts into machine vectors.

Red   -> [1, 0, 0]
Green -> [0, 1, 0]
Blue  -> [0, 0, 1]

Encoder: Most machine learning algorithms only work with numbers. If you have a column containing 'Red', 'Green', and 'Blue', you MUST convert it to numbers first. But how?

Translation Matrix


One-Hot Encoding

Transforms a categorical column into multiple binary columns, one per category. Each row gets a 1 in the column for its own category and 0s everywhere else. Use this when the categories have no inherent order (e.g., 'Apple', 'Banana', 'Cherry').
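
As a minimal sketch of the idea, here is one-hot encoding written in plain Python. The helper name one_hot_encode and the alphabetical column order are assumptions made for illustration; in practice a library routine such as pandas get_dummies or scikit-learn OneHotEncoder would normally do this work.

```python
def one_hot_encode(values):
    """Return (categories, rows) with one binary column per distinct category."""
    # Sort the distinct values so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one "hot" position per row
        rows.append(row)
    return categories, rows


fruits = ["Apple", "Banana", "Cherry", "Banana"]
categories, encoded = one_hot_encode(fruits)

print(categories)
for fruit, row in zip(fruits, encoded):
    print(fruit, row)
```

Every encoded row sums to 1, and no category ends up "larger" than another, which is why this encoding suits unordered categories.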

Encoding Logic Check

If you have a column with 50 different US States, and you One-Hot encode it, what happens to your dataset's shape?
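
The arithmetic behind this question can be sketched in a few lines. The row count does not change, while the single state column is replaced by one binary column per distinct state. The starting shape (1,000 rows, 5 columns) is made up for illustration, shape_after_one_hot is a hypothetical helper, and the sketch assumes all 50 states appear in the data and that no dummy column is dropped.

```python
def shape_after_one_hot(n_rows, n_cols, n_categories):
    """Shape after replacing one categorical column with one binary column per category."""
    # Rows are untouched; the original column is removed and n_categories columns are added.
    return n_rows, n_cols - 1 + n_categories


before = (1000, 5)  # hypothetical dataset: 1,000 rows, 5 columns, one of them 'State'
after = shape_after_one_hot(*before, n_categories=50)

print(before, "->", after)  # (1000, 5) -> (1000, 54)
```

The dataset gets much wider (49 net new columns here) without gaining any rows.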


Encoding Glossary

Nominal Data
Categories without a logical ranking or order (e.g., Dog, Cat, Bird). One-Hot Encoding is the usual choice, because it does not impose an artificial order.
Ordinal Data
Categories with a strict logical ranking (e.g., Bronze, Silver, Gold). Label/Ordinal Encoding is the usual choice, with the integer codes assigned in rank order.
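
For contrast with one-hot encoding, here is a minimal sketch of ordinal encoding in plain Python. The ranking is passed in explicitly because it cannot be inferred from the strings themselves; the helper name ordinal_encode and the 0-based codes are assumptions made for illustration.

```python
def ordinal_encode(values, ordered_categories):
    """Map each value to its position in an explicitly ranked list of categories."""
    rank = {category: i for i, category in enumerate(ordered_categories)}
    return [rank[value] for value in values]


medals = ["Silver", "Gold", "Bronze", "Gold"]
codes = ordinal_encode(medals, ordered_categories=["Bronze", "Silver", "Gold"])

print(codes)  # [1, 2, 0, 2]
```

Here the single numeric column preserves the ranking (Bronze < Silver < Gold), which is exactly the information one-hot encoding would discard.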