Text to Matrix: Feature Encoding
Algorithms can't read 'Blue'. Learn to translate human concepts into machine vectors.
Code Workspace
Red: [1, 0, 0]
Green: [0, 1, 0]
Blue: [0, 0, 1]
Encoder: Algorithms only understand numbers. If you have a column with 'Red', 'Green', and 'Blue', you MUST convert it. But how?
Translation Matrix
Unlock nodes by understanding text-to-number translation.
One-Hot Encoding
Transforms a categorical column into multiple binary columns (1s and 0s), one column per category, so each row has a single 1 marking its category. Use this when the categories have no inherent order (e.g., 'Apple', 'Banana', 'Cherry').
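To make the idea concrete, here is a minimal sketch in plain Python. The function name `one_hot_encode` is hypothetical and chosen for illustration; real projects typically rely on a library encoder instead. It assumes the column is a simple list of strings and sorts the categories so the column order is predictable.

```python
def one_hot_encode(values):
    """Return (categories, rows) with one binary column per distinct category."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # exactly one "hot" entry per row
        rows.append(row)
    return categories, rows


categories, rows = one_hot_encode(["Apple", "Banana", "Cherry", "Apple"])
print(categories)  # ['Apple', 'Banana', 'Cherry']
for row in rows:
    print(row)     # [1, 0, 0], then [0, 1, 0], [0, 0, 1], [1, 0, 0]
```

No column is "bigger" than another here: every category sits at the same distance from every other one, which is why this encoding suits unordered categories.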
Encoding Logic Check
If you have a column with 50 different US States, and you One-Hot encode it, what happens to your dataset's shape?
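One way to reason about the question is to count columns before and after encoding. The sketch below uses a made-up dataset: the row count (200), the two numeric columns, and the `State_00` ... `State_49` labels are all placeholders assumed for illustration, not real data.

```python
# Hypothetical dataset: 200 rows, 3 columns (age, income, state).
# The 50 state labels are placeholders, not real state names.
states = [f"State_{i:02d}" for i in range(50)]
rows = [(30 + i % 40, 40_000 + 100 * i, states[i % 50]) for i in range(200)]

n_rows, n_cols = len(rows), len(rows[0])

# One-hot encoding replaces the single text column with one binary column
# per distinct category; the number of rows does not change.
n_categories = len({row[2] for row in rows})
encoded_shape = (n_rows, n_cols - 1 + n_categories)

print((n_rows, n_cols))  # (200, 3)
print(encoded_shape)     # (200, 52)
```

The dataset gets wider, not longer: the one state column becomes 50 mostly-zero columns while the row count stays the same.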
Encoding Glossary
- Nominal Data
- Categories without a logical ranking or order (e.g., Dog, Cat, Bird). One-Hot Encoding is the usual choice.
- Ordinal Data
- Categories with a strict logical ranking (e.g., Bronze < Silver < Gold). Label/Ordinal Encoding is the usual choice, provided the assigned numbers follow that ranking.
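For the ordinal case, a minimal sketch in plain Python might look like the following. The names `TIER_ORDER` and `ordinal_encode` are hypothetical; the key assumption is that the ranking is written out by hand, because it cannot be inferred from the strings themselves.

```python
# The ranking is supplied explicitly, lowest tier first.
TIER_ORDER = ["Bronze", "Silver", "Gold"]
TIER_TO_RANK = {tier: rank for rank, tier in enumerate(TIER_ORDER)}


def ordinal_encode(values):
    """Map each tier name to its integer rank (Bronze=0, Silver=1, Gold=2)."""
    return [TIER_TO_RANK[value] for value in values]


encoded = ordinal_encode(["Gold", "Bronze", "Silver", "Gold"])
print(encoded)  # [2, 0, 1, 2]
```

Note that sorting the labels alphabetically would give Bronze, Gold, Silver, which puts Gold below Silver. Spelling out the order avoids that mistake.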