One Hot Encoder
Maps a column of category indices to a column of binary vectors.
This operation is ported from Spark ML.
For a comprehensive introduction, see the Spark documentation.
For Scaladoc details, see the org.apache.spark.ml.feature.OneHotEncoder documentation.
Since: Seahorse 1.0.0
Input
Port | Type Qualifier | Description |
0 | DataFrame | The input DataFrame. |
Output
Port | Type Qualifier | Description |
0 | DataFrame | The output DataFrame. |
1 | Transformer | A Transformer that applies the operation to other DataFrames using a Transform. |
Parameters
Name | Type | Description |
drop last | Boolean | Whether to drop the last category in the encoded vector. |
operate on | InputOutputColumnSelector | The input and output columns for the operation. |
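The effect of drop last can be illustrated with a minimal plain-Python sketch. It is not the Spark ML implementation; the function name `one_hot`, the explicit `num_categories` argument, and the `(size, indices, values)` tuple used to mimic a sparse vector are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical stand-in for a sparse vector: (size, indices, values).
SparseVector = Tuple[int, List[int], List[float]]


def one_hot(index: float, num_categories: int, drop_last: bool = True) -> SparseVector:
    """Encode a category index as a sparse binary vector (illustrative only)."""
    i = int(index)
    if not 0 <= i < num_categories:
        raise ValueError(f"category index {index} is outside [0, {num_categories})")
    # With drop_last the vector has one fewer slot than there are categories.
    size = num_categories - 1 if drop_last else num_categories
    if i < size:
        return (size, [i], [1.0])
    # Reached only when drop_last is true and i is the last category:
    # that category is represented by the all-zeros vector.
    return (size, [], [])


labels = [0.0, 0.0, 1.0, 2.0, 0.0, 1.0, 0.0, 0.0, 2.0]
encoded = [one_hot(x, num_categories=3) for x in labels]
```

In this sketch, three categories with drop last enabled give vectors of size 2, and the last category (index 2.0) is encoded with no active entries.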
Example
Parameters
Name | Value |
drop last | true |
operate on | one column |
input column | "labels" |
output | append new column |
output column | "encoded" |
Input
features | labels |
a | 0.0 |
a | 0.0 |
b | 1.0 |
c | 2.0 |
a | 0.0 |
b | 1.0 |
a | 0.0 |
a | 0.0 |
c | 2.0 |
Output
features | labels | encoded |
a | 0.0 | (2,[0],[1.0]) |
a | 0.0 | (2,[0],[1.0]) |
b | 1.0 | (2,[1],[1.0]) |
c | 2.0 | (2,[],[]) |
a | 0.0 | (2,[0],[1.0]) |
b | 1.0 | (2,[1],[1.0]) |
a | 0.0 | (2,[0],[1.0]) |
a | 0.0 | (2,[0],[1.0]) |
c | 2.0 | (2,[],[]) |
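The values in the encoded column use a sparse notation of the form (size, [indices], [values]). As a reading aid, the following sketch expands that notation into a dense list. The helper `to_dense` is a hypothetical name introduced here and is not part of Seahorse or Spark.

```python
from typing import List, Sequence


def to_dense(size: int, indices: Sequence[int], values: Sequence[float]) -> List[float]:
    """Expand a (size, indices, values) triple into a dense list of floats."""
    dense = [0.0] * size
    # Every position not listed in indices stays at 0.0.
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


row_a = to_dense(2, [0], [1.0])  # label 0.0
row_b = to_dense(2, [1], [1.0])  # label 1.0
row_c = to_dense(2, [], [])      # label 2.0, the dropped last category
```

Read this way, the dropped last category is the only one whose dense form contains no 1.0 entry.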