Standard Scaler
Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
This operation is ported from Spark ML.
For a comprehensive introduction, see the Spark documentation.
For Scaladoc details, see the org.apache.spark.ml.feature.StandardScaler documentation.
Since: Seahorse 1.0.0
Input
Port | Type Qualifier | Description |
0 | DataFrame | The input DataFrame. |
Output
Port | Type Qualifier | Description |
0 | DataFrame | The output DataFrame. |
1 | Transformer | A Transformer that applies the same operation to other DataFrames using a Transform. |
Parameters
Name | Type | Description |
input column | SingleColumnSelector | The input column name. |
output | SingleChoice | Output generation mode. Possible values: ["replace input column", "append new column"] |
with mean | Boolean | Centers the data with mean before scaling. |
with std | Boolean | Scales the data to unit standard deviation. |
Example
Parameters
Name | Value |
input column | "features" |
output | append new column |
output column | "scaled" |
with mean | false |
with std | true |
Input
features |
[-2.0,2.3,0.0] |
[0.0,-5.1,1.0] |
[1.7,-0.6,3.3] |
Output
features | scaled |
[-2.0,2.3,0.0] | [-1.0798984943120777,0.6168340914150375,0.0] |
[0.0,-5.1,1.0] | [0.0,-1.3677625505289963,0.5909681092664519] |
[1.7,-0.6,3.3] | [0.9179137201652661,-0.16091324123870546,1.9501947605792913] |
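
The scaled values in this example can be checked by hand. The following is a minimal plain-Python sketch, not the Seahorse or Spark implementation. It assumes each vector component is treated as a separate column and is divided by that column's sample standard deviation (the n - 1 form), which is what the numbers above are consistent with. The function name standard_scale and its keyword arguments are hypothetical, chosen only to mirror the "with mean" and "with std" parameters.

```python
from statistics import mean, stdev


def standard_scale(rows, with_mean=False, with_std=True):
    """Scale every column of `rows`, a list of equal-length lists of floats.

    Hypothetical helper for illustration only; it mimics the "with mean"
    and "with std" parameters described on this page.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    stds = [stdev(col) for col in columns]

    scaled = []
    for row in rows:
        out = []
        for x, m, s in zip(row, means, stds):
            if with_mean:
                x = x - m  # center the column on its mean
            if with_std:
                # Assumption: a zero-variance column is mapped to 0.0
                # instead of dividing by zero.
                x = x / s if s != 0.0 else 0.0
            out.append(x)
        scaled.append(out)
    return scaled


features = [[-2.0, 2.3, 0.0], [0.0, -5.1, 1.0], [1.7, -0.6, 3.3]]
scaled = standard_scale(features, with_mean=False, with_std=True)

for original, result in zip(features, scaled):
    print(original, result)
```

With "with mean" set to false, as in the example, zeros stay zeros because each value is only divided by its column's standard deviation. Calling the sketch with with_mean=True additionally subtracts the column mean first, so each output column then averages to approximately zero.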