StandardScaler¶
StandardScaler transforms a dataset of Vector rows, normalizing each feature to have unit standard deviation and/or zero mean.
Input¶
It takes a DataFrame as input and transforms it into another DataFrame.
Output¶
It adds a new column to the incoming DataFrame. The new column contains the input Vector column with each feature scaled to unit standard deviation and/or centered to zero mean.
Type¶
ml-transformer
Class¶
fire.nodes.ml.NodeStandardScaler
Fields¶
| Name | Title | Description |
|---|---|---|
| inputCol | Input Column | The input column name |
| outputCol | Output Column | The output column name |
| withMean | With Mean | Centers the data by subtracting the mean before scaling. |
| withStd | With Standard Dev | Scales the data to unit standard deviation. |
Details¶
StandardScaler is an Estimator which can be fit on a dataset to produce a StandardScalerModel; this amounts to computing summary statistics. The model can then transform a Vector column in a dataset to have unit standard deviation and/or zero mean features.
If the standard deviation of a feature is zero, the output Vector contains the default value 0.0 for that feature.
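The fit-then-transform behaviour described above can be illustrated with a minimal pure-Python sketch. It is not the Spark implementation and does not use the Spark API: the function names `fit` and `transform` and the parameters `with_mean` and `with_std` are hypothetical stand-ins for the node's `withMean` and `withStd` fields, and the use of the sample standard deviation (dividing by n - 1) is an assumption made for this illustration.

```python
from statistics import mean, stdev


def fit(rows):
    """Compute per-feature summary statistics (hypothetical helper).

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    columns = list(zip(*rows))  # one tuple per feature
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def transform(rows, means, stds, with_mean=False, with_std=True):
    """Scale every row using the statistics computed by fit()."""
    scaled = []
    for row in rows:
        out = []
        for x, m, s in zip(row, means, stds):
            if with_mean:
                x = x - m  # center to zero mean
            if with_std:
                # A zero standard deviation yields the default 0.0
                x = x / s if s != 0.0 else 0.0
            out.append(x)
        scaled.append(out)
    return scaled


# Three rows, three features; the last feature is constant (std = 0).
rows = [[1.0, 10.0, 5.0], [2.0, 20.0, 5.0], [3.0, 30.0, 5.0]]
means, stds = fit(rows)
scaled = transform(rows, means, stds, with_mean=True, with_std=True)
print(scaled)
```

With both options enabled, the first two features end up with zero mean and unit standard deviation, while the constant third feature maps to 0.0 in every row.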
More on the Spark MLlib/ML docs page: http://spark.apache.org/docs/latest/ml-features.html#standardscaler