StandardScaler

StandardScaler transforms a dataset of Vector rows, normalizing each feature to have unit standard deviation and/or zero mean.

Input

It takes in a DataFrame as input and transforms it to another DataFrame

Output

It adds a new column containing the transform of the input Vector column to unit standard deviation and/or zero mean features to the incoming DataFrame.

Type

ml-transformer

Class

fire.nodes.ml.NodeStandardScaler

Fields

Name Title Description
inputCol Input Column The input column name
outputCol Output Column The output column name
withMean With Mean Centers the data with mean before scaling.
withStd With Standard Dev Scales the data to unit standard deviation

Details

StandardScaler transforms a dataset of Vector rows, normalizing each feature to have unit standard deviation and/or zero mean.

StandardScaler is an Estimator which can be fit on a dataset to produce a StandardScalerModel; this amounts to computing summary statistics. The model can then transform a Vector column in a dataset to have unit standard deviation and/or zero mean features.

If the standard deviation of a feature is zero, it will return default 0.0 value in the Vector for that feature.

More at Spark MLlib/ML docs page : http://spark.apache.org/docs/latest/ml-features.html#standardscaler