IDF¶
Compute the Inverse Document Frequency (IDF) given a collection of documents.
Input¶
It takes in a DataFrame containing a column of feature vectors and transforms it into another DataFrame.
Output¶
The output DataFrame contains a new column of type vector that holds the rescaled features. The node takes feature vectors (generally created from HashingTF) as input and scales each column by its IDF weight. Intuitively, it down-weights columns that appear frequently in a corpus.
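As a sketch of the scaling, and assuming this node follows the formula used by Spark MLlib's IDF estimator (the page itself does not state the formula), the weight of a term $t$ would be:

```latex
\mathrm{idf}(t) = \log\frac{m + 1}{\mathrm{df}(t) + 1}
```

Here $m$ is the total number of documents and $\mathrm{df}(t)$ is the number of documents that contain $t$. Each entry of the output vector is the input term frequency multiplied by this weight, so under this assumption a term that appears in every document gets weight $\log 1 = 0$.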
Type¶
ml-transformer
Class¶
fire.nodes.ml.NodeIDF
Fields¶
| Name | Title | Description |
|---|---|---|
| inputCol | Input Column | Input column name |
| outputCol | Output Column | Output column name |
| minDocFreq | MinDocFreq | The minimum number of documents in which a term should appear. |
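
The effect of `minDocFreq` can be illustrated with a minimal pure-Python sketch. It is not the node's implementation: the helper names `idf_weights` and `transform` are hypothetical, and the smoothed formula `log((m + 1) / (df + 1))` and the rule that terms below the threshold get weight 0 are assumed from Spark MLlib's IDF behavior.

```python
import math


def idf_weights(docs, min_doc_freq=0):
    """Return one IDF weight per column.

    docs is a list of equal-length term-frequency vectors (lists of numbers).
    Assumed formula: log((m + 1) / (df + 1)), where m is the number of
    documents and df is the number of documents with a non-zero entry.
    Columns with df below min_doc_freq are assumed to get weight 0.
    """
    m = len(docs)
    n_cols = len(docs[0])
    doc_freq = [sum(1 for d in docs if d[j] > 0) for j in range(n_cols)]
    return [
        math.log((m + 1) / (df + 1)) if df >= min_doc_freq else 0.0
        for df in doc_freq
    ]


def transform(docs, weights):
    """Scale every column of every document by its IDF weight."""
    return [[tf * w for tf, w in zip(d, weights)] for d in docs]


# Three documents, three hashed term columns.
# Column 0 appears in all 3 documents, column 1 in 2, column 2 in 1.
docs = [
    [1.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 3.0],
]

weights = idf_weights(docs, min_doc_freq=2)
scaled = transform(docs, weights)
```

With `min_doc_freq=2`, the third column is zeroed because it appears in only one document, while the first column is zeroed by the formula itself because it appears in every document. Only the second column keeps a non-zero weight.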