H2OWord2Vec
H2O Word2Vec
Input
It takes a DataFrame as input.
Type
transform
Class
fire.nodes.h2o.NodeH2OWord2vec
Fields
| Name | Title | Description |
|---|---|---|
| min_word_freq | Min Word Freq | Specify an integer for the minimum word frequency. Word2vec discards words that appear fewer than this number of times. |
| vec_size | Vec Size | Specify the size of word vectors. |
| window_size | Window Size | This specifies the size of the context window around a given word. |
| epochs | Epochs | Specify the number of training iterations to run. |
| init_learning_rate | Init Learning Rate | Set the starting learning rate. |
| sent_sample_rate | Sent Sample Rate | Set the threshold for the occurrence of words. Words that appear with higher frequency in the training data are randomly down-sampled. An ideal range for this option is 0 to 1e-5. |
| aggregateMethod | AggregateMethod | Specifies how to aggregate sequences of words. |
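As an illustration of what aggregating a sequence of words can mean, here is a minimal sketch that averages the vectors of the words in a sentence. It assumes an averaging method (H2O offers an AVERAGE option for this purpose); the values this node accepts for aggregateMethod are not listed here. The helper name `average_vectors`, the toy vectors, and the choice to skip unknown words are assumptions made for the example.

```python
from typing import Dict, List, Optional


def average_vectors(
    tokens: List[str], vectors: Dict[str, List[float]]
) -> Optional[List[float]]:
    """Average the vectors of the tokens that have a vector.

    Tokens without a vector are skipped (an assumption of this sketch).
    Returns None when no token in the sequence has a vector.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    size = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(size)]


# Toy word vectors with vec_size = 2 (hypothetical values).
toy_vectors = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}

# One vector per sentence instead of one vector per word.
sentence_vector = average_vectors(["good", "movie", "unseen"], toy_vectors)
```

With aggregation, each sequence of words yields a single vector of length vec_size, which is convenient as a feature column for a downstream model.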
Details
The Word2vec algorithm takes a text corpus as input and produces word vectors as output. It first builds a vocabulary from the training text data and then learns a vector representation for each word.
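The vocabulary step and the role of the context window can be sketched in plain Python. This is a simplified illustration, not H2O's implementation: it uses a fixed window on each side of the center word, ignores down-sampling, and stops before the actual vector training. The function names `build_vocabulary` and `context_pairs` and the toy corpus are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple


def build_vocabulary(sentences: List[List[str]], min_word_freq: int) -> Set[str]:
    """Keep only words that occur at least min_word_freq times in the corpus."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return {word for word, n in counts.items() if n >= min_word_freq}


def context_pairs(
    sentence: List[str], vocabulary: Set[str], window_size: int
) -> List[Tuple[str, str]]:
    """List (center, context) pairs within window_size words on each side."""
    # Words outside the vocabulary are dropped before windows are formed.
    kept = [w for w in sentence if w in vocabulary]
    pairs = []
    for i, center in enumerate(kept):
        lo = max(0, i - window_size)
        hi = min(len(kept), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, kept[j]))
    return pairs


corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["a", "dog", "sat"]]

# "ran", "a" and "dog" occur only once, so they are discarded.
vocab = build_vocabulary(corpus, min_word_freq=2)

# Training pairs for the first sentence with one word of context per side.
pairs = context_pairs(corpus[0], vocab, window_size=1)
```

Pairs like these are what the training step consumes: over the configured number of epochs, the vectors are adjusted so that words sharing contexts end up close together.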
More details are available at: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/word2vec.html