H2OWord2Vec

H2O Word2Vec

Input

Takes a DataFrame as input.

Type

transform

Class

fire.nodes.h2o.NodeH2OWord2vec

Fields

Name	Title	Description
min_word_freq	Min Word Freq	Specify an integer for the minimum word frequency. Word2vec discards words that appear fewer than this number of times.
vec_size	Vec Size	Specify the size of word vectors.
window_size	Window Size	This specifies the size of the context window around a given word.
epochs	Epochs	Specify the number of training iterations to run.
init_learning_rate	Init Learning Rate	Set the starting learning rate.
sent_sample_rate	Sent Sample Rate	Set the threshold for the occurrence of words. Words that appear with a higher frequency than this in the training data are randomly down-sampled. A useful range for this option is 0 to 1e-5.
aggregateMethod	AggregateMethod	Specifies how to aggregate sequences of words.
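
To make the aggregateMethod field more concrete, here is a minimal sketch of what an averaging aggregation could look like: each sequence of words is reduced to one vector by taking the element-wise mean of its word vectors. The function name `average_vectors`, the two-dimensional vectors, and the choice to skip words without a vector are all assumptions made for illustration; this is not the node's or H2O's actual implementation.

```python
def average_vectors(sequences, word_vectors):
    """Return one vector per sequence: the element-wise mean of its known words.

    Hypothetical helper for illustration only. Words with no vector are
    skipped, which is an assumption rather than documented behavior.
    """
    result = []
    for words in sequences:
        known = [word_vectors[w] for w in words if w in word_vectors]
        if not known:
            result.append(None)  # no known words, so nothing to average
            continue
        size = len(known[0])
        result.append([sum(v[k] for v in known) / len(known) for k in range(size)])
    return result


# Made-up 2-dimensional word vectors (vec_size = 2) for the example.
word_vectors = {"cat": [1.0, 0.0], "mat": [0.0, 1.0]}
sequences = [["cat", "mat"], ["cat", "unknown"], ["unknown"]]
averaged = average_vectors(sequences, word_vectors)
```

With averaging, every sequence yields a single fixed-length vector regardless of how many words it contains, which is what makes the output usable as features for a downstream model.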

Details

The Word2vec algorithm takes a text corpus as input and produces word vectors as output. It first builds a vocabulary from the training text and then learns a vector representation for each word in that vocabulary.
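
The first stage described above, together with the roles of min_word_freq and window_size, can be sketched in plain Python. This is a simplified illustration under stated assumptions, not H2O's code: the helper names `build_vocabulary` and `context_pairs` are hypothetical, the window is fixed rather than randomly shrunk, and rare words are dropped before the window is applied.

```python
from collections import Counter


def build_vocabulary(tokens, min_word_freq):
    """Count words and keep those seen at least min_word_freq times."""
    counts = Counter(tokens)
    return {word: n for word, n in counts.items() if n >= min_word_freq}


def context_pairs(tokens, vocabulary, window_size):
    """Return (center, context) pairs within window_size positions.

    Assumption: out-of-vocabulary words are removed first, so the
    window spans only the words that survived the frequency filter.
    """
    kept = [t for t in tokens if t in vocabulary]
    pairs = []
    for i, center in enumerate(kept):
        lo = max(0, i - window_size)
        hi = min(len(kept), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, kept[j]))
    return pairs


tokens = "the cat sat on the mat the cat ran".split()
vocabulary = build_vocabulary(tokens, min_word_freq=2)
pairs = context_pairs(tokens, vocabulary, window_size=1)
```

The resulting pairs are the kind of training examples from which the second stage learns the vectors; a larger window_size produces more pairs per word and captures broader context.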

More details are available at: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/word2vec.html#