RegexTokenizer¶

This node creates a new DataFrame by splitting text (such as a sentence) into individual terms (usually words) based on a regular expression.

Type¶

transform

Class¶

fire.nodes.etl.NodeRegexTokenizer

Fields¶

Name	Title	Description
inputCol	Column	Input column containing the text to tokenize
outputCol	Tokenized Column	New output column after tokenization
pattern	Pattern	The regex pattern used to match delimiters
gaps	Gaps	Indicates whether the regex splits on gaps (true) or matches tokens (false)
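
The interaction between pattern and gaps is easiest to see on a small input. The following is a minimal sketch in plain Python, not the node's actual implementation: regex_tokenize is a hypothetical helper written for this illustration, the sample sentence is made up, and Python's re module stands in for the regex engine the node uses, so pattern syntax may differ in edge cases. It assumes that empty tokens are dropped.

```python
import re


def regex_tokenize(text, pattern, gaps=True):
    """Hypothetical helper illustrating the pattern/gaps fields.

    gaps=True  -> pattern describes the delimiters; the text is split on them.
    gaps=False -> pattern describes the tokens; every match becomes a token.
    Patterns here avoid capturing groups, which would change what
    re.split and re.findall return.
    """
    if gaps:
        tokens = re.split(pattern, text)
    else:
        tokens = re.findall(pattern, text)
    # Assumption: empty strings (e.g. from a leading delimiter) are discarded.
    return [t for t in tokens if t]


sentence = "Logistic regression, models are neat"

# gaps=True: split wherever one or more non-word characters occur.
by_gaps = regex_tokenize(sentence, r"\W+", gaps=True)

# gaps=False: keep each run of word characters as a token.
by_match = regex_tokenize(sentence, r"\w+", gaps=False)

print(by_gaps)
print(by_match)
```

With these two patterns both settings produce the same list of words. The choice matters when it is easier to describe the tokens than the separators, for example when extracting only numbers or hashtags from free text.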