TextPreprocessor

class TextPreprocessor.TextPreprocessor(inputCol=None, map=None, normFunc=None, outputCol=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:	inputCol (str) – The name of the input column
	map (dict) – Map of substring match to replacement
	normFunc (str) – Name of normalization function to apply
	outputCol (str) – The name of the output column
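
To make the interplay of map and normFunc concrete, here is a minimal pure-Python sketch of substring replacement on normalized text. It is not taken from the library's source: the function name `preprocess`, the longest-match-first rule, and the choice to copy unmatched characters through unchanged are all assumptions made for illustration.

```python
def preprocess(text, mapping, norm=str.lower):
    """Replace substrings of `text` found in `mapping` (illustrative only).

    Keys and candidate substrings are compared after applying `norm`,
    so with the default `str.lower` the match is case-insensitive.
    """
    # Normalize the keys once so they can be compared with normalized text.
    normalized = {norm(key): value for key, value in mapping.items()}
    # Try longer keys first so "new york" wins over "new" (assumed rule).
    keys = sorted((k for k in normalized if k), key=len, reverse=True)

    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if norm(text[i:i + len(key)]) == key:
                out.append(normalized[key])
                i += len(key)
                break
        else:
            # No key matches here: keep the original character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


result = preprocess("I love New York", {"new york": "NYC", "new": "old"})
print(result)
```

With an identity function passed as `norm`, the same call would only match substrings whose case agrees exactly with the keys.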

getInputCol()

Returns:	The name of the input column
Return type:	str

getMap()

Returns:	Map of substring match to replacement
Return type:	dict

getNormFunc()

Returns:	Name of normalization function to apply
Return type:	str

getOutputCol()

Returns:	The name of the output column
Return type:	str

classmethod read()

Returns an MLReader instance for this class.
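
Because the class derives from JavaMLReadable and JavaMLWritable, the usual PySpark persistence calls should apply. The sketch below assumes the standard `write().overwrite().save(path)` and `read().load(path)` chain; the helper name `save_and_reload` is hypothetical, and a real call needs an active SparkSession and a path Spark can write to.

```python
def save_and_reload(stage, path):
    """Persist a configured stage and read it back from `path`.

    `stage` is expected to behave like a PySpark MLWritable/MLReadable
    object, for example a TextPreprocessor instance.
    """
    # write() returns an MLWriter; overwrite() permits replacing an existing path.
    stage.write().overwrite().save(path)
    # read() is the classmethod described above; load() rebuilds the stage.
    return type(stage).read().load(path)
```

Calling the helper with a TextPreprocessor and a directory path would be expected to return a new instance carrying the same parameter values.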

setInputCol(value)

Parameters:	value (str) – The name of the input column

setMap(value)

Parameters:	value (dict) – Map of substring match to replacement

setNormFunc(value)

Parameters:	value (str) – Name of normalization function to apply

setOutputCol(value)

Parameters:	value (str) – The name of the output column

setParams(inputCol=None, map=None, normFunc=None, outputCol=None)

Sets the parameters. All arguments must be passed by keyword.

Parameters:	inputCol (str) – The name of the input column
	map (dict) – Map of substring match to replacement
	normFunc (str) – Name of normalization function to apply
	outputCol (str) – The name of the output column
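
The two configuration styles documented here, one `setParams` call or individual setters, can be sketched as follows. The helper names, the column names `text` and `cleaned`, the sample map, and the normFunc value `"lowerCase"` are illustrative assumptions; check which normalization function names your installed version accepts.

```python
# Illustrative settings; every value here is an assumption, not a default.
SETTINGS = {
    "inputCol": "text",
    "outputCol": "cleaned",
    "map": {"colour": "color"},
    "normFunc": "lowerCase",  # assumed name of a normalization function
}


def configure_with_set_params(preprocessor):
    """Apply all settings in one keyword-only setParams call."""
    preprocessor.setParams(**SETTINGS)
    return preprocessor


def configure_with_setters(preprocessor):
    """Apply the same settings one setter at a time."""
    preprocessor.setInputCol(SETTINGS["inputCol"])
    preprocessor.setOutputCol(SETTINGS["outputCol"])
    preprocessor.setMap(SETTINGS["map"])
    preprocessor.setNormFunc(SETTINGS["normFunc"])
    return preprocessor
```

Either helper would be called with a freshly constructed TextPreprocessor, which requires mmlspark and an active SparkSession. The resulting stage is a transformer, so it would then be applied to a DataFrame with `transform`.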