TextPreprocessor

class TextPreprocessor.TextPreprocessor(inputCol=None, map=None, normFunc=None, outputCol=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:
  • inputCol (str) – The name of the input column
  • map (dict) – Map of substring match to replacement
  • normFunc (str) – Name of normalization function to apply
  • outputCol (str) – The name of the output column
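
The parameters above describe a transformer that normalizes text and replaces matched substrings. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function name preprocess, the NORM_FUNCS table and its two entries, the longest-match-first rule, and the single left-to-right pass are all assumptions made for illustration. The values that normFunc actually accepts, and how the library resolves overlapping matches, should be checked against the library itself.

```python
from typing import Callable, Dict

# Hypothetical normalization table. The names here are assumptions for this
# sketch, not a confirmed list of the values the library accepts for normFunc.
NORM_FUNCS: Dict[str, Callable[[str], str]] = {
    "identity": lambda s: s,
    "lowerCase": str.lower,
}


def preprocess(text: str, mapping: Dict[str, str], norm_func: str = "identity") -> str:
    """Normalize `text`, then replace substrings found in `mapping`.

    Assumptions of this sketch: matching runs on the normalized text, the
    longest key wins at each position, and replaced text is never rescanned.
    """
    normalized = NORM_FUNCS[norm_func](text)
    # Try longer keys first so that "new york" is preferred over "new".
    keys = sorted((k for k in mapping if k), key=len, reverse=True)

    pieces = []
    i = 0
    while i < len(normalized):
        for key in keys:
            if normalized.startswith(key, i):
                pieces.append(mapping[key])
                i += len(key)
                break
        else:
            # No key matches at this position: keep the character as is.
            pieces.append(normalized[i])
            i += 1
    return "".join(pieces)


if __name__ == "__main__":
    swaps = {"happy": "sad", "sad": "happy"}
    print(preprocess("Happy sad", swaps, "lowerCase"))  # sad happy
```

Because the sketch makes a single pass and does not rescan replacements, swapping two words in one map works without one replacement undoing the other.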
getInputCol()
Returns: The name of the input column
Return type: str
static getJavaPackage()

Returns the package name as a string.

getMap()
Returns: Map of substring match to replacement
Return type: dict
getNormFunc()
Returns: Name of normalization function to apply
Return type: str
getOutputCol()
Returns: The name of the output column
Return type: str
classmethod read()

Returns an MLReader instance for this class.

setInputCol(value)
Parameters: value (str) – The name of the input column
setMap(value)
Parameters: value (dict) – Map of substring match to replacement
setNormFunc(value)
Parameters: value (str) – Name of normalization function to apply
setOutputCol(value)
Parameters: value (str) – The name of the output column
setParams(inputCol=None, map=None, normFunc=None, outputCol=None)

Sets the parameters. All arguments must be passed by keyword.

Parameters:
  • inputCol (str) – The name of the input column
  • map (dict) – Map of substring match to replacement
  • normFunc (str) – Name of normalization function to apply
  • outputCol (str) – The name of the output column
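
The keyword-only behavior of setParams can be illustrated with a small stand-in. This is a sketch only: the ParamHolder class and the local keyword_only decorator are hypothetical and written for this example, and they do not reproduce the library's wrapper around the underlying Java object.

```python
import functools


def keyword_only(func):
    """Hypothetical decorator that rejects positional arguments."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        return func(self, **kwargs)
    return wrapper


class ParamHolder:
    """Hypothetical stand-in that only stores the four parameters."""

    def __init__(self):
        self.params = {}

    @keyword_only
    def setParams(self, inputCol=None, map=None, normFunc=None, outputCol=None):
        supplied = {
            "inputCol": inputCol,
            "map": map,
            "normFunc": normFunc,
            "outputCol": outputCol,
        }
        # Only record parameters the caller actually supplied.
        self.params.update({k: v for k, v in supplied.items() if v is not None})
        return self


if __name__ == "__main__":
    holder = ParamHolder().setParams(inputCol="text", outputCol="clean")
    print(holder.params)
    try:
        holder.setParams("text")  # positional argument is rejected
    except TypeError as err:
        print(err)
```

Returning self from setParams lets calls be chained, which is the usual convention for setters in this style of API.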