UnicodeNormalize

class UnicodeNormalize.UnicodeNormalize(form=None, inputCol=None, lower=None, outputCol=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:
  • form (str) – Unicode normalization form; one of NFC, NFD, NFKC, or NFKD
  • inputCol (str) – The name of the input column
  • lower (bool) – Whether to lowercase the text
  • outputCol (str) – The name of the output column
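
To illustrate what the four values of form mean, here is a minimal sketch using Python's standard-library unicodedata module. It is not the transformer's implementation and does not need Spark; the sample string is an assumption chosen for illustration. The canonical forms (NFC, NFD) only compose or decompose accented characters, while the compatibility forms (NFKC, NFKD) also replace characters such as ligatures with their plain equivalents.

```python
import unicodedata

# Sample input (illustrative): precomposed "é" (U+00E9) followed by the
# ligature "ﬁ" (U+FB01).
text = "\u00e9\ufb01"

# Apply each normalization form to the same input.
forms = {
    form: unicodedata.normalize(form, text)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

# Print the code points so the differences between forms are visible.
for form, result in forms.items():
    print(form, [f"U+{ord(ch):04X}" for ch in result])
```

NFC leaves this input unchanged, NFD splits "é" into "e" plus a combining acute accent, and the two compatibility forms additionally expand the ligature into "f" and "i".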
getForm()
Returns: Unicode normalization form; one of NFC, NFD, NFKC, or NFKD
Return type: str
getInputCol()
Returns: The name of the input column
Return type: str
static getJavaPackage()

Returns the Java package name as a string.

getLower()
Returns: Whether the text is lowercased
Return type: bool
getOutputCol()
Returns: The name of the output column
Return type: str
classmethod read()

Returns an MLReader instance for this class.

setForm(value)
Parameters: value (str) – Unicode normalization form; one of NFC, NFD, NFKC, or NFKD
setInputCol(value)
Parameters: value (str) – The name of the input column
setLower(value)
Parameters: value (bool) – Whether to lowercase the text
setOutputCol(value)
Parameters: value (str) – The name of the output column
setParams(form=None, inputCol=None, lower=None, outputCol=None)

Sets the parameters. All arguments are keyword-only.

Parameters:
  • form (str) – Unicode normalization form; one of NFC, NFD, NFKC, or NFKD
  • inputCol (str) – The name of the input column
  • lower (bool) – Whether to lowercase the text
  • outputCol (str) – The name of the output column
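
As a rough picture of how form and lower combine on a single string, the following sketch applies both in plain Python. The helper normalize_text is hypothetical and not part of mmlspark. Lowercasing before normalizing and passing None through unchanged are both assumptions made for illustration, and the transformer itself may behave differently.

```python
import unicodedata

VALID_FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_text(text, form, lower):
    """Hypothetical helper: optionally lowercase, then normalize one string."""
    if text is None:
        # Assumption: missing values pass through unchanged.
        return None
    if form not in VALID_FORMS:
        raise ValueError(f"form must be one of {VALID_FORMS}, got {form!r}")
    if lower:
        # Assumption: lowercasing is applied before normalization.
        text = text.lower()
    return unicodedata.normalize(form, text)


# "CAFÉ" with a precomposed "É" (U+00C9), lowercased and decomposed (NFD).
decomposed = normalize_text("CAF\u00c9", form="NFD", lower=True)

# Ligature "ﬁ" (U+FB01) expanded by NFKC; case is left untouched.
expanded = normalize_text("\ufb01LE", form="NFKC", lower=False)
```

In this sketch, decomposed ends in "e" followed by a combining acute accent (U+0301), and expanded is the plain ASCII string "fiLE".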