UnicodeNormalize

class UnicodeNormalize.UnicodeNormalize(form=None, inputCol=None, lower=None, outputCol=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:
    form (str) – Unicode normalization form: NFC, NFD, NFKC, or NFKD
    inputCol (str) – The name of the input column
    lower (bool) – Whether to lowercase the text
    outputCol (str) – The name of the output column
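The four values accepted by form are the standard Unicode normalization forms. As a rough illustration of what each one does to a string, the sketch below uses Python's standard-library unicodedata module. It is not the transformer's own implementation, and the variable names are chosen here purely for illustration.

```python
import unicodedata

# "é" written as a single precomposed code point (U+00E9).
composed = "\u00e9"
# "é" written as "e" followed by a combining acute accent (U+0301).
decomposed = "e\u0301"

# Canonical forms: NFC composes characters, NFD decomposes them.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# Compatibility forms (NFKC, NFKD) additionally replace compatibility
# characters, such as the "fi" ligature (U+FB01), with plain equivalents.
ligature = "\ufb01"
nfkc_ligature = unicodedata.normalize("NFKC", ligature)
# Canonical forms leave the ligature untouched.
nfc_ligature = unicodedata.normalize("NFC", ligature)
```

In short, NFC and NFD only change how accents and similar marks are encoded, while NFKC and NFKD also fold visually similar compatibility characters together, which can matter when text is later tokenized or compared.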

getForm()

Returns:	Unicode normalization form: NFC, NFD, NFKC, NFKD
Return type:	str

getInputCol()

Returns:	The name of the input column
Return type:	str

getLower()

Returns:	Whether the text is lowercased
Return type:	bool

getOutputCol()

Returns:	The name of the output column
Return type:	str

classmethod read()

Returns an MLReader instance for this class.

setForm(value)

Parameters:	value (str) – Unicode normalization form: NFC, NFD, NFKC, or NFKD

setInputCol(value)

Parameters:	value (str) – The name of the input column

setLower(value)

Parameters:	value (bool) – Whether to lowercase the text

setOutputCol(value)

Parameters:	value (str) – The name of the output column

setParams(form=None, inputCol=None, lower=None, outputCol=None)

Sets the parameters. All arguments must be passed by keyword.

Parameters:
    form (str) – Unicode normalization form: NFC, NFD, NFKC, or NFKD
    inputCol (str) – The name of the input column
    lower (bool) – Whether to lowercase the text
    outputCol (str) – The name of the output column
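To make the combined effect of form and lower concrete, here is a minimal single-string sketch in plain Python. The helper normalize_text is hypothetical and is not part of this class. The order of operations (normalize first, then lowercase) is an assumption made for the example, and the real transformer operates on a DataFrame column rather than on one string.

```python
import unicodedata


def normalize_text(text, form="NFKC", lower=True):
    """Hypothetical stand-in for the per-value transformation."""
    # Assumption: normalization is applied before lowercasing.
    result = unicodedata.normalize(form, text)
    if lower:
        result = result.lower()
    return result


# "Hello" spelled with fullwidth Latin letters (U+FF28, U+FF45, ...).
fullwidth = "\uff28\uff45\uff4c\uff4c\uff4f"

# NFKC folds the fullwidth letters to ASCII; lower=True then lowercases.
lowered = normalize_text(fullwidth, form="NFKC", lower=True)
# With lower=False the original casing is kept.
preserved = normalize_text(fullwidth, form="NFKC", lower=False)
```

Here form and lower play the roles of the parameters of the same name documented above, while inputCol and outputCol have no counterpart because the sketch works on a single value.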