FastVectorAssembler

class FastVectorAssembler.FastVectorAssembler(inputCols=None, outputCol=None)[source]

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

A fast vector assembler. The input columns must be ordered so that categorical columns come first; otherwise, Spark learners will assign categorical attributes to the wrong indices. The assembler does not keep spurious numeric data, which can significantly slow down computation when there are millions of columns.
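
The ordering requirement above can be met by reordering the column names before they are passed as inputCols. The following is a minimal sketch in plain Python, not part of this API: the helper name order_categorical_first and the example column names are hypothetical, and it assumes you already know which columns are categorical.

```python
def order_categorical_first(columns, categorical):
    """Return the column names with categorical ones first.

    Hypothetical helper (not part of mmlspark). The relative order
    within each group is preserved.
    """
    categorical = set(categorical)
    cat_cols = [c for c in columns if c in categorical]
    other_cols = [c for c in columns if c not in categorical]
    return cat_cols + other_cols


# Example column names are illustrative assumptions.
columns = ["age", "country", "income", "device"]
input_cols = order_categorical_first(columns, categorical=["country", "device"])
print(input_cols)  # ['country', 'device', 'age', 'income']
```

The resulting list could then be supplied through the documented constructor, for example FastVectorAssembler(inputCols=input_cols, outputCol="features"), where "features" is an illustrative output column name.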

To use this FastVectorAssembler, you must import the org.apache.spark.ml.feature package.

Parameters:
  • inputCols (list) – input column names
  • outputCol (str) – output column name (default: [self.uid]__output)
getInputCols()[source]
Returns: input column names
Return type: list
static getJavaPackage()[source]

Returns the package name as a string.

getOutputCol()[source]
Returns: output column name (default: [self.uid]__output)
Return type: str
classmethod read()[source]

Returns an MLReader instance for this class.

setInputCols(value)[source]
Parameters: inputCols (list) – input column names
setOutputCol(value)[source]
Parameters: outputCol (str) – output column name (default: [self.uid]__output)
setParams(inputCols=None, outputCol=None)[source]

Sets the (keyword-only) parameters.

Parameters:
  • inputCols (list) – input column names
  • outputCol (str) – output column name (default: [self.uid]__output)