FastVectorAssembler

class FastVectorAssembler.FastVectorAssembler(inputCols=None, outputCol=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

A fast vector assembler. The input columns must be ordered so that categorical columns come first. Otherwise, Spark learners will assign categorical attributes to the wrong indices. The assembler does not keep spurious numeric data, which can significantly slow down computation when there are millions of columns.

To use FastVectorAssembler, you must import the org.apache.spark.ml.feature package.

Parameters:
    inputCols (list) – input column names
    outputCol (str) – output column name (default: [self.uid]__output)
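The ordering requirement described above (categorical columns first) can be prepared before the assembler is constructed. The following is a minimal sketch in plain Python. The helper `order_categorical_first` and the column names are hypothetical and not part of this library; the commented-out assembler call uses only the constructor signature documented here, and its import path is an assumption that depends on your installation.

```python
def order_categorical_first(columns, categorical):
    """Return `columns` with the categorical ones moved to the front.

    Relative order within each group is preserved.
    """
    categorical = set(categorical)
    cat_cols = [c for c in columns if c in categorical]
    other_cols = [c for c in columns if c not in categorical]
    return cat_cols + other_cols


# Hypothetical column names, for illustration only.
all_cols = ["age", "country", "income", "device"]
cat_cols = ["country", "device"]

input_cols = order_categorical_first(all_cols, cat_cols)
print(input_cols)  # ['country', 'device', 'age', 'income']

# With a Spark session and the library available, the ordered list would then
# be passed to the assembler (assumed usage, not executed here):
#   assembler = FastVectorAssembler(inputCols=input_cols, outputCol="features")
#   assembled = assembler.transform(df)
```

Building the list once and reusing it for `inputCols` keeps the categorical-first order explicit rather than relying on the column order of the DataFrame.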

getInputCols()

Returns:	input column names
Return type:	list

static getJavaPackage(): Returns the Java package name as a string.

getOutputCol()

Returns:	output column name (default: [self.uid]__output)
Return type:	str

classmethod read(): Returns an MLReader instance for this class.

setInputCols(value)

Parameters:	value (list) – input column names

setOutputCol(value)

Parameters:	value (str) – output column name (default: [self.uid]__output)

setParams(inputCols=None, outputCol=None)

Sets the parameters. Arguments must be passed by keyword.

Parameters:
    inputCols (list) – input column names
    outputCol (str) – output column name (default: [self.uid]__output)
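To illustrate what "keyword only" means for a call such as setParams, here is a small sketch using a stand-in class. `AssemblerParamsSketch` is hypothetical and is not the library's implementation; it only mirrors the documented signature using Python's own keyword-only syntax. Returning `self` for chaining is an assumption made for the sketch.

```python
class AssemblerParamsSketch:
    """Hypothetical stand-in that mimics a keyword-only setParams."""

    def __init__(self):
        self.inputCols = None
        self.outputCol = None

    def setParams(self, *, inputCols=None, outputCol=None):
        # The bare `*` makes every following argument keyword-only.
        if inputCols is not None:
            self.inputCols = list(inputCols)
        if outputCol is not None:
            self.outputCol = outputCol
        return self  # assumption: allow call chaining


# Keyword arguments are accepted.
sketch = AssemblerParamsSketch().setParams(
    inputCols=["country", "age"], outputCol="features"
)

# Positional arguments are rejected with a TypeError.
try:
    AssemblerParamsSketch().setParams(["country", "age"], "features")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Passing parameters by name avoids silently swapping `inputCols` and `outputCol` when a call is edited later.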