Featurize
class Featurize.Featurize(allowImages=False, featureColumns=None, numberOfFeatures=262144, oneHotEncodeCategoricals=True)
Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator
Featurizes a dataset. Converts the specified columns to feature columns.
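Parameters:
- allowImages (bool) – Allow featurization of images (default: false)
- featureColumns – The columns to featurize (default: None)
- numberOfFeatures (int) – Number of features to hash string columns to (default: 262144)
- oneHotEncodeCategoricals (bool) – One-hot encode categoricals (default: true)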
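To make the numberOfFeatures parameter concrete, here is a minimal pure-Python sketch of feature hashing: each string is mapped to one of a fixed number of buckets. It is an illustration only, not the library's implementation. The helper name hash_tokens and the choice of crc32 as the hash are assumptions made for this example. The default width, 262144, equals 2**18.

```python
import zlib


def hash_tokens(tokens, number_of_features=262144):
    """Map string tokens to a sparse {bucket_index: count} vector."""
    vector = {}
    for token in tokens:
        # crc32 is a stand-in chosen because it is deterministic; the hash
        # function the library actually uses is not stated in this reference.
        index = zlib.crc32(token.encode("utf-8")) % number_of_features
        vector[index] = vector.get(index, 0) + 1
    return vector


# Three tokens, two of them identical, hashed into the default width.
sparse = hash_tokens(["red", "green", "red"])
```

A smaller number_of_features saves memory but makes it more likely that distinct strings collide in the same bucket.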
- getNumberOfFeatures()
  Returns: Number of features to hash string columns to (default: 262144)
  Return type: int
- getOneHotEncodeCategoricals()
  Returns: One-hot encode categoricals (default: true)
  Return type: bool
- setAllowImages(value)
  Parameters: allowImages (bool) – Allow featurization of images (default: false)
- setNumberOfFeatures(value)
  Parameters: numberOfFeatures (int) – Number of features to hash string columns to (default: 262144)
- setOneHotEncodeCategoricals(value)
  Parameters: oneHotEncodeCategoricals (bool) – One-hot encode categoricals (default: true)
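As a rough illustration of what one-hot encoding a categorical column means, the sketch below turns each distinct value into its own 0/1 indicator position. The function one_hot_encode is a hypothetical helper written for this example. The exact encoding the library produces (column order, whether a level is dropped) is not described in this reference and may differ.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row is a 0/1 indicator list."""
    # Sorting fixes a stable category order for the example.
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # exactly one slot is set per value
        rows.append(row)
    return categories, rows


categories, rows = one_hot_encode(["cat", "dog", "cat", "bird"])
# categories == ["bird", "cat", "dog"]
# rows[0] == [0, 1, 0]
```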
class Featurize.PipelineModel(java_model=None)
Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.wrapper.JavaModel, pyspark.ml.util.JavaMLWritable, pyspark.ml.util.JavaMLReadable
Model fitted by Featurize.
This class is left empty on purpose. All necessary methods are exposed through inheritance.
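The base classes indicate the usual pyspark.ml split between an estimator, whose fit step learns from data, and the fitted model, which then transforms data. The toy classes below sketch that pattern in plain Python. ToyFeaturizer and ToyFeaturizerModel are hypothetical names that mirror only the shape of the relationship, not the behavior of Featurize or PipelineModel.

```python
class ToyFeaturizer:
    """Hypothetical estimator: learns a vocabulary from the data."""

    def fit(self, rows):
        vocabulary = sorted({value for row in rows for value in row})
        return ToyFeaturizerModel(vocabulary)


class ToyFeaturizerModel:
    """Hypothetical fitted model: applies the learned vocabulary."""

    def __init__(self, vocabulary):
        self.index = {value: i for i, value in enumerate(vocabulary)}

    def transform(self, rows):
        # Replace each value with the integer index learned during fit.
        return [[self.index[value] for value in row] for row in rows]


model = ToyFeaturizer().fit([["a", "b"], ["c", "a"]])
features = model.transform([["c", "b"]])
# features == [[2, 1]]
```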