Featurize
-
class Featurize.Featurize(allowImages=False, featureColumns=None, numberOfFeatures=262144, oneHotEncodeCategoricals=True)
Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator
Featurizes a dataset. Converts the specified columns to feature columns.
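Parameters:
allowImages (bool) – Allow featurization of images (default: false)
featureColumns – The columns to convert to feature columns (default: None)
numberOfFeatures (int) – Number of features to hash string columns to (default: 262144)
oneHotEncodeCategoricals (bool) – One-hot encode categoricals (default: true)
-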
getNumberOfFeatures()
Returns: Number of features to hash string columns to (default: 262144)
Return type: int
-
getOneHotEncodeCategoricals()
Returns: Whether categoricals are one-hot encoded (default: true)
Return type: bool
-
setAllowImages(value)
Parameters: value (bool) – New value for allowImages: allow featurization of images (default: false)
-
setNumberOfFeatures(value)
Parameters: value (int) – New value for numberOfFeatures: number of features to hash string columns to (default: 262144)
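The idea behind hashing string columns to a fixed number of features can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the whitespace tokenization, the MD5 hash, and the helper names hash_index and hash_tokens are all assumptions made for the example. It only shows how a fixed bucket count such as 262144 (2**18) bounds the width of the resulting vector.

```python
import hashlib

NUMBER_OF_FEATURES = 262144  # the documented default, equal to 2**18


def hash_index(token, number_of_features=NUMBER_OF_FEATURES):
    """Map a string token to a bucket index in [0, number_of_features)."""
    # MD5 is a stand-in chosen for determinism; it is an assumption,
    # not necessarily the hash function the library uses.
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % number_of_features


def hash_tokens(tokens, number_of_features=NUMBER_OF_FEATURES):
    """Return a sparse {index: count} vector for a list of tokens."""
    vector = {}
    for token in tokens:
        index = hash_index(token, number_of_features)
        # Tokens that collide in the same bucket simply add up.
        vector[index] = vector.get(index, 0) + 1
    return vector


sparse = hash_tokens("the quick brown fox jumps over the lazy dog".split())
```

A smaller bucket count saves memory but makes collisions between distinct tokens more likely; a larger one does the opposite.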
-
setOneHotEncodeCategoricals(value)
Parameters: value (bool) – New value for oneHotEncodeCategoricals: one-hot encode categoricals (default: true)
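For readers unfamiliar with the term, one-hot encoding replaces a categorical value with a vector of 0/1 flags, one per category. The sketch below is a simplified pure-Python illustration; the function name one_hot_encode and the choice to keep one slot for every category are assumptions for the example, and the vector layout the library produces may differ.

```python
def one_hot_encode(values):
    """One-hot encode a list of categorical values.

    Returns (categories, rows): the sorted distinct values, and one
    list of 0/1 flags per input value.
    """
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # exactly one flag is set per value
        rows.append(row)
    return categories, rows


categories, rows = one_hot_encode(["red", "green", "red", "blue"])
```

Here categories is ["blue", "green", "red"], so the first input value "red" becomes [0, 0, 1].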
-
class Featurize.PipelineModel(java_model=None)
Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.wrapper.JavaModel, pyspark.ml.util.JavaMLWritable, pyspark.ml.util.JavaMLReadable
Model fitted by Featurize. This class is left empty on purpose. All necessary methods are exposed through inheritance.
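The estimator/model relationship described above follows the usual pyspark.ml pattern: fit() on the estimator returns a model, and transform() on that model, inherited from its base classes, applies the fitted featurization. The sketch below shows only that call pattern. The helper fit_and_apply and the StubEstimator/StubModel classes are hypothetical stand-ins so the example runs without a Spark session; they are not part of the library.

```python
def fit_and_apply(estimator, train_data, new_data):
    """Fit an estimator, then apply the fitted model to new data."""
    model = estimator.fit(train_data)        # estimator -> fitted model
    return model, model.transform(new_data)  # model -> featurized data


# Hypothetical stand-ins so the sketch runs without Spark.
class StubModel:
    def __init__(self, columns):
        self.columns = columns

    def transform(self, rows):
        # Gather the columns seen at fit time into one "features" list.
        return [{"features": [row[c] for c in self.columns]} for row in rows]


class StubEstimator:
    def fit(self, rows):
        return StubModel(sorted(rows[0]))


model, out = fit_and_apply(
    StubEstimator(), [{"a": 1, "b": 2}], [{"a": 3, "b": 4}]
)
```

With the real classes, the same helper would presumably be called with a configured Featurize instance and Spark DataFrames in place of the stubs.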