LightGBMRegressor

class LightGBMRegressor.LightGBMRegressor(alpha=0.9, application='regression', baggingFraction=1.0, baggingFreq=0, baggingSeed=3, defaultListenPort=12400, earlyStoppingRound=0, featureFraction=1.0, featuresCol='features', labelCol='label', learningRate=0.1, maxBin=255, maxDepth=-1, minSumHessianInLeaf=0.001, numIterations=100, numLeaves=31, parallelism='data_parallel', predictionCol='prediction', timeout=120.0)[source]

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator

Parameters:
  • alpha (double) – Parameter for Huber loss and quantile regression (default: 0.9)
  • application (str) – Regression application: regression_l2, regression_l1, huber, fair, poisson, quantile, mape, gamma or tweedie (default: regression)
  • baggingFraction (double) – Bagging fraction (default: 1.0)
  • baggingFreq (int) – Bagging frequency (default: 0)
  • baggingSeed (int) – Bagging seed (default: 3)
  • defaultListenPort (int) – The default listen port on executors, used for testing (default: 12400)
  • earlyStoppingRound (int) – Early stopping round (default: 0)
  • featureFraction (double) – Feature fraction (default: 1.0)
  • featuresCol (str) – Features column name (default: features)
  • labelCol (str) – Label column name (default: label)
  • learningRate (double) – Learning rate or shrinkage rate (default: 0.1)
  • maxBin (int) – Max bin (default: 255)
  • maxDepth (int) – Max depth (default: -1)
  • minSumHessianInLeaf (double) – Minimum sum of Hessians in one leaf (default: 0.001)
  • numIterations (int) – Number of iterations; LightGBM constructs num_class * num_iterations trees (default: 100)
  • numLeaves (int) – Number of leaves (default: 31)
  • parallelism (str) – Tree learner parallelism: data_parallel or voting_parallel (default: data_parallel)
  • predictionCol (str) – Prediction column name (default: prediction)
  • timeout (double) – Timeout in seconds (default: 120.0)
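
To illustrate how the constructor parameters fit together, here is a minimal sketch that is not part of mmlspark. The names `DOCUMENTED_DEFAULTS`, `APPLICATIONS`, `PARALLELISM` and `build_regressor` are hypothetical. The helper copies the defaults from the signature above, checks the two enumerated string parameters against the values listed in this section, and forwards everything as keyword arguments to whichever estimator class it is given.

```python
# Hypothetical helper, not part of mmlspark.
# Defaults are copied from the constructor signature documented above.
DOCUMENTED_DEFAULTS = {
    "alpha": 0.9,
    "application": "regression",
    "baggingFraction": 1.0,
    "baggingFreq": 0,
    "baggingSeed": 3,
    "defaultListenPort": 12400,
    "earlyStoppingRound": 0,
    "featureFraction": 1.0,
    "featuresCol": "features",
    "labelCol": "label",
    "learningRate": 0.1,
    "maxBin": 255,
    "maxDepth": -1,
    "minSumHessianInLeaf": 0.001,
    "numIterations": 100,
    "numLeaves": 31,
    "parallelism": "data_parallel",
    "predictionCol": "prediction",
    "timeout": 120.0,
}

# Values named in the parameter descriptions, plus the default "regression".
APPLICATIONS = {
    "regression", "regression_l2", "regression_l1", "huber", "fair",
    "poisson", "quantile", "mape", "gamma", "tweedie",
}
PARALLELISM = {"data_parallel", "voting_parallel"}


def build_regressor(estimator_cls, **overrides):
    """Merge overrides into the documented defaults and build an estimator."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise ValueError("unknown parameter(s): %s" % ", ".join(sorted(unknown)))
    params = dict(DOCUMENTED_DEFAULTS, **overrides)
    if params["application"] not in APPLICATIONS:
        raise ValueError("unsupported application: %r" % params["application"])
    if params["parallelism"] not in PARALLELISM:
        raise ValueError("unsupported parallelism: %r" % params["parallelism"])
    # The constructor is documented as accepting all of these as keywords.
    return estimator_cls(**params)
```

With mmlspark installed, `estimator_cls` would be the `LightGBMRegressor` class documented here, for example `build_regressor(LightGBMRegressor, application='quantile', alpha=0.5)`. The import path depends on the installed version and is not shown in this section.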
getAlpha()[source]
Returns: Parameter for Huber loss and quantile regression (default: 0.9)
Return type: double
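
The role of `alpha` differs between the two losses it applies to. The sketch below uses the textbook definitions of the quantile (pinball) loss and the Huber loss for a single observation; it is an illustration only, and LightGBM's internal implementation may differ in scaling or other details. The function names are hypothetical.

```python
def quantile_loss(y_true, y_pred, alpha=0.9):
    """Pinball loss for one observation; alpha is the target quantile."""
    residual = y_true - y_pred
    # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
    if residual >= 0:
        return alpha * residual
    return (alpha - 1.0) * residual


def huber_loss(y_true, y_pred, alpha=0.9):
    """Huber loss for one observation; alpha is the quadratic/linear threshold."""
    residual = abs(y_true - y_pred)
    if residual <= alpha:
        # Small errors are penalised quadratically.
        return 0.5 * residual ** 2
    # Large errors are penalised linearly, which limits outlier influence.
    return alpha * (residual - 0.5 * alpha)
```

With `alpha=0.9`, the quantile loss penalises under-prediction nine times more heavily than over-prediction of the same size, which pushes the fitted value toward the 90th percentile.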
getApplication()[source]
Returns: Regression application: regression_l2, regression_l1, huber, fair, poisson, quantile, mape, gamma or tweedie (default: regression)
Return type: str
getBaggingFraction()[source]
Returns: Bagging fraction (default: 1.0)
Return type: double
getBaggingFreq()[source]
Returns: Bagging frequency (default: 0)
Return type: int
getBaggingSeed()[source]
Returns: Bagging seed (default: 3)
Return type: int
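
The three bagging parameters interact: in LightGBM, a bagging frequency of 0 disables bagging, and a frequency of k resamples the rows every k iterations. The following sketch illustrates that schedule under simplifying assumptions (uniform sampling without replacement using Python's `random` module). It is not LightGBM's actual sampling code, and `bagging_schedule` is a hypothetical name.

```python
import random


def bagging_schedule(num_rows, num_iterations, bagging_fraction=1.0,
                     bagging_freq=0, bagging_seed=3):
    """Yield the row indices used at each boosting iteration."""
    rng = random.Random(bagging_seed)  # fixed seed makes the draw reproducible
    all_rows = list(range(num_rows))
    current = all_rows
    enabled = bagging_freq > 0 and bagging_fraction < 1.0
    for iteration in range(num_iterations):
        # Draw a fresh subsample only every `bagging_freq` iterations.
        if enabled and iteration % bagging_freq == 0:
            sample_size = max(1, int(num_rows * bagging_fraction))
            current = sorted(rng.sample(all_rows, sample_size))
        yield current
```

With the documented defaults (`baggingFraction=1.0`, `baggingFreq=0`) every iteration sees all rows, so both values need to change for bagging to take effect.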
getDefaultListenPort()[source]
Returns: The default listen port on executors, used for testing (default: 12400)
Return type: int
getEarlyStoppingRound()[source]
Returns: Early stopping round (default: 0)
Return type: int
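
Assuming `earlyStoppingRound` carries the usual LightGBM meaning (stop when a validation metric has not improved for that many consecutive rounds, with 0 disabling the check), the logic can be sketched as below. How this wrapper obtains validation data is not described in this section, so the list of validation losses and the function name `stopping_iteration` are hypothetical.

```python
def stopping_iteration(validation_losses, early_stopping_round=0):
    """Return how many iterations run before training stops.

    `validation_losses` holds one loss per iteration (lower is better).
    """
    if early_stopping_round <= 0:
        # Early stopping disabled: run every iteration.
        return len(validation_losses)
    best = float("inf")
    rounds_without_improvement = 0
    for index, loss in enumerate(validation_losses):
        if loss < best:
            best = loss
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= early_stopping_round:
                return index + 1
    return len(validation_losses)
```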
getFeatureFraction()[source]
Returns: Feature fraction (default: 1.0)
Return type: double
getFeaturesCol()[source]
Returns: Features column name (default: features)
Return type: str
static getJavaPackage()[source]

Returns package name String.

getLabelCol()[source]
Returns: Label column name (default: label)
Return type: str
getLearningRate()[source]
Returns: Learning rate or shrinkage rate (default: 0.1)
Return type: double
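
The term "shrinkage" refers to scaling each tree's contribution by the learning rate before adding it to the running prediction. The toy sketch below shows the effect for a single constant target, assuming an idealised "tree" that predicts the current residual exactly. It is not how LightGBM fits trees, and `boosted_constant_fit` is a hypothetical name.

```python
def boosted_constant_fit(target, learning_rate=0.1, num_iterations=100):
    """Boost toward a constant target, shrinking each step by learning_rate."""
    prediction = 0.0
    for _ in range(num_iterations):
        residual = target - prediction          # what an ideal tree would fit
        prediction += learning_rate * residual  # shrunken update
    return prediction
```

Each iteration closes a fraction `learning_rate` of the remaining gap, so the residual after n iterations is the initial residual times (1 - learning_rate)^n. Smaller learning rates therefore generally need a larger `numIterations` to reach the same fit.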
getMaxBin()[source]
Returns: Max bin (default: 255)
Return type: int
getMaxDepth()[source]
Returns: Max depth (default: -1)
Return type: int
getMinSumHessianInLeaf()[source]
Returns: Minimum sum of Hessians in one leaf (default: 0.001)
Return type: double
getNumIterations()[source]
Returns: Number of iterations; LightGBM constructs num_class * num_iterations trees (default: 100)
Return type: int
getNumLeaves()[source]
Returns: Number of leaves (default: 31)
Return type: int
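
`numLeaves` and `maxDepth` both bound tree size. A binary tree of depth d has at most 2^d leaves, and in LightGBM a non-positive `max_depth` means no depth limit. The sketch below combines the two bounds; `effective_leaf_cap` is a hypothetical helper for reasoning about settings, not an mmlspark or LightGBM function.

```python
def effective_leaf_cap(num_leaves=31, max_depth=-1):
    """Upper bound on leaves per tree implied by numLeaves and maxDepth."""
    if max_depth <= 0:
        # No depth limit: only numLeaves constrains the tree.
        return num_leaves
    # A binary tree of depth d can hold at most 2**d leaves.
    return min(num_leaves, 2 ** max_depth)
```

For instance, with `numLeaves=31` a `maxDepth` of 3 caps each tree at 8 leaves, whereas a `maxDepth` of 5 or more leaves `numLeaves` as the binding limit.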
getParallelism()[source]
Returns: Tree learner parallelism: data_parallel or voting_parallel (default: data_parallel)
Return type: str
getPredictionCol()[source]
Returns: Prediction column name (default: prediction)
Return type: str
getTimeout()[source]
Returns: Timeout in seconds (default: 120.0)
Return type: double
classmethod read()[source]

Returns an MLReader instance for this class.
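
Because the class derives from `JavaMLWritable` and `JavaMLReadable`, it should support the standard pyspark persistence pattern of `write()` on an instance and `read()` on the class. The sketch below shows that pattern in a generic form; `save_and_reload` is a hypothetical helper, and whether every parameter survives the round trip for this particular estimator has not been verified here.

```python
def save_and_reload(stage, stage_cls, path):
    """Persist an ML stage to `path` and load it back.

    `stage` is an instance implementing pyspark's MLWritable interface and
    `stage_cls` is its class, implementing MLReadable.
    """
    # MLWriter.overwrite() allows replacing an existing directory at `path`.
    stage.write().overwrite().save(path)
    # MLReader.load() reconstructs the stage from the saved metadata.
    return stage_cls.read().load(path)
```

A call would look like `save_and_reload(regressor, LightGBMRegressor, path)`, where `regressor` is an existing estimator instance and `path` is a location the Spark cluster can write to.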

setAlpha(value)[source]
Parameters: alpha (double) – Parameter for Huber loss and quantile regression (default: 0.9)
setApplication(value)[source]
Parameters: application (str) – Regression application: regression_l2, regression_l1, huber, fair, poisson, quantile, mape, gamma or tweedie (default: regression)
setBaggingFraction(value)[source]
Parameters: baggingFraction (double) – Bagging fraction (default: 1.0)
setBaggingFreq(value)[source]
Parameters: baggingFreq (int) – Bagging frequency (default: 0)
setBaggingSeed(value)[source]
Parameters: baggingSeed (int) – Bagging seed (default: 3)
setDefaultListenPort(value)[source]
Parameters: defaultListenPort (int) – The default listen port on executors, used for testing (default: 12400)
setEarlyStoppingRound(value)[source]
Parameters: earlyStoppingRound (int) – Early stopping round (default: 0)
setFeatureFraction(value)[source]
Parameters: featureFraction (double) – Feature fraction (default: 1.0)
setFeaturesCol(value)[source]
Parameters: featuresCol (str) – Features column name (default: features)
setLabelCol(value)[source]
Parameters: labelCol (str) – Label column name (default: label)
setLearningRate(value)[source]
Parameters: learningRate (double) – Learning rate or shrinkage rate (default: 0.1)
setMaxBin(value)[source]
Parameters: maxBin (int) – Max bin (default: 255)
setMaxDepth(value)[source]
Parameters: maxDepth (int) – Max depth (default: -1)
setMinSumHessianInLeaf(value)[source]
Parameters: minSumHessianInLeaf (double) – Minimum sum of Hessians in one leaf (default: 0.001)
setNumIterations(value)[source]
Parameters: numIterations (int) – Number of iterations; LightGBM constructs num_class * num_iterations trees (default: 100)
setNumLeaves(value)[source]
Parameters: numLeaves (int) – Number of leaves (default: 31)
setParallelism(value)[source]
Parameters: parallelism (str) – Tree learner parallelism: data_parallel or voting_parallel (default: data_parallel)
setParams(alpha=0.9, application='regression', baggingFraction=1.0, baggingFreq=0, baggingSeed=3, defaultListenPort=12400, earlyStoppingRound=0, featureFraction=1.0, featuresCol='features', labelCol='label', learningRate=0.1, maxBin=255, maxDepth=-1, minSumHessianInLeaf=0.001, numIterations=100, numLeaves=31, parallelism='data_parallel', predictionCol='prediction', timeout=120.0)[source]

Set the (keyword only) parameters

Parameters:
  • alpha (double) – Parameter for Huber loss and quantile regression (default: 0.9)
  • application (str) – Regression application: regression_l2, regression_l1, huber, fair, poisson, quantile, mape, gamma or tweedie (default: regression)
  • baggingFraction (double) – Bagging fraction (default: 1.0)
  • baggingFreq (int) – Bagging frequency (default: 0)
  • baggingSeed (int) – Bagging seed (default: 3)
  • defaultListenPort (int) – The default listen port on executors, used for testing (default: 12400)
  • earlyStoppingRound (int) – Early stopping round (default: 0)
  • featureFraction (double) – Feature fraction (default: 1.0)
  • featuresCol (str) – Features column name (default: features)
  • labelCol (str) – Label column name (default: label)
  • learningRate (double) – Learning rate or shrinkage rate (default: 0.1)
  • maxBin (int) – Max bin (default: 255)
  • maxDepth (int) – Max depth (default: -1)
  • minSumHessianInLeaf (double) – Minimum sum of Hessians in one leaf (default: 0.001)
  • numIterations (int) – Number of iterations; LightGBM constructs num_class * num_iterations trees (default: 100)
  • numLeaves (int) – Number of leaves (default: 31)
  • parallelism (str) – Tree learner parallelism: data_parallel or voting_parallel (default: data_parallel)
  • predictionCol (str) – Prediction column name (default: prediction)
  • timeout (double) – Timeout in seconds (default: 120.0)
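
"Keyword only" means positional arguments are rejected. The sketch below is modelled on pyspark's `keyword_only` decorator and assumes `setParams` is wrapped in the same way, which this section does not show. `ParamHolder` is a hypothetical stand-in class with just two of the parameters, used so the example runs without Spark.

```python
import functools


def keyword_only(func):
    """Reject positional arguments and record the keywords that were passed."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        # Only explicitly supplied keywords are recorded, not the defaults.
        self._input_kwargs = kwargs
        return func(self, **kwargs)
    return wrapper


class ParamHolder:
    """Hypothetical stand-in for an estimator; stores parameters in a dict."""

    def __init__(self):
        self.params = {"alpha": 0.9, "numLeaves": 31}

    @keyword_only
    def setParams(self, alpha=0.9, numLeaves=31):
        # Update only what the caller supplied; other values are untouched.
        self.params.update(self._input_kwargs)
        return self
```

In this sketch, `ParamHolder().setParams(numLeaves=63)` changes only `numLeaves`, while `ParamHolder().setParams(0.5)` raises `TypeError`.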
setPredictionCol(value)[source]
Parameters: predictionCol (str) – Prediction column name (default: prediction)
setTimeout(value)[source]
Parameters: timeout (double) – Timeout in seconds (default: 120.0)
class LightGBMRegressor.M(java_model=None)[source]

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.wrapper.JavaModel, pyspark.ml.util.JavaMLWritable, pyspark.ml.util.JavaMLReadable

Model fitted by LightGBMRegressor.

This class is left empty on purpose. All necessary methods are exposed through inheritance.

static getJavaPackage()[source]

Returns package name String.

classmethod read()[source]

Returns an MLReader instance for this class.