package com.microsoft.ml.spark

Microsoft Machine Learning on Apache Spark (MMLSpark)

Type Members

  1. class AssembleFeatures extends Estimator[AssembleFeaturesModel] with HasFeaturesCol with MMLParams

    Creates a vector column of features from a collection of feature columns

  2. class AssembleFeaturesModel extends Model[AssembleFeaturesModel] with Params with ConstructorWritable[AssembleFeaturesModel]

    Model produced by AssembleFeatures.

  3. abstract class AsyncClient extends BaseClient

  4. abstract class AsyncHTTPClient extends AsyncClient with HTTPClient

  5. class BasicDatasetGenerationConstraints extends HasDatasetGenerationConstraints

    Basic constraints for generating a dataset.

  6. class BatchIterator[T] extends Iterator[List[T]]

  7. case class Benchmark(name: String, value: Double, precision: Double, higherIsBetter: Boolean = true) extends Product with Serializable

  8. abstract class Benchmarks extends TestBase

  9. class BestModel extends Model[BestModel] with ConstructorWritable[BestModel]

    Model produced by FindBestModel.

  10. class Blur extends ImageTransformerStage

    Blurs the image using a box filter. The params are a map of the dimensions of the blurring box. Please refer to OpenCV for more information.

  11. class BrainScriptBuilder extends AnyRef

    Utility methods for manipulating the BrainScript and overrides configs output to disk.

  12. case class BrainScriptConfig(name: String, text: Seq[String]) extends Product with Serializable

  13. class CNTKCommandBuilder extends CNTKCommandBuilderBase

  14. abstract class CNTKCommandBuilderBase extends AnyRef

  15. class CNTKConfig extends MMLConfig

  16. class CNTKFunctionParam extends ComplexParam[SerializableFunction]

    Param for ByteArray. Needed as Spark has explicit params for many different types but not ByteArray.

  17. class CNTKLearner extends Estimator[CNTKModel] with CNTKParams

    Annotations
    @InternalWrapper()
  18. class CNTKModel extends Model[CNTKModel] with ComplexParamsWritable with HasMiniBatcher with Wrappable

    Annotations
    @InternalWrapper()
  19. trait CNTKParams extends MMLParams

  20. class Cacher extends Transformer with DefaultParamsWritable

  21. class CheckpointData extends Transformer with CheckpointDataParams

    Caches the dataset at this point, either to memory or to memory and disk.

  22. trait CheckpointDataParams extends MMLParams

  23. class ClassBalancer extends Estimator[ClassBalancerModel] with DefaultParamsWritable with HasInputCol with HasOutputCol

    An estimator that calculates the weights for balancing a dataset. For example, if the negative class is half the size of the positive class, the weights will be 2 for rows with negative classes and 1 for rows with positive classes. These weights can be used in weighted classifiers and regressors to correct for heavily skewed datasets. The inputCol should contain the class labels, and the outputCol will contain the requisite weights.
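
    As a rough illustration of the weighting described above, the sketch below gives each class the weight (size of the largest class) / (size of that class), which reproduces the 2-to-1 example. This formula is an assumption made for illustration, since the exact computation is not spelled out here, and ClassBalanceSketch and classWeights are hypothetical names that are not part of the package.

```scala
object ClassBalanceSketch {
  // Assumed formula: weight(class) = largestClassSize / classSize.
  // The largest class therefore gets weight 1.0 and rarer classes get more.
  // Assumes `labels` is non-empty.
  def classWeights[A](labels: Seq[A]): Map[A, Double] = {
    val counts: Map[A, Int] = labels.groupBy(identity).map { case (k, v) => k -> v.size }
    val largest = counts.values.max.toDouble
    counts.map { case (label, n) => label -> largest / n }
  }

  def main(args: Array[String]): Unit = {
    // Four positive rows and two negative rows: negatives are half as frequent.
    val labels = Seq(1, 1, 1, 1, 0, 0)
    val weights = classWeights(labels)
    labels.foreach(l => println(s"label=$l weight=${weights(l)}"))
  }
}
```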

  24. class ClassBalancerModel extends Model[ClassBalancerModel] with ConstructorWritable[ClassBalancerModel]

  25. case class ClassifierTrainParams(parallelism: String, numIterations: Int, learningRate: Double, numLeaves: Int, maxBin: Int, baggingFraction: Double, baggingFreq: Int, baggingSeed: Int, earlyStoppingRound: Int, featureFraction: Double, maxDepth: Int, minSumHessianInLeaf: Double, numMachines: Int, objective: String) extends TrainParams with Product with Serializable

    Defines the Booster parameters passed to the LightGBM classifier.

  26. class CleanMissingData extends Estimator[CleanMissingDataModel] with HasInputCols with HasOutputCols with MMLParams

    Removes missing values from input dataset. The following modes are supported:

    • Mean - replaces missing values with the mean of the fit column
    • Median - replaces missing values with the approximate median of the fit column
    • Custom - replaces missing values with a custom value specified by the user

    For mean and median modes, only numeric column types are supported, specifically: Int, Long, Float, Double. For custom mode, the types above are supported and additionally: String, Boolean.
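
    The three modes can be pictured with a small plain-Scala sketch over a single column of optional doubles. It is not the Spark implementation: the names below are hypothetical, the median is taken as the middle element of the sorted values rather than an approximate quantile, and at least one non-missing value is assumed.

```scala
object CleanMissingSketch {
  sealed trait Mode
  case object Mean extends Mode
  case object Median extends Mode
  final case class Custom(value: Double) extends Mode

  // Computes the replacement value from the non-missing entries of the column.
  def fillValue(column: Seq[Option[Double]], mode: Mode): Double = {
    val present = column.flatten
    mode match {
      case Mean => present.sum / present.size
      case Median =>
        val sorted = present.sorted
        sorted(sorted.size / 2) // middle element (upper middle for even sizes)
      case Custom(v) => v
    }
  }

  // Replaces every missing entry with the value computed for the chosen mode.
  def clean(column: Seq[Option[Double]], mode: Mode): Seq[Double] = {
    val fill = fillValue(column, mode)
    column.map(_.getOrElse(fill))
  }
}
```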

  27. class CleanMissingDataModel extends Model[CleanMissingDataModel] with ConstructorWritable[CleanMissingDataModel]

    Model produced by CleanMissingData.

  28. class ColorFormat extends ImageTransformerStage

    Converts an image from one color space to another, e.g. COLOR_BGR2GRAY. Refer to OpenCV for more information.

  29. class ColumnNamesToFeaturize extends Serializable

    Class containing the list of column names to perform special featurization steps for.

    • colNamesToHash - List of column names to hash.
    • colNamesToDuplicateForMissings - List of column names containing doubles to duplicate so we can remove missing values from them.
    • colNamesToTypes - Map of column names to their types.
    • colNamesToCleanMissings - List of column names to clean missing values from (ignore).
    • colNamesToVectorize - List of column names to vectorize using FastVectorAssembler.
    • categoricalColumns - List of categorical columns to pass through or turn into indicator array.
    • conversionColumnNamesMap - Map from old column names to new.
    • addedColumnNamesMap - Map from old columns to newly generated columns for featurization.

    Annotations
    @SerialVersionUID()
  30. class ComputeModelStatistics extends Transformer with ComputeModelStatisticsParams

    Evaluates the given scored dataset.

  31. trait ComputeModelStatisticsParams extends MMLParams with HasLabelCol with HasScoresCol with HasScoredLabelsCol with HasEvaluationMetric

  32. class ComputePerInstanceStatistics extends Transformer with ComputePerInstanceStatisticsParams

    Evaluates the given scored dataset with per instance metrics.

    The Regression metrics are:

    • L1_loss
    • L2_loss

    The Classification metrics are:

    • log_loss
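
    For orientation, the sketch below computes these three quantities for a single row using their usual textbook definitions (absolute error, squared error, and the negative log of the probability assigned to the true class). That these definitions match the transformer's output exactly is an assumption, and the object and method names are hypothetical.

```scala
object PerInstanceMetricsSketch {
  // L1 loss: absolute difference between label and prediction.
  def l1Loss(label: Double, prediction: Double): Double =
    math.abs(label - prediction)

  // L2 loss: squared difference between label and prediction.
  def l2Loss(label: Double, prediction: Double): Double = {
    val d = label - prediction
    d * d
  }

  // Log loss: -log(p), where p is the probability predicted for the true class.
  // The probability is clipped to [eps, 1 - eps] to avoid taking log(0).
  def logLoss(labelIndex: Int, probabilities: Seq[Double], eps: Double = 1e-15): Double = {
    val p = math.min(math.max(probabilities(labelIndex), eps), 1.0 - eps)
    -math.log(p)
  }
}
```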

  33. trait ComputePerInstanceStatisticsParams extends MMLParams with HasLabelCol with HasScoresCol with HasScoredLabelsCol with HasScoredProbabilitiesCol with HasEvaluationMetric

  34. abstract class Configuration extends AnyRef

  35. class Consolidator[T] extends AnyRef

  36. trait ConstructorReadable[T <: ConstructorWritable[_]] extends MLReadable[T]

  37. class ConstructorReader[T] extends MLReader[T]

  38. trait ConstructorWritable[S <: PipelineStage] extends MLWritable

    This trait allows you to easily add serialization to your Spark Models, assuming that they are completely parameterized by their constructor. The two main fields required are the TypeTag, which allows the writer to inspect the constructor to get the types that need to be serialized, and objectsToSave, which defines the actual objects that are serialized.

  39. abstract class ConstructorWriter[S <: PipelineStage] extends MLWriter

  40. class ContextObjectInputStream extends ObjectInputStream

  41. class CropImage extends ImageTransformerStage

    Crops the image for processing. The parameters are:

    • "x" - first dimension; start of crop
    • "y" - second dimension; start of crop
    • "height" - height of cropped image
    • "width" - width of cropped image
    • "stageName" - "crop"
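
    To make the crop parameters concrete, here is a minimal sketch on a row-major 2D array. It assumes, following the OpenCV rectangle convention, that "x" is the column offset and "y" is the row offset; the text above does not confirm which axis is which, and CropSketch is a hypothetical name.

```scala
object CropSketch {
  // image(row)(col); keeps `height` rows starting at row y
  // and `width` columns starting at column x.
  def crop(image: Array[Array[Int]], x: Int, y: Int, height: Int, width: Int): Array[Array[Int]] =
    image.slice(y, y + height).map(_.slice(x, x + width))

  def main(args: Array[String]): Unit = {
    val image = Array(
      Array(0, 1, 2, 3),
      Array(4, 5, 6, 7),
      Array(8, 9, 10, 11)
    )
    crop(image, x = 1, y = 1, height = 2, width = 2).foreach(r => println(r.mkString(" ")))
  }
}
```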

  42. class CustomInputParser extends HTTPInputParser with ComplexParamsWritable

  43. class CustomOutputParser extends HTTPOutputParser with ComplexParamsWritable

  44. class DataConversion extends Transformer with MMLParams

    Converts the specified list of columns to the specified type. Returns a new DataFrame with the converted columns.

  45. trait DataFrameEquality extends TestBase

  46. class DataFrameSugars extends AnyRef

  47. case class DataStreamReaderExtensions(dsr: DataStreamReader) extends Product with Serializable

  48. case class DataStreamWriterExtensions[T](dsw: DataStreamWriter[T]) extends Product with Serializable

  49. abstract class DataWriter extends AnyRef

  50. case class DatasetMissingValuesGenerationOptions(percentMissing: Double, columnTypesWithMissings: ColumnOptions.ValueSet, dataTypesWithMissings: DataOptions.ValueSet) extends Product with Serializable

  51. case class DatasetOptions(columnTypes: ColumnOptions.ValueSet, dataTypes: DataOptions.ValueSet, missingValuesOptions: DatasetMissingValuesGenerationOptions) extends Product with Serializable

    Options used to specify how a dataset will be generated. This contains information on what the data and column types (specified as flags) for generating a dataset will be limited to. It also contains options for all possible missing values generation and options for how values will be generated.

  52. class DiscreteHyperParam[T] extends Dist[T]

  53. abstract class Dist[T] extends AnyRef

    Represents a distribution of values.

    T

    The type T of the values generated.

  54. class DoubleRangeHyperParam extends RangeHyperParam[Double]

  55. class DropColumns extends Transformer with MMLParams

    DropColumns takes a dataframe and a list of columns to drop as input and returns a dataframe comprised of only those columns not listed in the input list.

  56. class DynamicBufferedBatcher[T] extends Iterator[List[T]]

  57. class DynamicMiniBatchTransformer extends Transformer with MiniBatchBase

  58. class EnsembleByKey extends Transformer with MMLParams

  59. case class EntityData(content: Array[Byte], contentEncoding: Option[HeaderData], contentLength: Long, contentType: Option[HeaderData], isChunked: Boolean, isRepeatable: Boolean, isStreaming: Boolean) extends Product with Serializable

  60. class Explode extends Transformer with HasInputCol with HasOutputCol with MMLParams

  61. class Featurize extends Estimator[PipelineModel] with MMLParams

    Featurizes a dataset. Converts the specified columns to feature columns.

  62. class FindBestModel extends Estimator[BestModel] with FindBestModelParams

    Evaluates and chooses the best model from a list of models.

  63. trait FindBestModelParams extends Wrappable with ComplexParamsWritable with HasEvaluationMetric

  64. class FixedBatcher[T] extends Iterator[List[T]]

  65. class FixedBufferedBatcher[T] extends Iterator[List[T]]

  66. class FixedMiniBatchTransformer extends Transformer with MiniBatchBase

  67. class FlattenBatch extends Transformer with Wrappable with DefaultParamsWritable

  68. class Flip extends ImageTransformerStage

    Flips the image

  69. class FloatRangeHyperParam extends RangeHyperParam[Float]

  70. class GaussianKernel extends ImageTransformerStage

    Applies a Gaussian kernel to blur the image. Please refer to OpenCV for detailed information about the parameters and their allowable values.

  71. class GenerateDataType extends Serializable

    Generates the specified random data type.

  72. class GridSpace extends ParamSpace

    Represents a parameter grid for tuning with discrete values. Can be generated with the ParamGridBuilder.

  73. abstract class HTTPInputParser extends Transformer with HasOutputCol with HasInputCol

  74. abstract class HTTPOutputParser extends Transformer with HasInputCol with HasOutputCol

  75. trait HTTPParams extends Wrappable

  76. case class HTTPRequestData(requestLine: RequestLineData, headers: Array[HeaderData], entity: Option[EntityData]) extends Product with Serializable

  77. case class HTTPResponseData(headers: Array[HeaderData], entity: EntityData, statusLine: StatusLineData, locale: String) extends Product with Serializable

  78. class HTTPTransformer extends Transformer with HTTPParams with HasInputCol with HasOutputCol with ComplexParamsWritable

  79. trait Handler extends AnyRef

  80. trait HandlingUtils extends AnyRef

  81. trait HasDatasetGenerationConstraints extends AnyRef

    Specifies the trait for constraints on generating a dataset.

  82. trait HasErrorCol extends Params

  83. trait HasEvaluationMetric extends Wrappable

  84. trait HasFeaturesCol extends Wrappable

  85. trait HasInputCol extends Wrappable

  86. trait HasInputCols extends Wrappable

  87. trait HasLabelCol extends Wrappable

  88. trait HasMiniBatcher extends Params

  89. trait HasOutputCol extends Wrappable

  90. trait HasOutputCols extends Wrappable

  91. trait HasScoredLabelsCol extends Wrappable

  92. trait HasScoredProbabilitiesCol extends Wrappable

  93. trait HasScoresCol extends Wrappable

  94. trait HasURL extends Wrappable

  95. class HdfsWriter extends SingleFileResolver

  96. case class HeaderData(name: String, value: String) extends Product with Serializable

  97. class HyperparamBuilder extends AnyRef

    Specifies the search space for hyperparameters.

  98. class ImageFeaturizer extends Transformer with HasInputCol with HasOutputCol with Wrappable with ComplexParamsWritable

    The ImageFeaturizer relies on a CNTK model to do the featurization; one can set this model using the modelLocation parameter. To map the nodes of the CNTK model onto the standard "layers" structure of a feed forward neural net, one needs to supply a list of node names that range from the output node back towards the input node of the CNTK Function. This list does not need to be exhaustive. If you use a model downloaded from the ModelDownloader, the list is provided for you in the schema of the downloaded model.

    The ImageFeaturizer takes an input column of images (the type returned by the ImageReader) and automatically resizes them to fit the CNTKModel's inputs. It then feeds them through a pre-trained CNTK model. One can truncate the model using the cutOutputLayers parameter, which determines how many layers to truncate from the output of the network. For example, layer=0 means that no layers are removed, and layer=2 means that the image featurizer returns the activations of the layer that is two layers from the output layer.

    Annotations
    @InternalWrapper()
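
    The relationship between the layer list and cutOutputLayers can be sketched as a simple index lookup. This assumes the list is ordered from the output node back towards the input node, as described for ImageFeaturizer; the node names used below and the CutLayersSketch object are made up for illustration.

```scala
object CutLayersSketch {
  // layerNames is ordered from the output node back towards the input node,
  // so cutting k layers means reading activations from layerNames(k).
  def outputNodeFor(layerNames: Seq[String], cutOutputLayers: Int): String = {
    require(
      cutOutputLayers >= 0 && cutOutputLayers < layerNames.length,
      s"cutOutputLayers must be in [0, ${layerNames.length})"
    )
    layerNames(cutOutputLayers)
  }

  def main(args: Array[String]): Unit = {
    val layers = Seq("z", "fc7", "fc6") // hypothetical node names
    println(outputNodeFor(layers, 0))   // no layers removed
    println(outputNodeFor(layers, 2))   // two layers back from the output
  }
}
```
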
  99. class ImageSetAugmenter extends Transformer with HasInputCol with HasOutputCol with DefaultParamsWritable

  100. class ImageTransformer extends Transformer with HasInputCol with HasOutputCol with MMLParams

    Image processing stage. Please refer to OpenCV for additional information.

    Annotations
    @InternalWrapper()
  101. abstract class ImageTransformerStage extends Serializable

    Image processing stage.

  102. class IndexToValue extends Transformer with HasInputCol with HasOutputCol with MMLParams

    This class takes in a categorical column with MML style attributes and then transforms it back to the original values. This extends MLlib IndexToString by allowing the transformation back to any types of values.
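
    Conceptually, the reverse mapping is a lookup of each index in the ordered list of original values. The sketch below shows that idea in plain Scala with a generic value type; it does not read column metadata the way the transformer does, and the names are hypothetical.

```scala
object IndexToValueSketch {
  // levels(i) is the original value that was encoded as index i.
  def indexToValue[A](levels: IndexedSeq[A], indices: Seq[Int]): Seq[A] =
    indices.map(i => levels(i))

  def main(args: Array[String]): Unit = {
    // Works for non-string values too, here doubles.
    println(indexToValue(Vector(10.5, 20.5, 30.5), Seq(2, 0, 1)))
  }
}
```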

  103. case class InputData(format: String, path: String, shapes: Map[String, InputShape]) extends Product with Serializable

  104. case class InputShape(dim: Int, form: String) extends Product with Serializable

  105. class IntRangeHyperParam extends RangeHyperParam[Int]

  106. class InternalWrapper extends Annotation with StaticAnnotation

    Generate the internal wrapper for a given class. Used for complicated wrappers, where the basic functionality is auto-generated, and the rest is added in the inherited wrapper.

  107. class JSONInputParser extends HTTPInputParser with HasURL with ComplexParamsWritable

  108. class JSONOutputParser extends HTTPOutputParser with ComplexParamsWritable

  109. class Lambda extends Transformer with Wrappable with ComplexParamsWritable

  110. class LightGBMBooster extends Serializable

    Represents a LightGBM Booster learner

  111. class LightGBMClassificationModel extends ProbabilisticClassificationModel[Vector, LightGBMClassificationModel] with ConstructorWritable[LightGBMClassificationModel]

    Model produced by LightGBMClassifier.

    Annotations
    @InternalWrapper()
  112. class LightGBMClassifier extends ProbabilisticClassifier[Vector, LightGBMClassifier, LightGBMClassificationModel] with LightGBMParams

    Trains a LightGBM Binary Classification model, a fast, distributed, high performance gradient boosting framework based on decision tree algorithms. For more information please see here: https://github.com/Microsoft/LightGBM. For parameter information see here: https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst

    Annotations
    @InternalWrapper()
  113. trait LightGBMParams extends MMLParams

    Defines common parameters across all LightGBM learners.

  114. class LightGBMRegressionModel extends RegressionModel[Vector, LightGBMRegressionModel] with ConstructorWritable[LightGBMRegressionModel]

    Model produced by LightGBMRegressor.

  115. class LightGBMRegressor extends BaseRegressor[Vector, LightGBMRegressor, LightGBMRegressionModel] with LightGBMParams

    Trains a LightGBM Regression model, a fast, distributed, high performance gradient boosting framework based on decision tree algorithms. For more information please see here: https://github.com/Microsoft/LightGBM. For parameter information see here: https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst

    Note: The application parameter supports the following values:

    • regression_l2, L2 loss, alias=regression, mean_squared_error, mse, l2_root, root_mean_squared_error, rmse
    • regression_l1, L1 loss, alias=mean_absolute_error, mae
    • huber, Huber loss
    • fair, Fair loss
    • poisson, Poisson regression
    • quantile, Quantile regression
    • mape, MAPE loss, alias=mean_absolute_percentage_error
    • gamma, Gamma regression with log-link. It might be useful, e.g., for modeling insurance claims severity, or for any target that might be gamma-distributed
    • tweedie, Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any target that might be tweedie-distributed
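
    The values and aliases listed above can be captured in a small lookup, for example to validate a user-supplied string before passing it on. This is only a sketch built from the list in this entry; ObjectiveAliasSketch and canonicalize are hypothetical names and do not exist in the package.

```scala
object ObjectiveAliasSketch {
  // Canonical values of the application parameter, as listed above.
  private val canonical: Set[String] = Set(
    "regression_l2", "regression_l1", "huber", "fair",
    "poisson", "quantile", "mape", "gamma", "tweedie"
  )

  // Aliases, as listed above, mapped to their canonical value.
  private val aliases: Map[String, String] = Map(
    "regression" -> "regression_l2",
    "mean_squared_error" -> "regression_l2",
    "mse" -> "regression_l2",
    "l2_root" -> "regression_l2",
    "root_mean_squared_error" -> "regression_l2",
    "rmse" -> "regression_l2",
    "mean_absolute_error" -> "regression_l1",
    "mae" -> "regression_l1",
    "mean_absolute_percentage_error" -> "mape"
  )

  // Returns the canonical name, or None if the value is not in the list.
  def canonicalize(name: String): Option[String] =
    if (canonical.contains(name)) Some(name) else aliases.get(name)
}
```
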
  116. trait LinuxOnly extends TestBase

  117. class LocalWriter extends SingleFileResolver

  118. class LongRangeHyperParam extends RangeHyperParam[Long]

  119. class MMLConfig extends Configuration

  120. trait MMLParams extends Wrappable with DefaultParamsWritable

  121. class MPICommandBuilder extends CNTKCommandBuilderBase with MPIConfiguration

  122. trait MPIConfiguration extends AnyRef

  123. class MetricsLogger extends AnyRef

    Helper class for logging metrics to log4j.

  124. trait MiniBatchBase extends Transformer with DefaultParamsWritable with Wrappable

  125. class ModelDownloader extends Client

    Class for downloading models from a server to Local or HDFS

  126. class ModelNotFoundException extends FileNotFoundException

    Exception returned if a repo cannot find the file

  127. case class ModelSchema(name: String, dataset: String, modelType: String, uri: URI, hash: String, size: Long, inputNode: Int, numLayers: Int, layerNames: Array[String]) extends Schema with Product with Serializable

    Class representing the schema of a CNTK model

    name

    name of the model architecture

    dataset

    dataset the model was trained on

    modelType

    type of problem the model is suited for, e.g. image, text, sound, sentiment

    uri

    location of the underlying file (local, HDFS, or HTTP)

    hash

    sha256 hash of the underlying file

    size

    size in bytes of the underlying file

    inputNode

    the node which represents the input

    numLayers

    the number of layers of the model

    layerNames

    the names of the nodes that represent layers in the network
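
    The hash and size fields lend themselves to verifying a downloaded file. The sketch below shows one way such a check could look using the JDK's MessageDigest; it assumes the hash is stored as a hex string, which this entry does not state, and HashCheckSketch is a hypothetical helper rather than part of the package.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object HashCheckSketch {
  // Hex-encoded SHA-256 digest of the given bytes.
  def sha256Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256")
      .digest(bytes)
      .map(b => "%02x".format(b & 0xff))
      .mkString

  // True when both the byte count and the digest match the expected values.
  def matches(bytes: Array[Byte], expectedHash: String, expectedSize: Long): Boolean =
    bytes.length.toLong == expectedSize && sha256Hex(bytes).equalsIgnoreCase(expectedHash)

  def main(args: Array[String]): Unit = {
    val payload = "abc".getBytes(StandardCharsets.UTF_8)
    println(sha256Hex(payload))
  }
}
```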

  128. class MultiColumnAdapter extends Estimator[PipelineModel] with Wrappable with ComplexParamsWritable

    The MultiColumnAdapter takes a unary pipeline stage and a list of input/output column pairs, and applies the pipeline stage to each input column after being fit.
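
    The idea of fitting one unary stage per input column and writing each result to its paired output column can be sketched without Spark as follows. Here a "table" is just a map from column name to values and an "estimator" is a function that returns a fitted transform; all names are hypothetical stand-ins, not the package's API.

```scala
object MultiColumnSketch {
  type Table = Map[String, Seq[Double]]
  // "Fitting" takes a column and returns the fitted per-column transform.
  type ColumnEstimator = Seq[Double] => (Seq[Double] => Seq[Double])

  // Example estimator: learns the column mean, then subtracts it.
  val center: ColumnEstimator = { col =>
    val mean = col.sum / col.size
    (xs: Seq[Double]) => xs.map(_ - mean)
  }

  // Fits the estimator separately on every input column and stores the
  // transformed values under the paired output column name.
  def fitAndApply(estimator: ColumnEstimator, pairs: Seq[(String, String)])(table: Table): Table =
    pairs.foldLeft(table) { case (acc, (in, out)) =>
      val model = estimator(table(in))
      acc + (out -> model(table(in)))
    }
}
```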

  129. class MultiNGram extends Transformer with HasInputCol with HasOutputCol with MMLParams

    Extracts several ngrams

  130. class MultiVectorAssembler extends VectorAssembler

  131. class NativeLoader extends Serializable

  132. case class NetworkParams(executorIdToHost: Map[Int, String], defaultListenPort: Int, addr: String, port: Int) extends Product with Serializable

  133. class NullOrdering[T] extends Ordering[T]

  134. class PartitionConsolidator extends Transformer with HTTPParams with HasInputCol with HasOutputCol with ComplexParamsWritable

  135. sealed class PartitionSample extends Transformer with PartitionSampleParams

  136. trait PartitionSampleParams extends MMLParams

  137. case class ProtocolVersionData(protocol: String, major: Int, minor: Int) extends Product with Serializable

  138. class RandomDatasetGenerationConstraints extends HasDatasetGenerationConstraints

    Constraints on generating a dataset where all parameters are randomly generated.

  139. abstract class RandomMMLGenerator[T] extends RandomDataGenerator[T]

    Base abstract class for random generation of data.

    T

    The data to generate.

  140. class RandomRowGenerator extends RandomMMLGenerator[Row]

    Randomly generates a row given the set space of data, column options.

  141. class RandomRowGeneratorCombiner extends RandomMMLGenerator[Row]

    Combines an array of row generators into a single row generator.

  142. class RandomSpace extends ParamSpace

    Represents a generator of parameters with specified distributions added by the HyperparamBuilder.

  143. abstract class RangeHyperParam[T] extends Dist[T]

  144. case class RegressorTrainParams(parallelism: String, numIterations: Int, learningRate: Double, numLeaves: Int, objective: String, alpha: Double, tweedieVariancePower: Double, maxBin: Int, baggingFraction: Double, baggingFreq: Int, baggingSeed: Int, earlyStoppingRound: Int, featureFraction: Double, maxDepth: Int, minSumHessianInLeaf: Double, numMachines: Int) extends TrainParams with Product with Serializable

    Defines the Booster parameters passed to the LightGBM regressor.

  145. class RenameColumn extends Transformer with MMLParams with HasInputCol with HasOutputCol

    RenameColumn takes a dataframe with an input and an output column name and returns a dataframe comprised of the original columns with the input column renamed as the output column name.

  146. class Repartition extends Transformer with MMLParams

    Partitions the dataset into n partitions

  147. case class RequestLineData(method: String, uri: String, protoclVersion: Option[ProtocolVersionData]) extends Product with Serializable

  148. class ResizeImage extends ImageTransformerStage

    Resizes the image. The parameters of the ParameterMap are:

    • "height" - the height of the image
    • "width"
    • "stageName"

    Please refer to OpenCV for more information.

  149. abstract class Schema extends AnyRef

    Abstract representation of a schema for an item that can be held in a repository

  150. class SelectColumns extends Transformer with MMLParams

    SelectColumns takes a dataframe and a list of columns to select as input and returns a dataframe comprised of only those columns listed in the input list.

    The columns to be selected are given as a list of column names.

  151. class SharedSingleton[T] extends Serializable

    Holds a variable shared among all workers that behaves like a local singleton. Useful to use non-serializable objects in Spark closures that maintain state across tasks.

  152. class SharedVariable[T] extends Serializable

    Holds a variable shared among all workers. Useful to use non-serializable objects in Spark closures.

    Note this code has been borrowed from: https://www.nicolaferraro.me/2016/02/22/using-non-serializable-objects-in-apache-spark/
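
    A common way to get this behaviour is to ship a serializable constructor function and build the real object lazily in a transient field, so the object itself is never serialized. The sketch below shows that pattern in isolation; it is an assumption about the general shape of such a holder, and LazyShared is a hypothetical name rather than the class defined in this package.

```scala
import java.security.MessageDigest

// Holds a by-name constructor and builds the value on first access.
// The value is @transient, so only the constructor is serialized and
// each deserialized copy rebuilds the value locally when needed.
class LazyShared[T](constructor: => T) extends Serializable {
  @transient private lazy val instance: T = constructor
  def get: T = instance
}

object LazySharedDemo {
  def main(args: Array[String]): Unit = {
    // MessageDigest is not Serializable, but the holder can still be
    // captured in a closure because only the constructor travels.
    val digest = new LazyShared(MessageDigest.getInstance("SHA-256"))
    println(digest.get.getAlgorithm)
    println(digest.get eq digest.get) // same instance on repeated access
  }
}
```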

  153. class SimpleHTTPTransformer extends Transformer with HTTPParams with HasMiniBatcher with HasInputCol with HasOutputCol with ComplexParamsWritable with HasErrorCol

  154. abstract class SingleFileResolver extends DataWriter

  155. abstract class SingleThreadedHTTPClient extends HTTPClient with SingleThreadedClient

  156. abstract class SingleTypeReducer extends Transformer with TypeConversion

  157. class SingleVectorAssembler extends VectorAssembler

  158. case class StatusLineData(protocolVersion: ProtocolVersionData, statusCode: Int, reasonPhrase: String) extends Product with Serializable

  159. class SummarizeData extends Transformer with SummarizeDataParams

    Compute summary statistics for the dataset. The following statistics are computed:

    • counts
    • basic
    • sample
    • percentiles
    • errorThreshold - error threshold for quantiles

  160. trait SummarizeDataParams extends MMLParams

  161. class TLCConfig extends MMLConfig

  162. abstract class TestBase extends FunSuite with BeforeAndAfterEachTestData with BeforeAndAfterAll

  163. class TextFeaturizer extends Estimator[TextFeaturizerModel] with TextFeaturizerParams with HasInputCol with HasOutputCol

    Featurize text.

  164. class TextFeaturizerModel extends Model[TextFeaturizerModel] with ConstructorWritable[TextFeaturizerModel]

  165. trait TextFeaturizerParams extends Wrappable with DefaultParamsWritable

  166. class TextPreprocessor extends Transformer with HasInputCol with HasOutputCol with Wrappable with ComplexParamsWritable

    TextPreprocessor takes a dataframe and a dictionary that maps (text -> replacement text), scans each cell in the input col and replaces all substring matches with the corresponding value. Priority is given to longer keys and from left to right.
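
    The matching rule (longer keys win, scanning left to right) can be illustrated with a simple string scan. The package lists a Trie class that may be used for this; the sketch below deliberately swaps in a plain sorted-key search instead, so it shows the replacement semantics as read from the description, not the actual implementation. TextReplaceSketch is a hypothetical name.

```scala
object TextReplaceSketch {
  def replaceAll(text: String, replacements: Map[String, String]): String = {
    // Longest keys first, so a longer key beats a shorter one at the same position.
    val keys = replacements.keys.filter(_.nonEmpty).toSeq.sortBy(k => -k.length)
    val out = new StringBuilder
    var i = 0
    // Scan left to right; replaced text is never rescanned.
    while (i < text.length) {
      keys.find(k => text.startsWith(k, i)) match {
        case Some(k) =>
          out.append(replacements(k))
          i += k.length
        case None =>
          out.append(text.charAt(i))
          i += 1
      }
    }
    out.toString
  }
}
```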

  167. class Threshold extends ImageTransformerStage

    Applies a threshold to each element of the image. Please refer to threshold for more information

  168. class TimeIntervalBatcher[T] extends Iterator[List[T]]

  169. class TimeIntervalMiniBatchTransformer extends Transformer with MiniBatchBase

  170. class Timer extends Estimator[TimerModel] with TimerParams with ComplexParamsWritable

  171. class TimerModel extends Model[TimerModel] with TimerParams with ConstructorWritable[TimerModel]

  172. trait TimerParams extends Wrappable

  173. class TrainClassifier extends Estimator[TrainedClassifierModel] with HasLabelCol with ComplexParamsWritable with HasFeaturesCol

    Trains a classification model. Featurizes the given data into a vector of doubles.

    Note the behavior of the reindex and labels parameters. The parameters interact as follows:

    • reindex -> false, labels -> false (Empty): Assume all double values, don't use metadata, assume natural ordering.
    • reindex -> true, labels -> false (Empty): Index, use natural ordering of string indexer.
    • reindex -> false, labels -> true (Specified): Assume user knows indexing, apply label values. Currently only string type supported.
    • reindex -> true, labels -> true (Specified): Validate labels matches column type, try to recast to label type, reindex label column.

  174. abstract class TrainParams extends Serializable

    Permalink

    Defines the common Booster parameters passed to the LightGBM learners.

  175. class TrainRegressor extends Estimator[TrainedRegressorModel] with HasLabelCol with MMLParams with ComplexParamsWritable

    Permalink

    Trains a regression model.

  176. class TrainedClassifierModel extends Model[TrainedClassifierModel] with ConstructorWritable[TrainedClassifierModel]

    Permalink

    Model produced by TrainClassifier.

  177. class TrainedRegressorModel extends Model[TrainedRegressorModel] with ConstructorWritable[TrainedRegressorModel]

    Permalink

    Model produced by TrainRegressor.

  178. class Trie extends Serializable

    Permalink
  179. class TuneHyperparameters extends Estimator[TuneHyperparametersModel] with Wrappable with ComplexParamsWritable with HasEvaluationMetric

    Permalink

    Tunes model hyperparameters.

    Allows the user to specify multiple untrained models to tune using various search strategies. Currently supports cross-validation with random grid search.

    Annotations
    @InternalWrapper()
  180. class TuneHyperparametersModel extends Model[TuneHyperparametersModel] with ConstructorWritable[TuneHyperparametersModel]

    Permalink

    Model produced by TuneHyperparameters.

    Annotations
    @InternalWrapper()
  181. trait TypeConversion extends AnyRef

    Permalink
  182. class UDFTransformer extends Transformer with Wrappable with ComplexParamsWritable with HasInputCol with HasInputCols with HasOutputCol

    Permalink

    UDFTransformer takes an input column, an output column, and a UserDefinedFunction, and returns a DataFrame made up of the original columns plus the output column, which holds the result of applying the UDF to the input column.

    Annotations
    @InternalWrapper()
  183. class UnrollImage extends Transformer with HasInputCol with HasOutputCol with MMLParams

    Permalink

    Converts the representation of an m x n pixel image to an m * n vector of Doubles.

    The input column name is assumed to be "image"; the output column name is "<uid>_output".
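
    The unrolling step amounts to flattening a two-dimensional grid into one vector. The sketch below shows this for a single-channel image stored as one array per row. The row-major element order and the names UnrollSketch and unroll are assumptions of this example; the element order used by UnrollImage itself is not stated here.

```scala
object UnrollSketch {
  // Flatten an m x n image (one Array per row) into a single vector of
  // m * n Doubles, row by row. Row-major order is an assumption of this sketch.
  def unroll(image: Array[Array[Int]]): Array[Double] =
    image.flatMap(row => row.map(_.toDouble))
}
```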

  184. class ValueIndexer extends Estimator[ValueIndexerModel] with ValueIndexerParams

    Permalink

    Fits a dictionary of values from the input column. The model then transforms a column to a categorical column of the given array of values. Similar to StringIndexer, except that it can be used on any value type.
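
    The fit-then-transform idea can be sketched on plain Scala collections. This is a minimal illustration, not the ValueIndexer API: the names ValueIndexSketch, fit, and transform are hypothetical, and both the sorted ordering of the dictionary and the use of None for unseen values are assumptions of this example.

```scala
object ValueIndexSketch {
  // "Fit": collect the distinct values of a column, sorted, as the dictionary.
  def fit[T: Ordering](column: Seq[T]): Vector[T] =
    column.distinct.sorted.toVector

  // "Transform": replace each value by its position in the dictionary.
  // Values that were not seen during fitting map to None.
  def transform[T](levels: Vector[T], column: Seq[T]): Seq[Option[Int]] = {
    val index = levels.zipWithIndex.toMap
    column.map(index.get)
  }
}
```

    Because fit only requires an Ordering, the same sketch works for doubles, integers, or strings, which mirrors the "any value type" point above.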

  185. class ValueIndexerModel extends Model[ValueIndexerModel] with ValueIndexerParams with ComplexParamsWritable

    Permalink

    Model produced by ValueIndexer.

  186. trait ValueIndexerParams extends MMLParams with HasInputCol with HasOutputCol

    Permalink
  187. abstract class VectorAssembler extends AnyRef

    Permalink
  188. trait Wrappable extends Params

    Permalink

Value Members

  1. object AdvancedHTTPHandling extends HandlingUtils with org.apache.spark.internal.Logging

    Permalink
  2. object AssembleFeatures extends DefaultParamsReadable[AssembleFeatures] with Serializable

    Permalink
  3. object AssembleFeaturesModel extends ConstructorReadable[AssembleFeaturesModel] with Serializable

    Permalink
  4. object AsyncUtils

    Permalink
  5. object BasicHTTPHandling extends HandlingUtils

    Permalink
  6. object Benchmark extends Serializable

    Permalink
  7. object BestModel extends ConstructorReadable[BestModel] with Serializable

    Permalink
  8. object Binary

    Permalink

    Implicit conversion that allows the sparkSession.readImages(...) syntax. Example:

        import com.microsoft.ml.spark.Readers.implicits._
        sparkSession.readImages(path, recursive = false)

  9. object BinaryFileReader

    Permalink
  10. object Blur extends Serializable

    Permalink
  11. object CNTKLearner extends DefaultParamsReadable[CNTKLearner] with Serializable

    Permalink
  12. object CNTKModel extends ComplexParamsReadable[CNTKModel] with Serializable

    Permalink
  13. object Cacher extends DefaultParamsReadable[Cacher] with Serializable

    Permalink
  14. object CastUtilities

    Permalink

    Utilities for casting values.

  15. object CheckpointData extends DefaultParamsReadable[CheckpointData] with Serializable

    Permalink

    Caches the dataset to memory, or to memory and disk.

  16. object ClassBalancer extends DefaultParamsReadable[ClassBalancer] with Serializable

    Permalink
  17. object ClassBalancerModel extends ConstructorReadable[ClassBalancerModel] with Serializable

    Permalink
  18. object CleanMissingData extends DefaultParamsReadable[CleanMissingData] with Serializable

    Permalink
  19. object CleanMissingDataModel extends ConstructorReadable[CleanMissingDataModel] with Serializable

    Permalink
  20. object ColorFormat extends Serializable

    Permalink
  21. object ColumnOptions extends Enumeration

    Permalink

    Specifies the column types supported in spark dataframes and modules.

  22. object ComputeModelStatistics extends DefaultParamsReadable[ComputeModelStatistics] with Serializable

    Permalink
  23. object ComputePerInstanceStatistics extends DefaultParamsReadable[ComputePerInstanceStatistics] with Serializable

    Permalink
  24. object ConversionUtils

    Permalink
  25. object CropImage extends Serializable

    Permalink
  26. object CustomInputParser extends ComplexParamsReadable[CustomInputParser] with Serializable

    Permalink
  27. object CustomOutputParser extends ComplexParamsReadable[CustomOutputParser] with Serializable

    Permalink
  28. object DataConversion extends DefaultParamsReadable[DataConversion] with Serializable

    Permalink

    DataConversion object.

  29. object DataOptions extends Enumeration

    Permalink

    Specifies the data types supported in spark dataframes and modules.

  30. object DataTransferUtils

    Permalink

    Utilities for reducing data to CNTK format and generating the text file output to disk.

  31. object DatasetOptions extends Serializable

    Permalink
  32. object DatasetUtils

    Permalink
  33. object DefaultHyperparams

    Permalink

    Provides good default hyperparameter ranges and values for sweeping. Publicly visible to users so they can easily select the parameters for sweeping.

  34. object DropColumns extends DefaultParamsReadable[DropColumns] with Serializable

    Permalink
  35. object DynamicMiniBatchTransformer extends DefaultParamsReadable[DynamicMiniBatchTransformer] with Serializable

    Permalink
  36. object EnsembleByKey extends DefaultParamsReadable[EnsembleByKey] with Serializable

    Permalink
  37. object EntityData extends SparkBindings[EntityData]

    Permalink
  38. object EnvironmentUtils

    Permalink
  39. object ErrorUtils extends Serializable

    Permalink
  40. object EvaluationUtils

    Permalink
  41. object Explode extends DefaultParamsReadable[Explode] with Serializable

    Permalink
  42. object FaultToleranceUtils

    Permalink
  43. object Featurize extends DefaultParamsReadable[Featurize] with Serializable

    Permalink
  44. object FileUtilities

    Permalink
  45. object FindBestModel extends ComplexParamsReadable[FindBestModel] with Serializable

    Permalink
  46. object FixedMiniBatchTransformer extends DefaultParamsReadable[FixedMiniBatchTransformer] with Serializable

    Permalink
  47. object FlattenBatch extends DefaultParamsReadable[FlattenBatch] with Serializable

    Permalink
  48. object Flip extends Serializable

    Permalink
  49. object FluentAPI

    Permalink
  50. object GaussianKernel extends Serializable

    Permalink
  51. object GenerateDataset

    Permalink

    Defines methods to generate a random spark DataFrame dataset based on given options.

  52. object HTTPRequestData extends SparkBindings[HTTPRequestData]

    Permalink
  53. object HTTPResponseData extends SparkBindings[HTTPResponseData]

    Permalink
  54. object HTTPSchema

    Permalink
  55. object HTTPTransformer extends ComplexParamsReadable[HTTPTransformer] with Serializable

    Permalink
  56. object HeaderData extends SparkBindings[HeaderData]

    Permalink
  57. object HyperParamUtils

    Permalink
  58. object Image

    Permalink

    Implicit conversion that allows the sparkSession.readImages(...) syntax. Example:

        import com.microsoft.ml.spark.Readers.implicits._
        sparkSession.readImages(path, recursive = false)

  59. object ImageFeaturizer extends ComplexParamsReadable[ImageFeaturizer] with Serializable

    Permalink
  60. object ImageReader

    Permalink
  61. object ImageSetAugmenter extends DefaultParamsReadable[ImageSetAugmenter] with Serializable

    Permalink
  62. object ImageTransformer extends DefaultParamsReadable[ImageTransformer] with Serializable

    Permalink

    Pipelined image processing.

  63. object ImageWriter

    Permalink
  64. object IndexToValue extends DefaultParamsReadable[IndexToValue] with Serializable

    Permalink
  65. object JSONInputParser extends ComplexParamsReadable[JSONInputParser] with Serializable

    Permalink
  66. object JSONOutputParser extends ComplexParamsReadable[JSONOutputParser] with Serializable

    Permalink
  67. object JarLoadingUtils

    Permalink

    Contains logic for loading classes.

  68. object Lambda extends ComplexParamsReadable[Lambda] with Serializable

    Permalink
  69. object LightGBMClassificationModel extends ConstructorReadable[LightGBMClassificationModel] with Serializable

    Permalink
  70. object LightGBMClassifier extends DefaultParamsReadable[LightGBMClassifier] with Serializable

    Permalink
  71. object LightGBMConstants

    Permalink
  72. object LightGBMRegressionModel extends ConstructorReadable[LightGBMRegressionModel] with Serializable

    Permalink
  73. object LightGBMRegressor extends DefaultParamsReadable[LightGBMRegressor] with Serializable

    Permalink
  74. object LightGBMUtils

    Permalink

    Helper utilities for LightGBM learners

  75. object Logging

    Permalink
  76. object MMLConfig

    Permalink
  77. object MultiColumnAdapter extends ComplexParamsReadable[MultiColumnAdapter] with Serializable

    Permalink
  78. object MultiNGram extends DefaultParamsReadable[MultiNGram] with Serializable

    Permalink
  79. object MultiVectorAssembler

    Permalink
  80. object NullOrdering extends Serializable

    Permalink
  81. object PSConstants

    Permalink

    Constants for PartitionSample.

  82. object PartitionConsolidator extends DefaultParamsReadable[PartitionConsolidator] with Serializable

    Permalink
  83. object PartitionSample extends DefaultParamsReadable[PartitionSample] with Serializable

    Permalink
  84. object PowerBIWriter

    Permalink
  85. object ProcessUtils

    Permalink
  86. object ProtocolVersionData extends SparkBindings[ProtocolVersionData]

    Permalink
  87. object Readers

    Permalink

    Implicit conversion that allows the sparkSession.readImages(...) syntax. Example:

        import com.microsoft.ml.spark.Readers.implicits._
        sparkSession.readImages(path, recursive = false)
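
    For readers unfamiliar with the pattern, the sketch below shows how an implicit class can add a method such as readImages to a type that cannot be modified directly. Everything in it is a hypothetical stand-in: Session is not SparkSession, and the readImages defined here only returns a descriptive string rather than loading any data.

```scala
object ImplicitReaderSketch {
  // Hypothetical stand-in for a session type we cannot modify directly.
  final class Session(val name: String)

  object implicits {
    // The implicit class adds readImages to Session without changing Session itself.
    implicit class ImageReaderOps(session: Session) {
      def readImages(path: String, recursive: Boolean): String =
        s"${session.name}: images from $path (recursive = $recursive)"
    }
  }
}
```

    After importing ImplicitReaderSketch.implicits._, a call such as new Session("local").readImages("/data/images", recursive = false) compiles because the compiler wraps the Session in ImageReaderOps automatically.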

  88. object RenameColumn extends DefaultParamsReadable[RenameColumn] with Serializable

    Permalink
  89. object Repartition extends DefaultParamsReadable[Repartition] with Serializable

    Permalink
  90. object RequestLineData extends SparkBindings[RequestLineData]

    Permalink
  91. object ResizeImage extends Serializable

    Permalink

    Resize object contains the information for resizing; "height" "width" "stageName" = "resize"

  92. object SelectColumns extends DefaultParamsReadable[SelectColumns] with Serializable

    Permalink
  93. object ServingImplicits

    Permalink
  94. object SharedSingleton extends Serializable

    Permalink
  95. object SharedVariable extends Serializable

    Permalink
  96. object SimpleHTTPTransformer extends ComplexParamsReadable[SimpleHTTPTransformer] with Serializable

    Permalink
  97. object SparkSessionFactory

    Permalink
  98. object StatusLineData extends SparkBindings[StatusLineData]

    Permalink
  99. object StreamUtilities

    Permalink
  100. object SummarizeData extends DefaultParamsReadable[SummarizeData] with Serializable

    Permalink
  101. object TestBase extends Serializable

    Permalink
  102. object TextFeaturizer extends DefaultParamsReadable[TextFeaturizer] with Serializable

    Permalink
  103. object TextFeaturizerModel extends ConstructorReadable[TextFeaturizerModel] with Serializable

    Permalink
  104. object TextPreprocessor extends ComplexParamsReadable[TextPreprocessor] with Serializable

    Permalink
  105. object Threshold extends Serializable

    Permalink
  106. object TimeIntervalMiniBatchTransformer extends DefaultParamsReadable[TimeIntervalMiniBatchTransformer] with Serializable

    Permalink
  107. object Timer extends ComplexParamsReadable[Timer] with Serializable

    Permalink
  108. object TimerModel extends ConstructorReadable[TimerModel] with Serializable

    Permalink
  109. object TrainClassifier extends ComplexParamsReadable[TrainClassifier] with Serializable

    Permalink
  110. object TrainRegressor extends ComplexParamsReadable[TrainRegressor] with Serializable

    Permalink
  111. object TrainedClassifierModel extends ConstructorReadable[TrainedClassifierModel] with Serializable

    Permalink
  112. object TrainedRegressorModel extends ConstructorReadable[TrainedRegressorModel] with Serializable

    Permalink
  113. object Trie extends Serializable

    Permalink
  114. object TuneHyperparameters extends ComplexParamsReadable[TuneHyperparameters] with Serializable

    Permalink
  115. object TuneHyperparametersModel extends ConstructorReadable[TuneHyperparametersModel] with Serializable

    Permalink
  116. object TypeMapping

    Permalink
  117. object UDFTransformer extends ComplexParamsReadable[UDFTransformer] with Serializable

    Permalink
  118. object UnrollImage extends DefaultParamsReadable[UnrollImage] with Serializable

    Permalink
  119. object ValueIndexer extends DefaultParamsReadable[ValueIndexer] with Serializable

    Permalink
  120. object ValueIndexerModel extends ComplexParamsReadable[ValueIndexerModel] with Serializable

    Permalink
  121. package codegen

    Permalink
  122. package contracts

    Permalink
  123. package hadoop

    Permalink
  124. package metrics

    Permalink
  125. package schema

    Permalink
  126. object udfs

    Permalink
