DetectAnomalies

class DetectAnomalies.DetectAnomalies(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, url=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • customInterval (object) – Custom interval, used to set a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
  • granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
  • handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
  • maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
  • outputCol (str) – The name of the output column (default: [self.uid]_output)
  • period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
  • sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
  • series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
  • subscriptionKey (object) – the API key to use
  • timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
  • url (str) – URL of the service
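
The ordering requirement on series (ascending timestamps, no duplicates) can be checked locally before any request is sent. The following is a minimal sketch in plain Python, not part of mmlspark: validate_series is a hypothetical helper, and the point layout (a dict with an ISO 8601 "timestamp" string ending in "Z" and a numeric "value") is an assumption that this page does not state.

```python
from datetime import datetime


def validate_series(points):
    """Return a list of problems found in the series (empty if none).

    Hypothetical helper. Assumes each point is a dict with a "timestamp"
    string such as "2019-01-01T00:00:00Z" and a numeric "value".
    """
    problems = []
    previous = None
    for index, point in enumerate(points):
        current = datetime.strptime(point["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        if previous is not None:
            if current == previous:
                problems.append("duplicate timestamp at index %d" % index)
            elif current < previous:
                problems.append("timestamp out of order at index %d" % index)
        previous = current
    return problems
```

An empty result means the points are strictly ascending by timestamp, which is the precondition described for series.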
getConcurrency()
Returns: max number of concurrent calls (default: 1)
Return type: int
getConcurrentTimeout()
Returns: max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
Return type: double
getCustomInterval()
Returns: Custom interval, used to set a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
Return type: object
getErrorCol()
Returns: column to hold HTTP errors (default: [self.uid]_error)
Return type: str
getGranularity()
Returns: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
Return type: object
getHandler()
Returns: Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
Return type: object
static getJavaPackage()

Returns package name String.

getMaxAnomalyRatio()
Returns: Optional argument, advanced model parameter, max anomaly ratio in a time series.
Return type: object
getOutputCol()
Returns: The name of the output column (default: [self.uid]_output)
Return type: str
getPeriod()
Returns: Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
Return type: object
getSensitivity()
Returns: Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
Return type: object
getSeries()
Returns: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
Return type: object
getSubscriptionKey()
Returns: the API key to use
Return type: object
getTimeout()
Returns: number of seconds to wait before closing the connection (default: 60.0)
Return type: double
getUrl()
Returns: URL of the service
Return type: str
classmethod read()

Returns an MLReader instance for this class.

setConcurrency(value)
Parameters: concurrency (int) – max number of concurrent calls (default: 1)
setConcurrentTimeout(value)
Parameters: concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
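
As an analogy only, the relationship between a concurrency limit and a wait-on-futures timeout can be sketched with the Python standard library. This is not how mmlspark implements these parameters; call_all and its arguments are hypothetical names chosen for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def call_all(items, send, concurrency=1, concurrent_timeout=100.0):
    """Apply `send` to every item using at most `concurrency` worker threads.

    Hypothetical helper. Results are returned in the same order as `items`.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send, item) for item in items]
        # Wait up to concurrent_timeout seconds for each result; result()
        # raises concurrent.futures.TimeoutError if a call takes longer.
        return [future.result(timeout=concurrent_timeout) for future in futures]
```

For example, call_all([1, 2, 3], lambda x: x * 2, concurrency=2) runs at most two calls at a time and returns the doubled values in input order.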
setCustomInterval(value)
Parameters: customInterval (object) – Custom interval, used to set a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
setCustomIntervalCol(value)
Parameters: customInterval (object) – Custom interval, used to set a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
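
The 5-minute example in the customInterval description can be written as a small helper that derives both settings from the spacing between points. This is a sketch under assumptions: minutely_params is a hypothetical function, and omitting customInterval for a plain 1-minute series is a choice made here, not something this page specifies.

```python
def minutely_params(step_minutes):
    """Return granularity/customInterval settings for a series whose points
    are `step_minutes` minutes apart.

    Hypothetical helper; leaves out customInterval when the spacing is
    exactly one minute (an assumption of this sketch).
    """
    if not isinstance(step_minutes, int) or step_minutes < 1:
        raise ValueError("step_minutes must be a positive integer")
    params = {"granularity": "minutely"}
    if step_minutes != 1:
        params["customInterval"] = step_minutes
    return params
```

The returned keys match the granularity and customInterval keyword names accepted by the constructor and setParams.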
setErrorCol(value)
Parameters: errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
setGranularity(value)
Parameters: granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
setGranularityCol(value)
Parameters: granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
setHandler(value)
Parameters: handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
setMaxAnomalyRatio(value)
Parameters: maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
setMaxAnomalyRatioCol(value)
Parameters: maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
setOutputCol(value)
Parameters: outputCol (str) – The name of the output column (default: [self.uid]_output)
setParams(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, url=None)

Set the (keyword only) parameters

Parameters:
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • customInterval (object) – Custom interval, used to set a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
  • granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
  • handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
  • maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
  • outputCol (str) – The name of the output column (default: [self.uid]_output)
  • period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
  • sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
  • series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
  • subscriptionKey (object) – the API key to use
  • timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
  • url (str) – URL of the service
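
Two of the constraints listed above, the allowed granularity values and the sensitivity range, are easy to check before values are handed to setParams. The sketch below is plain Python and independent of mmlspark; check_params is a hypothetical helper, and treating the 0 to 99 range as inclusive at both ends is an assumption.

```python
ALLOWED_GRANULARITIES = ("yearly", "monthly", "weekly", "daily", "hourly", "minutely")


def check_params(granularity, sensitivity=None):
    """Return a list of error strings for the given settings (empty if valid).

    Hypothetical helper. Assumes sensitivity, when given, must satisfy
    0 <= sensitivity <= 99 (inclusive bounds are an assumption).
    """
    errors = []
    if granularity not in ALLOWED_GRANULARITIES:
        errors.append("granularity must be one of: " + ", ".join(ALLOWED_GRANULARITIES))
    if sensitivity is not None and not 0 <= sensitivity <= 99:
        errors.append("sensitivity must be between 0 and 99")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every invalid setting at once.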
setPeriod(value)
Parameters: period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
setPeriodCol(value)
Parameters: period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
setSensitivity(value)
Parameters: sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
setSensitivityCol(value)
Parameters: sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
setSeries(value)
Parameters: series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
setSeriesCol(value)
Parameters: series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
setSubscriptionKey(value)
Parameters: subscriptionKey (object) – the API key to use
setSubscriptionKeyCol(value)
Parameters: subscriptionKey (object) – the API key to use
setTimeout(value)
Parameters: timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
setUrl(value)
Parameters: url (str) – URL of the service