SimpleHTTPTransformer

class SimpleHTTPTransformer.SimpleHTTPTransformer(backoffTiming='[I@4c0d1df8', concurrency=1, concurrentTimeout=100.0, errorCol=None, flattenOutputBatches=False, handlingStrategy='advanced', inputCol=None, inputParser=None, miniBatcher=None, outputCol=None, outputParser=None)[source]

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:
  • backoffTiming (object) – times to use in backoffs (default: a JVM int array, rendered here only as its object reference [I@4c0d1df8)
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_errors)
  • flattenOutputBatches (bool) – whether to flatten the output batches (default: false)
  • handlingStrategy (str) – Which strategy to use when handling requests (default: advanced)
  • inputCol (str) – The name of the input column
  • inputParser (object) – format to parse the column to (default: JSONInputParser_e0390bd764aa)
  • miniBatcher (object) – Minibatcher to use
  • outputCol (str) – The name of the output column
  • outputParser (object) – format to parse the column to
getBackoffTiming()[source]
Returns: times to use in backoffs (default: a JVM int array, rendered here only as its object reference [I@4c0d1df8)
Return type: object
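
To make the role of backoffTiming more concrete, here is a minimal plain-Python sketch of retrying a call with a fixed list of backoff times. It is not the library's implementation: the helper name call_with_backoff, the reading of the values as milliseconds, and the sample timings [100, 500, 1000] are all assumptions made for illustration.

```python
import time


def call_with_backoff(send, backoff_ms, sleep=time.sleep):
    """Call send() and retry after each wait in backoff_ms until it succeeds.

    Hypothetical helper: `send` returns an (ok, value) pair. The waits are
    assumed to be milliseconds, which the reference above does not state.
    """
    ok, value = send()
    for wait_ms in backoff_ms:
        if ok:
            break
        sleep(wait_ms / 1000.0)  # wait before the next retry
        ok, value = send()
    return ok, value


# Example: a fake endpoint that fails twice, then succeeds.
attempts = []


def flaky_send():
    attempts.append(1)
    succeeded = len(attempts) >= 3
    return succeeded, ("response" if succeeded else None)


waits = []  # record the requested waits instead of actually sleeping
result = call_with_backoff(flaky_send, [100, 500, 1000], sleep=waits.append)
```

Passing a recording function as sleep keeps the example fast. With the default time.sleep the helper would really pause between attempts.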
getConcurrency()[source]
Returns: max number of concurrent calls (default: 1)
Return type: int
getConcurrentTimeout()[source]
Returns: max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
Return type: double
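
The interaction between concurrency and concurrentTimeout can be pictured with Python's standard concurrent.futures module. This is only a sketch of the general idea (a bounded number of in-flight calls, each awaited with a timeout), not the transformer's own code. The helper name run_concurrently is hypothetical, and whether the library applies the timeout per future or to a whole batch is not stated in this reference.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(handler, requests, concurrency=1, timeout_s=100.0):
    """Apply handler to each request with at most `concurrency` calls in flight.

    Hypothetical helper: each result is awaited for at most timeout_s seconds.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(handler, r) for r in requests]
        # Future.result raises concurrent.futures.TimeoutError when the
        # result is not available within the timeout.
        return [f.result(timeout=timeout_s) for f in futures]


# A stand-in "request handler" that just upper-cases its input.
responses = run_concurrently(lambda r: r.upper(), ["a", "b", "c"], concurrency=2)
```

Results come back in submission order because the futures are awaited in the order they were created, regardless of which call finishes first.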
getErrorCol()[source]
Returns: column to hold HTTP errors (default: [self.uid]_errors)
Return type: str
getFlattenOutputBatches()[source]
Returns: whether to flatten the output batches (default: false)
Return type: bool
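
As a rough picture of what flattening output batches means, the sketch below turns a list of batches into a single flat list of rows. It assumes that flattening yields one output element per batched element. The transformer operates on Spark DataFrames rather than Python lists, and flatten_batches is a hypothetical name.

```python
from itertools import chain


def flatten_batches(batches):
    """Turn a list of batches (lists of rows) into one flat list of rows."""
    return list(chain.from_iterable(batches))


batched = [["r1", "r2"], ["r3"], []]  # the last batch is empty
flat = flatten_batches(batched)
```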
getHandlingStrategy()[source]
Returns: Which strategy to use when handling requests (default: advanced)
Return type: str
getInputCol()[source]
Returns: The name of the input column
Return type: str
getInputParser()[source]
Returns: format to parse the column to (default: JSONInputParser_e0390bd764aa)
Return type: object
static getJavaPackage()[source]

Returns the package name as a string.

getMiniBatcher()[source]
Returns: Minibatcher to use
Return type: object
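
A minibatcher groups individual rows into batches before they are sent. The sketch below shows one possible policy, fixed-size batching, in plain Python. The function name fixed_size_batches is hypothetical, and which batching strategies the library actually provides is not covered by this reference.

```python
def fixed_size_batches(rows, batch_size):
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


batches = list(fixed_size_batches(range(5), 2))
```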
getOutputCol()[source]
Returns: The name of the output column
Return type: str
getOutputParser()[source]
Returns: format to parse the column to
Return type: object
classmethod read()[source]

Returns an MLReader instance for this class.

setBackoffTiming(value)[source]
Parameters: value (object) – times to use in backoffs (default: a JVM int array, rendered here only as its object reference [I@4c0d1df8)
setConcurrency(value)[source]
Parameters: value (int) – max number of concurrent calls (default: 1)
setConcurrentTimeout(value)[source]
Parameters: value (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
setErrorCol(value)[source]
Parameters: value (str) – column to hold HTTP errors (default: [self.uid]_errors)
setFlattenOutputBatches(value)[source]
Parameters: value (bool) – whether to flatten the output batches (default: false)
setHandlingStrategy(value)[source]
Parameters: value (str) – Which strategy to use when handling requests (default: advanced)
setInputCol(value)[source]
Parameters: value (str) – The name of the input column
setInputParser(value)[source]
Parameters: value (object) – format to parse the column to (default: JSONInputParser_e0390bd764aa)
setMiniBatcher(value)[source]
Parameters: value (object) – Minibatcher to use
setOutputCol(value)[source]
Parameters: value (str) – The name of the output column
setOutputParser(value)[source]
Parameters: value (object) – format to parse the column to
setParams(backoffTiming='[I@4c0d1df8', concurrency=1, concurrentTimeout=100.0, errorCol=None, flattenOutputBatches=False, handlingStrategy='advanced', inputCol=None, inputParser=None, miniBatcher=None, outputCol=None, outputParser=None)[source]

Sets the parameters, which must be passed as keyword arguments.

Parameters:
  • backoffTiming (object) – times to use in backoffs (default: a JVM int array, rendered here only as its object reference [I@4c0d1df8)
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_errors)
  • flattenOutputBatches (bool) – whether to flatten the output batches (default: false)
  • handlingStrategy (str) – Which strategy to use when handling requests (default: advanced)
  • inputCol (str) – The name of the input column
  • inputParser (object) – format to parse the column to (default: JSONInputParser_e0390bd764aa)
  • miniBatcher (object) – Minibatcher to use
  • outputCol (str) – The name of the output column
  • outputParser (object) – format to parse the column to
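
The keyword-only convention of setParams can be illustrated with a small stand-in class. This sketch uses Python's bare `*` marker to reject positional arguments. The real class may enforce the rule differently (for example with a decorator), and ParamHolder, with its reduced two-parameter signature, is purely hypothetical.

```python
class ParamHolder:
    """Hypothetical stand-in for a transformer with keyword-only params."""

    def __init__(self):
        self.params = {"concurrency": 1, "handlingStrategy": "advanced"}

    def setParams(self, *, concurrency=None, handlingStrategy=None):
        # Only overwrite the parameters the caller actually passed.
        for name, value in (("concurrency", concurrency),
                            ("handlingStrategy", handlingStrategy)):
            if value is not None:
                self.params[name] = value
        return self  # returning self allows chained calls


holder = ParamHolder().setParams(concurrency=4)

try:
    ParamHolder().setParams(4)  # positional arguments are rejected
    positional_ok = True
except TypeError:
    positional_ok = False
```

Parameters that are not passed keep their previous values, so a call such as setParams(concurrency=4) leaves handlingStrategy untouched in this sketch.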