RecognizeText
-
class RecognizeText.RecognizeText(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=100.0, errorCol=None, imageBytes=None, imageUrl=None, maxPollingRetries=1000, mode=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)
Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
Parameters:
- backoffs (list) – array of backoffs to use in the handler (default: [100, 500, 1000])
- concurrency (int) – max number of concurrent calls (default: 1)
- concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
- imageBytes (object) – bytestream of the image to use
- imageUrl (object) – the URL of the image to use
- maxPollingRetries (int) – number of times to poll (default: 1000)
- mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- outputCol (str) – The name of the output column (default: [self.uid]_output)
- subscriptionKey (object) – the API key to use
- timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
- url (str) – URL of the service
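As a rough illustration of how the parameters above fit together, the sketch below collects them into keyword arguments and hands them to the constructor. The helper names `recognize_text_kwargs` and `build_recognize_text`, the import path `mmlspark.RecognizeText`, the column names, and the placeholder endpoint are assumptions made for this example, not part of the documented API, and the sketch has not been checked against a live installation.

```python
def recognize_text_kwargs(subscription_key, url, mode="Printed",
                          output_col="recognized", error_col="errors"):
    """Collect constructor keyword arguments using the documented parameter names."""
    # The documentation names exactly two modes.
    if mode not in ("Printed", "Handwritten"):
        raise ValueError("mode must be 'Printed' or 'Handwritten'")
    return {
        "subscriptionKey": subscription_key,
        "url": url,
        "mode": mode,
        "outputCol": output_col,      # hypothetical column name
        "errorCol": error_col,        # hypothetical column name
        "backoffs": [100, 500, 1000],  # documented default
        "concurrency": 1,             # documented default
        "timeout": 60.0,              # documented default
    }


def build_recognize_text(**kwargs):
    """Create the transformer; requires Spark and mmlspark to be installed."""
    # Assumption: the class lives in a module named after itself.
    from mmlspark.RecognizeText import RecognizeText
    return RecognizeText(**kwargs)
```

A call such as `build_recognize_text(**recognize_text_kwargs(key, endpoint))` would then yield a transformer whose settings can still be adjusted through the setters listed below.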
-
getBackoffs()
Returns: array of backoffs to use in the handler (default: [100, 500, 1000])
Return type: list
-
getConcurrentTimeout()
Returns: max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
Return type: double
-
getErrorCol()
Returns: column to hold HTTP errors (default: [self.uid]_error)
Return type: str
-
getMode()
Returns: If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed.
Return type: object
-
getOutputCol()
Returns: The name of the output column (default: [self.uid]_output)
Return type: str
-
getTimeout()
Returns: number of seconds to wait before closing the connection (default: 60.0)
Return type: double
-
setBackoffs(value)
Parameters: backoffs (list) – array of backoffs to use in the handler (default: [100, 500, 1000])
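To make the idea of a backoff array concrete, here is a minimal plain-Python sketch of a retry loop driven by such a list. It assumes the values are delays in milliseconds and that one retry is made per entry; the documentation above does not state either, and the function name `call_with_backoffs` is hypothetical. It is an illustration of the general technique, not the library's handler.

```python
import time


def call_with_backoffs(request, backoffs_ms=(100, 500, 1000), sleep=time.sleep):
    """Call `request` once, then retry after each delay in `backoffs_ms`."""
    last_error = None
    # A leading 0 means the first attempt happens immediately.
    for delay_ms in (0, *backoffs_ms):
        if delay_ms:
            sleep(delay_ms / 1000.0)  # assumed unit: milliseconds
        try:
            return request()
        except IOError as err:
            # Remember the failure and fall through to the next delay.
            last_error = err
    # Every attempt failed: surface the most recent error.
    raise last_error
```

With the default list this makes at most four attempts: one immediate call and one after each of the three delays.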
-
setConcurrency(value)
Parameters: concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout(value)
Parameters: concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
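The interplay between a concurrency limit and a per-future timeout can be sketched with Python's standard library. This is only an analogy for how `concurrency` and `concurrentTimeout` might relate; the function name `run_concurrently` is hypothetical and the code makes no claim about how the transformer is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(tasks, concurrency=1, concurrent_timeout=100.0):
    """Run zero-argument callables with at most `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Each result() waits at most `concurrent_timeout` seconds and
        # raises concurrent.futures.TimeoutError if the wait runs out.
        return [future.result(timeout=concurrent_timeout) for future in futures]
```

Results come back in submission order, regardless of which call finishes first.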
-
setErrorCol(value)
Parameters: errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
-
setMaxPollingRetries(value)
Parameters: maxPollingRetries (int) – number of times to poll (default: 1000)
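A polling limit like `maxPollingRetries` bounds how often a pending result is checked before giving up. The following plain-Python sketch shows one way such a bounded loop can look; `poll_until_done`, `check_status`, and `delay_s` are hypothetical names introduced for illustration, and returning `None` on exhaustion is an assumption rather than documented behavior.

```python
import time


def poll_until_done(check_status, max_polling_retries=1000, delay_s=0.0,
                    sleep=time.sleep):
    """Call `check_status` until it returns a non-None result or retries run out."""
    for _ in range(max_polling_retries):
        result = check_status()
        if result is not None:
            return result  # the operation has finished
        sleep(delay_s)     # wait before checking again
    return None            # gave up after max_polling_retries checks
```

A lower limit fails faster on stuck requests, while a higher one tolerates slower recognition jobs.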
-
setMode(value)
Parameters: mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed.
-
setModeCol(value)
Parameters: mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed.
-
setOutputCol(value)
Parameters: outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=100.0, errorCol=None, imageBytes=None, imageUrl=None, maxPollingRetries=1000, mode=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)
Set the (keyword-only) parameters.
Parameters:
- backoffs (list) – array of backoffs to use in the handler (default: [100, 500, 1000])
- concurrency (int) – max number of concurrent calls (default: 1)
- concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
- imageBytes (object) – bytestream of the image to use
- imageUrl (object) – the URL of the image to use
- maxPollingRetries (int) – number of times to poll (default: 1000)
- mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- outputCol (str) – The name of the output column (default: [self.uid]_output)
- subscriptionKey (object) – the API key to use
- timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
- url (str) – URL of the service