SpeechToText

class SpeechToText.SpeechToText(audioData=None, concurrency=1, concurrentTimeout=100.0, errorCol=None, format=None, handler=None, language=None, outputCol=None, profanity=None, subscriptionKey=None, timeout=60.0, url=None)[source]

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:
  • audioData (object) – The audio data sent to the service; must be a .wav file
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
  • format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
  • handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
  • language (object) – Identifies the spoken language that is being recognized.
  • outputCol (str) – The name of the output column (default: [self.uid]_output)
  • profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
  • subscriptionKey (object) – the API key to use
  • timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
  • url (str) – URL of the service
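The accepted values listed above can be checked before a transformer is configured. The following is a minimal sketch and not part of the library: validate_settings and its argument names are hypothetical, the accepted values for format and profanity come from the parameter descriptions, and the lower bounds on the numeric settings are assumptions.

```python
# Accepted values as listed in the parameter descriptions.
ACCEPTED_FORMATS = ("simple", "detailed")
ACCEPTED_PROFANITY = ("masked", "removed", "raw")


def validate_settings(result_format="simple", profanity="masked",
                      concurrency=1, concurrent_timeout=100.0, timeout=60.0):
    """Hypothetical helper: validate values and return them keyed by the
    documented parameter names."""
    if result_format not in ACCEPTED_FORMATS:
        raise ValueError(f"format must be one of {ACCEPTED_FORMATS}")
    if profanity not in ACCEPTED_PROFANITY:
        raise ValueError(f"profanity must be one of {ACCEPTED_PROFANITY}")
    # Assumption: the numeric settings are only meaningful when positive.
    if concurrency < 1 or concurrent_timeout <= 0 or timeout <= 0:
        raise ValueError("concurrency and timeouts must be positive")
    return {
        "format": result_format,
        "profanity": profanity,
        "concurrency": concurrency,
        "concurrentTimeout": concurrent_timeout,
        "timeout": timeout,
    }


settings = validate_settings(result_format="detailed", profanity="raw")
```

Checking values up front is a design choice for this sketch; whether the transformer itself rejects other values is not stated on this page.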
getAudioData()[source]
Returns:The audio data sent to the service; must be a .wav file
Return type:object
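Since the audio data must be a .wav file, it can help to see how such bytes might be produced for testing. This is a minimal sketch using only the Python standard library; make_silent_wav is a hypothetical helper, and the mono, 16-bit, 16 kHz layout is an assumption, since this page only states that the data must be .wav.

```python
import io
import wave


def make_silent_wav(seconds=1, sample_rate=16000):
    """Return the bytes of a mono, 16-bit PCM .wav file containing silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)           # mono
        wav_file.setsampwidth(2)           # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * (sample_rate * seconds))
    return buffer.getvalue()


wav_bytes = make_silent_wav()
```

Bytes like these could be placed in a binary column of a DataFrame and referenced as the audio data; how the column is wired up depends on the setters described on this page.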
getConcurrency()[source]
Returns:max number of concurrent calls (default: 1)
Return type:int
getConcurrentTimeout()[source]
Returns:max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
Return type:double
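The relationship between concurrency and concurrentTimeout can be illustrated with a plain Python analogy. This sketch does not show how the transformer is implemented, which this page does not describe; call_service and run_calls are hypothetical stand-ins that only mirror the idea of a cap on simultaneous calls and a bounded wait on each future.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_service(item):
    """Stand-in for one HTTP request; a real call would contact the service."""
    time.sleep(0.01)
    return item.upper()


def run_calls(items, concurrency=1, concurrent_timeout=100.0):
    """Run at most `concurrency` calls at once and wait up to
    `concurrent_timeout` seconds for each result."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call_service, item) for item in items]
        # result() raises concurrent.futures.TimeoutError if the wait expires.
        return [future.result(timeout=concurrent_timeout) for future in futures]


results = run_calls(["a", "b", "c"], concurrency=2)
```

Raising concurrency lets more requests be in flight at once, while concurrentTimeout bounds how long a caller waits on any one of them.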
getErrorCol()[source]
Returns:column to hold HTTP errors (default: [self.uid]_error)
Return type:str
getFormat()[source]
Returns:Specifies the result format. Accepted values are simple and detailed. Default is simple.
Return type:object
getHandler()[source]
Returns:Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
Return type:object
static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:Identifies the spoken language that is being recognized.
Return type:object
getOutputCol()[source]
Returns:The name of the output column (default: [self.uid]_output)
Return type:str
getProfanity()[source]
Returns:Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
Return type:object
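The three profanity settings can be illustrated locally. This is only a sketch of the behaviour described above: the filtering is performed by the service, not by client code, and BLOCKED_WORDS, apply_profanity_mode, and the choice of one asterisk per character are assumptions made for illustration.

```python
import re

# Hypothetical word list, for illustration only.
BLOCKED_WORDS = ("darn",)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in BLOCKED_WORDS) + r")\b",
    re.IGNORECASE,
)


def apply_profanity_mode(text, mode="masked"):
    """Mimic the documented masked / removed / raw settings on a string."""
    if mode == "raw":
        return text  # leave the text unchanged
    if mode == "masked":
        # Replace each matched word with asterisks of the same length.
        return _PATTERN.sub(lambda match: "*" * len(match.group(0)), text)
    if mode == "removed":
        # Drop the word, then collapse the doubled space it leaves behind.
        return re.sub(r"\s{2,}", " ", _PATTERN.sub("", text)).strip()
    raise ValueError("mode must be masked, removed, or raw")


masked = apply_profanity_mode("that darn cat", "masked")
removed = apply_profanity_mode("that darn cat", "removed")
raw = apply_profanity_mode("that darn cat", "raw")
```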
getSubscriptionKey()[source]
Returns:the API key to use
Return type:object
getTimeout()[source]
Returns:number of seconds to wait before closing the connection (default: 60.0)
Return type:double
getUrl()[source]
Returns:URL of the service
Return type:str
classmethod read()[source]

Returns an MLReader instance for this class.

setAudioData(value)[source]
Parameters:audioData (object) – The audio data sent to the service; must be a .wav file
setAudioDataCol(value)[source]
Parameters:audioData (object) – The audio data sent to the service; must be a .wav file
setConcurrency(value)[source]
Parameters:concurrency (int) – max number of concurrent calls (default: 1)
setConcurrentTimeout(value)[source]
Parameters:concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
setErrorCol(value)[source]
Parameters:errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
setFormat(value)[source]
Parameters:format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
setFormatCol(value)[source]
Parameters:format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
setHandler(value)[source]
Parameters:handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
setLanguage(value)[source]
Parameters:language (object) – Identifies the spoken language that is being recognized.
setLanguageCol(value)[source]
Parameters:language (object) – Identifies the spoken language that is being recognized.
setOutputCol(value)[source]
Parameters:outputCol (str) – The name of the output column (default: [self.uid]_output)
setParams(audioData=None, concurrency=1, concurrentTimeout=100.0, errorCol=None, format=None, handler=None, language=None, outputCol=None, profanity=None, subscriptionKey=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

Parameters:
  • audioData (object) – The audio data sent to the service; must be a .wav file
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
  • format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
  • handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
  • language (object) – Identifies the spoken language that is being recognized.
  • outputCol (str) – The name of the output column (default: [self.uid]_output)
  • profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
  • subscriptionKey (object) – the API key to use
  • timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
  • url (str) – URL of the service
setProfanity(value)[source]
Parameters:profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
setProfanityCol(value)[source]
Parameters:profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
setSubscriptionKey(value)[source]
Parameters:subscriptionKey (object) – the API key to use
setSubscriptionKeyCol(value)[source]
Parameters:subscriptionKey (object) – the API key to use
setTimeout(value)[source]
Parameters:timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
setUrl(value)[source]
Parameters:url (str) – URL of the service
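The setters on this page follow a regular set<Name> pattern, so a dictionary of settings can be applied in a loop. This is a minimal sketch under stated assumptions: apply_settings and build_speech_to_text are hypothetical helpers, the import path mmlspark.SpeechToText is inferred from the class name and base classes shown above, and constructing the transformer requires mmlspark and a running Spark session.

```python
def apply_settings(transformer, settings):
    """Call the set<Name> method for each entry in `settings`.

    `settings` maps parameter names from this page (for example "url" or
    "outputCol") to the values to set.
    """
    for name, value in settings.items():
        setter = getattr(transformer, "set" + name[0].upper() + name[1:])
        setter(value)
    return transformer


def build_speech_to_text(settings):
    """Create a SpeechToText transformer and apply `settings` to it."""
    # Assumed import path; needs mmlspark installed and Spark available.
    from mmlspark.SpeechToText import SpeechToText
    return apply_settings(SpeechToText(), settings)
```

A call such as build_speech_to_text({"url": service_url, "outputCol": "text"}) would then invoke setUrl and setOutputCol in turn, where service_url is a placeholder for the endpoint of your own service.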