SpeechToText

class SpeechToText.SpeechToText(audioData=None, concurrency=1, concurrentTimeout=100.0, errorCol=None, format=None, handler=None, language=None, outputCol=None, profanity=None, subscriptionKey=None, timeout=60.0, url=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:

audioData (object) – The data sent to the service; must be a .wav file
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column (default: [self.uid]_output)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
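
As a rough illustration of how the keyword arguments above fit together, the sketch below collects them in a plain dict and checks the two parameters that have a documented set of accepted values (format and profanity). The helper names, the placeholder key, URL and language tag, and the import path inside make_transformer are assumptions made for illustration; they are not part of the documented API.

```python
# Accepted values as listed in the parameter descriptions above.
ACCEPTED_FORMATS = {"simple", "detailed"}
ACCEPTED_PROFANITY = {"masked", "removed", "raw"}


def build_speech_to_text_kwargs(subscription_key, url, language,
                                format="simple", profanity="masked",
                                output_col=None, error_col=None,
                                concurrency=1, timeout=60.0):
    """Hypothetical helper: validate values and return constructor kwargs."""
    if format not in ACCEPTED_FORMATS:
        raise ValueError("format must be one of %s" % sorted(ACCEPTED_FORMATS))
    if profanity not in ACCEPTED_PROFANITY:
        raise ValueError("profanity must be one of %s" % sorted(ACCEPTED_PROFANITY))
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    kwargs = {
        "subscriptionKey": subscription_key,
        "url": url,
        "language": language,
        "format": format,
        "profanity": profanity,
        "concurrency": concurrency,
        "timeout": timeout,
    }
    # Leave outputCol/errorCol out unless given, so the documented
    # [self.uid]_output and [self.uid]_error defaults apply.
    if output_col is not None:
        kwargs["outputCol"] = output_col
    if error_col is not None:
        kwargs["errorCol"] = error_col
    return kwargs


def make_transformer(kwargs):
    """Build the transformer; needs Spark and mmlspark to be installed."""
    # Assumption: the class is importable from this module path.
    from mmlspark.SpeechToText import SpeechToText
    return SpeechToText(**kwargs)


# Placeholder credentials and endpoint, for illustration only.
example = build_speech_to_text_kwargs("<your-key>", "<service-url>", "en-US")
print(sorted(example))
```

Validating locally is a design choice of this sketch: it surfaces a misspelled format or profanity value before any request is made, rather than as an HTTP error in the error column.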

getAudioData()

Returns:	The data sent to the service; must be a .wav file
Return type:	object
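
Because the audio data must be a .wav file, it can be worth checking the bytes before handing them to the transformer. The sketch below uses only the Python standard library: it synthesizes a short silent clip so the example is self-contained, then checks for the RIFF/WAVE container header. The helper names make_silent_wav and looks_like_wav are hypothetical, and how the bytes are ultimately supplied as audioData is not shown here.

```python
import io
import wave


def make_silent_wav(seconds=0.1, sample_rate=16000):
    """Return the bytes of a mono 16-bit PCM WAV clip containing silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)          # mono
        writer.setsampwidth(2)          # 2 bytes per sample = 16-bit
        writer.setframerate(sample_rate)
        writer.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()


def looks_like_wav(data):
    """Check the RIFF container header: b'RIFF', 4 size bytes, b'WAVE'."""
    return len(data) >= 12 and data[0:4] == b"RIFF" and data[8:12] == b"WAVE"


audio_bytes = make_silent_wav()
print(looks_like_wav(audio_bytes))        # True
print(looks_like_wav(b"not audio data"))  # False
```

The header check only confirms the container type; it does not verify that the sample rate or encoding is one the service accepts.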

getConcurrency()

Returns:	max number of concurrent calls (default: 1)
Return type:	int

getConcurrentTimeout()

Returns:	max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
Return type:	double
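
To make the interplay of concurrency and concurrentTimeout more concrete, the following sketch mimics the described behavior with Python's standard concurrent.futures module: at most `concurrency` calls run at once, and each future is awaited for at most `concurrent_timeout` seconds. It illustrates the semantics only and is not the transformer's actual implementation; fake_service_call and run_calls are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def fake_service_call(delay):
    """Stand-in for one HTTP request: sleeps, then returns its delay."""
    time.sleep(delay)
    return delay


def run_calls(delays, concurrency=1, concurrent_timeout=100.0):
    """Run calls with bounded parallelism; None marks a timed-out future."""
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fake_service_call, d) for d in delays]
        for future in futures:
            try:
                results.append(future.result(timeout=concurrent_timeout))
            except TimeoutError:
                results.append(None)
    return results


print(run_calls([0.01, 0.02, 0.01], concurrency=2, concurrent_timeout=5.0))
# [0.01, 0.02, 0.01]
```

In this toy model, raising concurrency shortens total wall-clock time for many slow calls, while concurrent_timeout bounds how long any single pending result is waited on.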

getErrorCol()

Returns:	column to hold http errors (default: [self.uid]_error)
Return type:	str

getFormat()

Returns:	Specifies the result format. Accepted values are simple and detailed. Default is simple.
Return type:	object

getHandler()

Returns:	Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
Return type:	object

static getJavaPackage(): Returns the package name as a string.

getLanguage()

Returns:	Identifies the spoken language that is being recognized.
Return type:	object

getOutputCol()

Returns:	The name of the output column (default: [self.uid]_output)
Return type:	str

getProfanity()

Returns:	Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
Return type:	object
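
The three profanity settings can be pictured with a toy model. The sketch below is an illustration only: the word list is hypothetical, the real service decides what counts as profanity, and the exact shape of its masked output (for example, how many asterisks it emits) is not specified here.

```python
import re

# Hypothetical word list for illustration; the service uses its own.
FLAGGED_WORDS = {"darn"}

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in sorted(FLAGGED_WORDS)) + r")\b",
    re.IGNORECASE,
)


def apply_profanity_mode(text, mode="masked"):
    """Toy model of the 'masked', 'removed', and 'raw' settings."""
    if mode == "raw":
        # Leave the text untouched.
        return text
    if mode == "masked":
        # Replace each flagged word with asterisks of the same length.
        return _PATTERN.sub(lambda m: "*" * len(m.group(0)), text)
    if mode == "removed":
        # Drop the word, then collapse the doubled space it leaves behind.
        return re.sub(r"\s{2,}", " ", _PATTERN.sub("", text)).strip()
    raise ValueError("mode must be 'masked', 'removed', or 'raw'")


for mode in ("masked", "removed", "raw"):
    print(mode, "->", apply_profanity_mode("well darn it", mode))
# masked -> well **** it
# removed -> well it
# raw -> well darn it
```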

getSubscriptionKey()

Returns:	the API key to use
Return type:	object

getTimeout()

Returns:	number of seconds to wait before closing the connection (default: 60.0)
Return type:	double

getUrl()

Returns:	URL of the service
Return type:	str

classmethod read(): Returns an MLReader instance for this class.

setAudioData(value)

Parameters:	audioData (object) – The data sent to the service; must be a .wav file

setAudioDataCol(value)

Parameters:	audioData (object) – The data sent to the service; must be a .wav file

setConcurrency(value)

Parameters:	concurrency (int) – max number of concurrent calls (default: 1)

setConcurrentTimeout(value)

Parameters:	concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)

setErrorCol(value)

Parameters:	errorCol (str) – column to hold http errors (default: [self.uid]_error)

setFormat(value)

Parameters:	format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.

setFormatCol(value)

Parameters:	format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.

setHandler(value)

Parameters:	handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))

setLanguage(value)

Parameters:	language (object) – Identifies the spoken language that is being recognized.

setLanguageCol(value)

Parameters:	language (object) – Identifies the spoken language that is being recognized.

setOutputCol(value)

Parameters:	outputCol (str) – The name of the output column (default: [self.uid]_output)

setParams(audioData=None, concurrency=1, concurrentTimeout=100.0, errorCol=None, format=None, handler=None, language=None, outputCol=None, profanity=None, subscriptionKey=None, timeout=60.0, url=None)

Sets the (keyword-only) parameters.

Parameters:

audioData (object) – The data sent to the service; must be a .wav file
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column (default: [self.uid]_output)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service

setProfanity(value)

Parameters:	profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.

setProfanityCol(value)

Parameters:	profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.

setSubscriptionKey(value)

Parameters:	subscriptionKey (object) – the API key to use

setSubscriptionKeyCol(value)

Parameters:	subscriptionKey (object) – the API key to use

setTimeout(value)

Parameters:	timeout (double) – number of seconds to wait before closing the connection (default: 60.0)

setUrl(value)

Parameters:	url (str) – URL of the service