KeyPhraseExtractor

class KeyPhraseExtractor.KeyPhraseExtractor(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)

Bases: mmlspark.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters:
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
  • handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
  • language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
  • outputCol (str) – The name of the output column (default: [self.uid]_output)
  • subscriptionKey (object) – the API key to use
  • text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
  • timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
  • url (str) – URL of the service
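As a rough illustration of how the parameters above fit together, the sketch below collects the plain-typed constructor arguments (url, outputCol, errorCol, concurrency, timeout) and passes them to the class. The import path mmlspark.KeyPhraseExtractor, the helper names, the column names, and the placeholder URL are assumptions for illustration and do not come from this reference. The object-typed parameters (subscriptionKey, text, language) are left to their setters.

```python
def extractor_kwargs(url, output_col="keyPhrases", error_col="keyPhraseErrors",
                     concurrency=1, timeout=60.0):
    """Collect constructor keyword arguments under the documented parameter names."""
    return {
        "url": url,
        "outputCol": output_col,
        "errorCol": error_col,
        "concurrency": int(concurrency),
        "timeout": float(timeout),
    }


def make_extractor(kwargs):
    """Build the transformer; needs mmlspark installed and an active Spark session."""
    # Assumed import path, inferred from the class name shown in this reference.
    from mmlspark.KeyPhraseExtractor import KeyPhraseExtractor
    return KeyPhraseExtractor(**kwargs)


# Placeholder endpoint (hypothetical); substitute the URL of your own service.
params = extractor_kwargs("https://example.invalid/keyPhrases", concurrency=2)
```

Calling make_extractor(params) is deferred to its own function so that the argument assembly can be checked without a Spark installation.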
getConcurrency()
Returns: max number of concurrent calls (default: 1)
Return type: int
getConcurrentTimeout()
Returns: max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
Return type: double
getErrorCol()
Returns: column to hold HTTP errors (default: [self.uid]_error)
Return type: str
getHandler()
Returns: which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
Return type: object
static getJavaPackage()

Returns the package name as a string.

getLanguage()
Returns: the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
Return type: object
getOutputCol()
Returns: the name of the output column (default: [self.uid]_output)
Return type: str
getSubscriptionKey()
Returns: the API key to use
Return type: object
getText()
Returns: the text in the request body (default: ServiceParamData(Some(Right(text)),None))
Return type: object
getTimeout()
Returns: number of seconds to wait before closing the connection (default: 60.0)
Return type: double
getUrl()
Returns: URL of the service
Return type: str
classmethod read()

Returns an MLReader instance for this class.
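Because the class derives from JavaMLWritable and JavaMLReadable, a configured stage can presumably be persisted and restored through the standard pyspark.ml writer and reader. A minimal sketch, assuming stage is a configured KeyPhraseExtractor and path is a writable location; the helper name save_and_reload is hypothetical.

```python
def save_and_reload(stage, path):
    """Persist a stage with its MLWriter, then restore it via the class's MLReader."""
    # write() comes from JavaMLWritable; overwrite() replaces any existing output.
    stage.write().overwrite().save(path)
    # read() is the classmethod documented above; load() returns a new instance.
    return type(stage).read().load(path)
```

The helper only relies on write(), read(), overwrite(), save(), and load(), so it should work for any stage exposing the same reader and writer interface.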

setConcurrency(value)
Parameters: value (int) – max number of concurrent calls (default: 1)
setConcurrentTimeout(value)
Parameters: value (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
setErrorCol(value)
Parameters: value (str) – column to hold HTTP errors (default: [self.uid]_error)
setHandler(value)
Parameters: value (object) – which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
setLanguage(value)
Parameters: value (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
setLanguageCol(value)
Parameters: value (str) – name of the column holding the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
setOutputCol(value)
Parameters: value (str) – the name of the output column (default: [self.uid]_output)
setParams(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)

Sets the parameters (keyword arguments only).

Parameters:
  • concurrency (int) – max number of concurrent calls (default: 1)
  • concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
  • errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
  • handler (object) – which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
  • language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
  • outputCol (str) – the name of the output column (default: [self.uid]_output)
  • subscriptionKey (object) – the API key to use
  • text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
  • timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
  • url (str) – URL of the service
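Since setParams takes keyword arguments only, a misspelled name is an easy mistake to make. The sketch below is one possible guard: it checks the supplied names against the list documented above before forwarding them. The helper name update_params and the DOCUMENTED_PARAMS constant are hypothetical, and stage is assumed to be an existing KeyPhraseExtractor instance.

```python
# Parameter names copied from the setParams signature above.
DOCUMENTED_PARAMS = frozenset([
    "concurrency", "concurrentTimeout", "errorCol", "handler", "language",
    "outputCol", "subscriptionKey", "text", "timeout", "url",
])


def update_params(stage, **overrides):
    """Reject unknown names, then forward the rest to setParams as keywords."""
    unknown = sorted(set(overrides) - DOCUMENTED_PARAMS)
    if unknown:
        raise ValueError("unknown parameter(s): " + ", ".join(unknown))
    # setParams accepts keyword arguments only, so nothing is passed positionally.
    stage.setParams(**overrides)
    return stage
```

A call such as update_params(extractor, timeout=30.0, concurrency=4) would change only the two named settings, while update_params(extractor, timeOut=30.0) raises ValueError before reaching the stage.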
setSubscriptionKey(value)
Parameters: value (object) – the API key to use
setSubscriptionKeyCol(value)
Parameters: value (str) – name of the column holding the API key to use
setText(value)
Parameters: value (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
setTextCol(value)
Parameters: value (str) – name of the column holding the text in the request body (default: ServiceParamData(Some(Right(text)),None))
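The subscriptionKey, language, and text parameters each come with a plain setter and a Col setter. The sketch below shows one way to combine them, relying on the reading that a plain setter applies one value to every row while a Col setter names a DataFrame column to read the value from. The helper name configure and the column name used in the usage note are hypothetical.

```python
def configure(extractor, subscription_key, text_col, language="en"):
    """Use one key and one language for all rows; read the text from a column."""
    extractor.setSubscriptionKey(subscription_key)  # same key for every row
    extractor.setLanguage(language)                 # same language for every row
    extractor.setTextCol(text_col)                  # per-row text from this column
    return extractor
```

For a DataFrame with a column named review, configure(extractor, key, "review") would be followed by extractor.transform(df); per-row languages could be handled by calling setLanguageCol in place of setLanguage.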
setTimeout(value)
Parameters: value (double) – number of seconds to wait before closing the connection (default: 60.0)
setUrl(value)
Parameters: value (str) – URL of the service