How does keyword extraction work?

261 Views Asked by Pedro At 29 November 2018 at 06:03

I tested the keyword extraction from the Natural Language Understanding service from IBM with the following text:

Desarrollo PDA. Ajustes PDA. Nuevo modulo PDA. Ajustes modulo PDA. No sincroniza PDA. Error modulo PDA.

And I got the following response:

modulo pda with 98.31% relevance
ajustes modulo pda with 64.44% relevance
nuevo modulo pda with 64.34% relevance

Now my question is: why does the keyword "modulo pda" get 98.31% relevance, rather than just "PDA" getting a higher relevance? I've been searching everywhere for how IBM's keyword extraction works, to no avail.


There is 1 best solution below

Manoj Singh On 30 November 2018 at 15:22 BEST ANSWER

The actual algorithm used to extract and score keywords is most likely a proprietary recipe, and I wouldn't expect IBM to make it public. You can find a lot of research papers on the topic, but commercial products usually combine several techniques to get the best results.

You can also run the same text through NLU services from different providers, such as IBM, Google, and Amazon, and compare the results.

Specifically for your query: you are trying to extract keywords or topics from a single document, and PDA occurs in every sentence of it. If we apply a simple technique like TF-IDF, treating each sentence as a document, the TF-IDF score for the word PDA is 0: its inverse document frequency is log(6/6) = 0, because it appears in all six sentences. The word therefore becomes irrelevant, since it adds no information that distinguishes one sentence from another.
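To make the TF-IDF point concrete, here is a minimal sketch in plain Python that treats each sentence of the sample text as its own document. It is only an illustration of the simple technique mentioned above, not IBM's actual method, which is not public. The unsmoothed IDF formula log(N / df), the whitespace tokenization, and the choice to keep each word's best score across sentences are all assumptions made for this example.

```python
import math
from collections import Counter

text = ("Desarrollo PDA. Ajustes PDA. Nuevo modulo PDA. "
        "Ajustes modulo PDA. No sincroniza PDA. Error modulo PDA.")

# Assumption: each sentence is one "document"; tokens are lowercased words.
sentences = [s.strip().lower().split() for s in text.split(".") if s.strip()]
n_docs = len(sentences)

# Document frequency: the number of sentences in which each word appears.
df = Counter(word for sent in sentences for word in set(sent))

def tf_idf(word, sent):
    """Unsmoothed TF-IDF of `word` within one sentence."""
    tf = sent.count(word) / len(sent)
    idf = math.log(n_docs / df[word])  # log(6/6) = 0 for a word in every sentence
    return tf * idf

# Keep the highest score each word reaches in any sentence (illustrative choice).
scores = {}
for sent in sentences:
    for word in set(sent):
        scores[word] = max(scores.get(word, 0.0), tf_idf(word, sent))

for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:12s} {score:.3f}")
```

With this formula "pda" scores exactly 0, while "modulo" and "ajustes" keep a positive score because they appear in only some of the sentences. Implementations that smooth the IDF term (scikit-learn's TfidfVectorizer does so by default) would give "pda" a small non-zero weight instead, but it would still rank below the rarer words.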
