String Containing Exact Substring from Substring List


Scala beginner here. I'm trying to find all the tweets whose text contains at least one keyword from a given list of keywords.

A tweet is defined as:

case class Tweet(user: String, text: String, retweets: Int)

For example: Tweet("user1", "apple apple", 3)

wordInTweet should return true if at least one keyword in the list keywords can be found in the tweet's text.

I tried implementing it like the following:

def wordInTweet(tweet: Tweet, keywords: List[String]): Boolean = {
    keywords.exists(tweet.text.equals(_))
}

However, it also returns true if a tweet's text is music and a given keyword's text is musica.

I'm struggling to find a way to return true ONLY if the tweet contains exactly the keyword's text.

How can I achieve this?

Thanks in advance.

Best answer

First, it helps to treat the keywords as a Set, since sets offer a very efficient membership test (contains).

keywords: Set[String]

Then we need to test every word in the tweet rather than the complete text, which means splitting the text into words. The ubiquitous "wordCount" example shows how to do that:

val wordsInTweet = tweet.text.split("\\W")
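To make the effect of the split concrete, here is a small sketch on a made-up sample string (the text and the names sample, single and runs are illustrative assumptions, not part of the question). It also shows that "\\W" splits on each single non-word character, so adjacent punctuation and spaces leave an empty token, while "\\W+" treats a run of them as one delimiter.

```scala
// Hypothetical sample text, chosen only to illustrate the split.
val sample = "apple, apple pie!"

// "\\W" matches one non-word character at a time, so ", " yields an empty token.
val single = sample.split("\\W").toList   // List("apple", "", "apple", "pie")

// "\\W+" matches a whole run of non-word characters as a single delimiter.
val runs = sample.split("\\W+").toList    // List("apple", "apple", "pie")
```

The empty token is harmless for the keyword lookup as long as the keyword set does not contain the empty string.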

Next, we put things together:

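def wordInTweet(tweet: Tweet, keywords: Set[String]): Boolean = {
   // Split on non-word characters so keywords are compared with whole words
   val wordsInTweet = tweet.text.split("\\W")
   // True as soon as any word of the tweet is one of the keywords
   wordsInTweet.exists(word => keywords.contains(word))
}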
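As a usage sketch, the following variant can be run against a few sample tweets. The name wordInTweetIgnoreCase, the lower-casing, the "\\W+" pattern and the sample tweets are assumptions added for illustration. The answer itself compares words case-sensitively.

```scala
case class Tweet(user: String, text: String, retweets: Int)

// Assumed variant: same whole-word lookup, but case-insensitive and
// splitting on runs of non-word characters.
def wordInTweetIgnoreCase(tweet: Tweet, keywords: Set[String]): Boolean = {
  val normalizedKeywords = keywords.map(_.toLowerCase)
  val words = tweet.text.toLowerCase.split("\\W+")
  words.exists(word => normalizedKeywords.contains(word))
}

// An existing List of keywords can be converted with toSet.
val keywords: Set[String] = List("music", "apple").toSet

val exactWord   = wordInTweetIgnoreCase(Tweet("user1", "apple apple", 3), keywords)     // true
val longerWord  = wordInTweetIgnoreCase(Tweet("user2", "I love musica", 1), keywords)   // false
val punctuation = wordInTweetIgnoreCase(Tweet("user3", "Music, please!", 0), keywords)  // true
```

Note that in Java regular expressions "\\W" only treats ASCII letters, digits and underscore as word characters by default, so a word with accented letters would be split at those letters. Tweets in other languages may need a different pattern.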