How do I get python's "in" operator to only yield true word matching, and not just substring matching?

Here is the desired output:

"bacillus thurungensis" in "bacillus thurungensis"
TRUE

"bacillus thurungensis" in "Sentence containing bacillus thurungensis."
TRUE

"bacillus thurungensis" in "Subspecies bacillus thurungensis34"
FALSE

"bacillus thurungensis" in "bacillus thurungensis, bacillus genus"
TRUE

"bacillus thurungensis" in "Notbacillus thurungensis, must match word"
FALSE

Python's "in" operator treats any substring match as a hit, but that is not what I want. I need a regex or other matching approach that yields true if and only if the query appears in the subject as a separate word (or phrase), not merely as a substring. How can this be implemented?

Best answer

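You can use a regex with re.search instead (re.match only matches at the start of the string, so it would miss the phrase in the middle of a sentence):

import re
re.search(r"\bbacillus thurungensis\b", "bacillus thurungensis")                       # match object
re.search(r"\bbacillus thurungensis\b", "Sentence containing bacillus thurungensis.")  # match object
re.search(r"\bbacillus thurungensis\b", "Subspecies bacillus thurungensis34")          # None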

And so on.

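The \b matches a word boundary: the position between a word character (letter, digit, or underscore) and a non-word character or the start or end of the string. Also note the r prefix in r"...": it makes the pattern a raw string, so Python passes \b through to the regex engine instead of interpreting it as a backspace character.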
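To see why the raw-string prefix matters, here is a small sketch comparing the same pattern written with and without it. The variable names (text, plain, raw, plain_hit, raw_hit) are only illustrative and not part of the original answer.

```python
import re

text = "Sentence containing bacillus thurungensis."

# Without the r prefix, "\b" is the single backspace character (\x08).
plain = "\bbacillus thurungensis\b"
# With the r prefix, the backslash survives and the regex engine sees \b.
raw = r"\bbacillus thurungensis\b"

plain_hit = re.search(plain, text)  # None: the text contains no backspaces
raw_hit = re.search(raw, text)      # a match object covering the phrase
```

The non-raw version fails silently rather than raising an error, which is why it is easy to overlook.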

You can also use re.compile if you're going to use the regex over and over again:

import re
matcher = re.compile(r'\bbacillus thurungensis\b')
matcher.search("bacillus thurungensis")  # search, not match, so the phrase can appear anywhere
# and so on
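If the query is not a fixed literal, one possible way to wrap this up is a small helper. The sketch below is an assumption on my part rather than something from the answer above: the name contains_word is made up, and it swaps \b for the lookarounds (?<!\w) and (?!\w), which behave the same for queries like this one but also work when the query starts or ends with a non-word character. re.escape keeps any regex metacharacters in the query from being interpreted.

```python
import re

def contains_word(query: str, text: str) -> bool:
    """Return True if `query` occurs in `text` as a whole word or phrase.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # re.escape neutralizes regex metacharacters in the query.
    # (?<!\w) / (?!\w) require that no word character sits directly
    # before or after the matched phrase.
    pattern = r"(?<!\w)" + re.escape(query) + r"(?!\w)"
    return re.search(pattern, text) is not None


query = "bacillus thurungensis"
results = [
    contains_word(query, "bacillus thurungensis"),
    contains_word(query, "Sentence containing bacillus thurungensis."),
    contains_word(query, "Subspecies bacillus thurungensis34"),
    contains_word(query, "bacillus thurungensis, bacillus genus"),
    contains_word(query, "Notbacillus thurungensis, must match word"),
]
```

Run against the five cases from the question, results should line up with the desired TRUE/FALSE values listed there.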