The loop reports that every row contains '|'

import pandas as pd
import numpy as np
import re

df_test = pd.DataFrame(np.array([['a|b', 'b', 'c|r'], [ 'e', 'f', 'g']]), columns=['First', 'Second', 'Third'])

for elem in df_test.get('First'):
    x = bool(re.search('|', elem))
    if x == True:
        print(elem)

Output:

a|b
e

Why does it print both rows? I expected only the first row ('a|b') to appear in the output, since it is the only one that contains '|'.

Srijon kumar On BEST ANSWER

The issue is the regular expression pattern passed to re.search. In regular expressions, the vertical bar | is a special character: the alternation (OR) operator. The pattern '|' therefore means "the empty pattern or the empty pattern", and the empty pattern matches at the start of every string, so re.search('|', elem) succeeds for every row. To match the literal character |, escape it with a backslash, preferably in a raw string (r'\|') so that Python does not treat the backslash as a string escape.

import pandas as pd
import numpy as np
import re

df_test = pd.DataFrame(np.array([['a|b', 'b', 'c|r'], ['e', 'f', 'g']]), columns=['First', 'Second', 'Third'])

for elem in df_test.get('First'):
    x = bool(re.search(r'\|', elem))  # raw string; \| matches a literal '|'
    if x:
        print(elem)
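
As a minimal sketch of the same idea without pandas, the snippet below first shows why the unescaped pattern matches everything, then shows two alternatives to writing the backslash by hand. The list first_column is a stand-in for the 'First' column from the question, and the variable names are only illustrative.

```python
import re

# Stand-in for df_test['First'] from the question (assumed values).
first_column = ['a|b', 'e']

# An unescaped '|' is an alternation of two empty patterns, so it
# matches the empty string at position 0 of any input, including 'e'.
empty_match = re.search('|', 'e')
print(repr(empty_match.group()))

# Alternative 1: a plain substring test, no regex involved.
with_in = [elem for elem in first_column if '|' in elem]

# Alternative 2: let re.escape build the escaped pattern.
pattern = re.escape('|')
with_escape = [elem for elem in first_column if re.search(pattern, elem)]

print(with_in)
print(with_escape)
```

Both alternatives keep only 'a|b'. When working on the DataFrame directly, df_test['First'].str.contains('|', regex=False) gives a boolean mask for the same literal test without a Python loop.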