I am using Snorkel (https://www.snorkel.org/use-cases/01-spam-tutorial) to find rows in a dataframe df whose text contains specific words.

df

not_matter Text
111 hallo Apple
222 Berry and bb
333 bb and Candy

I also have a pandas dataframe df_wordlist, where Column_1 and Column_2 hold different words and Column_3 is the two words joined with an underscore.

df_wordlist

Column_1 Column_2 Column_3
aa Apple aa_Apple
aa Berry aa_Berry
aa Candy aa_Candy
bb Apple bb_Apple
bb Berry bb_Berry
bb Candy bb_Candy

I now need to define a separate labeling function for each row. The name of each function should be the value in Column_3, and the body of the function should use the values in Column_1 and Column_2.

@labeling_function()
def aa_Apple(x):
    # x is a single row, so the pattern is matched against x.Text rather than the whole df.Text column
    return FOERD if re.search(r"\b(?=.*aa.*)(?=.*Apple.*)\b|\b(?=.*Apple.*)(?=.*aa.*)\b", x.Text, flags=re.I) else ABSTAIN

@labeling_function()
def aa_Berry(x):
    return FOERD if re.search(r"\b(?=.*aa.*)(?=.*Berry.*)\b|\b(?=.*Berry.*)(?=.*aa.*)\b", x.Text, flags=re.I) else ABSTAIN

.......the other 3 functions.....

@labeling_function()
def bb_Candy(x):
    return FOERD if re.search(r"\b(?=.*bb.*)(?=.*Candy.*)\b|\b(?=.*Candy.*)(?=.*bb.*)\b", x.Text, flags=re.I) else ABSTAIN

I tried to do this with a loop, but it didn't work:

for i in range(len(df_wordlist)):
    label_name = str(df_wordlist.iloc[i,-1])
    label_word1 = str(df_wordlist.iloc[i,0])
    label_word2 = str(df_wordlist.iloc[i,1])

    @labeling_function()
    def label_name(x):
        return FOERD if re.search(r"\b(?=.*label_word1.*)(?=.*label_word2.*)\b|\b(?=.*label_word2.*)(?=.*label_word1.*)\b", df, flags=re.I) else ABSTAIN

I want the loop to generate as many labeling functions as there are rows in df_wordlist.

Later I need to put all the functions in a list so they can be invoked, like:

function_ls = [aa_Apple, aa_Berry, bb_Candy]

In a loop, it should be:

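function_ls = []  # create the list once, before the loop
for i in range(len(df_wordlist)):
    label_name = str(df_wordlist.iloc[i,-1])
    # append() modifies the list in place and returns None, so its result is not reassigned
    # note: label_name is only the name string here, not the function itself
    function_ls.append(label_name)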
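For illustration, here is a minimal sketch of one way the per-row functions could be generated and collected. In the loop attempt above, `def label_name(x)` defines a function literally called `label_name` on every iteration, and the raw string contains the literal text `label_word1` rather than the word stored in that variable. The sketch sidesteps both with a small factory function, so each generated function keeps its own pair of words. Everything here is an assumption made for the example: `make_keyword_lf` is a hypothetical helper, the integer values of `FOERD` and `ABSTAIN` are made up, a list of tuples stands in for `df_wordlist`, a plain case-insensitive substring search replaces the lookahead regex, and plain Python functions stand in for Snorkel labeling functions so the code runs without Snorkel or pandas.

```python
import re
from types import SimpleNamespace

# Label values: FOERD is the name used in the question; the integers are assumptions.
FOERD = 1
ABSTAIN = -1

# Stand-in for the rows of df_wordlist: (Column_1, Column_2, Column_3).
wordlist_rows = [
    ("aa", "Apple", "aa_Apple"),
    ("aa", "Berry", "aa_Berry"),
    ("aa", "Candy", "aa_Candy"),
    ("bb", "Apple", "bb_Apple"),
    ("bb", "Berry", "bb_Berry"),
    ("bb", "Candy", "bb_Candy"),
]


def make_keyword_lf(name, word1, word2):
    """Hypothetical factory: build one function that fires when both words occur."""
    # Compile once per generated function; re.escape treats the words literally.
    pat1 = re.compile(re.escape(word1), flags=re.I)
    pat2 = re.compile(re.escape(word2), flags=re.I)

    def lf(x):
        # x is a single row; only its Text attribute is inspected.
        return FOERD if pat1.search(x.Text) and pat2.search(x.Text) else ABSTAIN

    # Give the generated function the name taken from Column_3.
    lf.__name__ = name
    return lf


# One generated function per row, collected in a single list.
function_ls = [make_keyword_lf(name, w1, w2) for w1, w2, name in wordlist_rows]

# Try the generated functions on a row shaped like those in df.
row = SimpleNamespace(Text="Berry and bb")
fired = [f.__name__ for f in function_ls if f(row) == FOERD]
```

Because the words are passed as arguments to the factory, each closure captures its own values instead of sharing the loop variables. With Snorkel installed, each generated function could presumably be wrapped as `LabelingFunction(name=name, f=lf)` from `snorkel.labeling` before being handed to an applier; that step is left out so the sketch stays self-contained.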