Snorkel: write several labelling functions automatically

My goal is to create N labelling functions (LFs in the snorkel package) in a more elegant way than writing them one by one. Since I expect to have many more single regexes to reference, I would like to generate them automatically. My use case is shown below.

@labeling_function()
def regex_give(x):
    return FEATURE if re.search(r"give", x, flags=re.I) else ABSTAIN

@labeling_function()
def regex_note(x):
    return FEATURE if re.search(r"note", x, flags=re.I) else ABSTAIN

@labeling_function()
def regex_pay(x):
    return FEATURE if re.search(r"pay", x, flags=re.I) else ABSTAIN

lfs = [regex_give, regex_note, regex_pay]

applier = LFApplier(lfs=lfs)
L_train = applier.apply(df.text)

LFAnalysis(L=L_train, lfs=lfs).lf_summary(df.feat_flg)

Is there a way to define such labelling functions in a for loop, or with some other approach?

I'm using the following tutorial: https://www.snorkel.org/use-cases/01-spam-tutorial#4-combining-labeling-function-outputs-with-the-label-model.


The simplest way might be to create a factory function:

def factory(regex):

    @labeling_function()
    def regex_labeler(x):
        return FEATURE if re.search(regex, x, flags=re.I) else ABSTAIN

    return regex_labeler

label_funcs = [factory(i) for i in [r"pay", r"give", r"note"]]

The main problem with this approach is that you lose some interpretability in the function names, unless you assign each value returned by the factory to a more meaningfully named variable.

give_func = factory(r"give")
# etc...
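
One caveat with the factory above: as far as I know, Snorkel names a decorated LF after the wrapped function by default, so every LF it returns would be called regex_labeler, and the appliers expect LF names to be unique. A possible way around this, sketched below, is to build the LFs with the LabelingFunction class and pass an explicit name plus the pattern as a resource, similar to the keyword LFs in the linked spam tutorial. The helper names regex_lookup and make_regex_lf are hypothetical, the ABSTAIN and FEATURE values are assumptions (the question does not show them), and the small fallback class exists only so the sketch runs where Snorkel is not installed.

```python
import re

try:
    from snorkel.labeling import LabelingFunction
except ImportError:
    # Stand-in used only when Snorkel is not installed. It mimics just the
    # name / f / resources arguments that this sketch relies on.
    class LabelingFunction:
        def __init__(self, name, f, resources=None):
            self.name = name
            self._f = f
            self._resources = resources or {}

        def __call__(self, x):
            return self._f(x, **self._resources)

# Assumed label values; the question does not define them.
ABSTAIN = -1
FEATURE = 1


def regex_lookup(x, pattern):
    """Return FEATURE if the pattern matches the text x, else ABSTAIN."""
    return FEATURE if re.search(pattern, x, flags=re.I) else ABSTAIN


def make_regex_lf(pattern):
    """Build one LF per pattern, with the pattern embedded in its name."""
    return LabelingFunction(
        name=f"regex_{pattern}",
        f=regex_lookup,
        resources=dict(pattern=pattern),
    )


# One LF per regex, each with a distinct, readable name.
lfs = [make_regex_lf(p) for p in ["give", "note", "pay"]]
lf_names = [lf.name for lf in lfs]
```

The resulting lfs list can be passed to LFApplier(lfs=lfs) as in the question, and the names should then show up in the lf_summary output instead of a single repeated function name.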