Exclude negative words from NLTK stopwords


I want to remove the NLTK stopwords from my sentences, except for the ones that carry a negative meaning, such as "no", "not", "couldn't", etc. In other words, I want to exclude negative words from the stopword list. How can I do that?

Anilosan15 On

There is no built-in way to do this; you have to define the list of negative words yourself:

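import nltk
from nltk.corpus import stopwords

# Negative words to keep, i.e. to exclude from the stopword list.
negative_words = {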
    'no',
    'not',
    'none',
    'neither',
    'never',
    'nobody',
    'nothing',
    'nowhere',
    # Contractions use double quotes so the apostrophe does not end the string.
    "doesn't",
    "isn't",
    "wasn't",
    "shouldn't",
    "won't",
    "can't",
    "couldn't",
    "don't",
    "haven't",
    "hasn't",
    "hadn't",
    "aren't",
    "weren't",
    "wouldn't",
    "daren't",
    "needn't",
    "didn't",
    'without',
    'against',
    'negative',
    'deny',
    'reject',
    'refuse',
    'decline',
    'unhappy',
    'sad',
    'miserable',
    'hopeless',
    'worthless',
    'useless',
    'futile',
    'disagree',
    'oppose',
    'contrary',
    'contradict',
    'disapprove',
    'dissatisfied',
    'objection',
    'unsatisfactory',
    'unpleasant',
    'regret',
    'resent',
    'lament',
    'mourn',
    'grieve',
    'bemoan',
    'despise',
    'loathe',
    'detract',
    'abhor',
    'dread',
    'fear',
    'worry',
    'anxiety',
    'sorrow',
    'gloom',
    'melancholy',
    'dismay',
    'disheartened',
    'despair',
    'dislike',
    'aversion',
    'antipathy',
    'hate',
    'disdain'
}
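nltk.download('stopwords')
nltk.download('punkt')  # tokenizer data for word_tokenize (newer NLTK releases may ask for 'punkt_tab')

# Take the negative words out of the stopword set so they are not filtered.
# Entries that are not NLTK stopwords to begin with are simply ignored.
stop_words = set(stopwords.words('english')) - negative_words

def remove_stopwords(sentence, stopwords_list):
    # Tokenize, then keep every token that is not in the given stopword list.
    tokens = nltk.word_tokenize(sentence)
    filtered_tokens = [word for word in tokens if word.lower() not in stopwords_list]
    return ' '.join(filtered_tokens)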

I wrote this code myself; maybe it will be useful for you.
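
One thing to watch: nltk.word_tokenize splits contractions, so "couldn't" becomes "could" and "n't", and the whole-word entries such as "couldn't" only match if the tokenizer keeps contractions intact. The sketch below shows the same idea (subtract the negations from the stopword set, then filter) with a simple regex tokenizer that does keep them whole. The small SAMPLE_STOPWORDS set, the NEGATIONS subset, the regex, and the function name are all assumptions made for illustration; in practice the stopword set would come from stopwords.words('english').

```python
import re

# Stand-in stopword list for illustration only (an assumption); in practice
# this would be set(nltk.corpus.stopwords.words('english')).
SAMPLE_STOPWORDS = {
    'i', 'the', 'a', 'is', 'was', 'it', 'to', 'and', 'this',
    'no', 'not', "couldn't", "don't", "isn't",
}

# Negations to keep (a short subset of the list in the answer).
NEGATIONS = {'no', 'not', "couldn't", "don't", "isn't"}

# Matches words and keeps one internal straight apostrophe,
# so "couldn't" stays a single token.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def remove_stopwords_keep_negations(sentence, stopwords_list, keep):
    """Drop stopwords from sentence, except the words listed in keep."""
    effective = set(stopwords_list) - set(keep)
    tokens = TOKEN_RE.findall(sentence)
    return ' '.join(t for t in tokens if t.lower() not in effective)


result = remove_stopwords_keep_negations(
    "I couldn't open the file and it is not a bug",
    SAMPLE_STOPWORDS,
    NEGATIONS,
)
print(result)  # couldn't open file not bug
```

Passing the negations as a separate argument keeps the original stopword set untouched, so the same set can be reused with a different keep-list.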