Extract positive and negative words from text?

I need to determine the opinion expressed in reviews posted on websites, and I am using SentiWordNet for this. I first pass the file containing all the reviews to a POS tagger.

Is there a more accurate way of tokenizing that treats "not good" as a single unit instead of two separate words?

After that, I have to assign a positive or negative score to each tokenized word and then calculate the total score. Is there a function in SentiWordNet for this? Please help.

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
import csv

para = "What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely disappointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid"

sentense = word_tokenize(para)
word_features = []

for i,j in nltk.pos_tag(sentense):
    if j in ['JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS']:
        word_features.append(i)

rating = 0

for i in word_features:
    with open('words.txt', 'rt') as f:
        reader = csv.reader(f, delimiter=',')
        for row in reader:
            if i == row[0]:
                print i, row[1]
                if row[1] == 'pos':
                    rating = rating + 1
                elif row[1] == 'neg':
                    rating = rating - 1
print  rating

Error:

Traceback (most recent call last):
  File "E:/Emotional from text/pORnOfWord.py", line 10, in <module>
    for i,j in nltk.pos_tag(sentense):
  File "C:\Python27\lib\site-packages\nltk\tag\__init__.py", line 99, in pos_tag
    tagger = load(_POS_TAGGER)
  File "C:\Python27\lib\site-packages\nltk\data.py", line 605, in load
    resource_val = pickle.load(_open(resource_url))
  File "C:\Python27\lib\site-packages\nltk\data.py", line 686, in _open
    return find(path).open()
  File "C:\Python27\lib\site-packages\nltk\data.py", line 467, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource 'taggers/maxent_treebank_pos_tagger/english.pickle' not
  found.  Please use the NLTK Downloader to obtain the resource:
  >>> nltk.download()
  Searched in:
    - 'C:\\Users\\Eman\x99/nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - 'C:\\Python27\\nltk_data'
    - 'C:\\Python27\\lib\\nltk_data'
    - 'C:\\Users\\Eman\x99\\AppData\\Roaming\\nltk_data'

There is 1 solution below

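The error occurs because the NLTK data that pos_tag needs is not installed: the traceback reports that the resource 'taggers/maxent_treebank_pos_tagger/english.pickle' was not found in any of the searched nltk_data directories. Run the code below to open the NLTK Downloader, install the missing tagger model, and then run the script again.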

import nltk 
nltk.download()
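
On the "not good" part of the question: one simple option, instead of changing the tokenizer, is to keep the tokens as they are and flip the polarity of a word that directly follows a negator. The sketch below is only an illustration under some assumptions: the lexicon has the same `word,pos` / `word,neg` rows as the `words.txt` file in the question, and the names `NEGATORS`, `load_lexicon`, and `score_tokens` are hypothetical helpers, not part of NLTK or SentiWordNet. It also reads the lexicon once into a dict rather than reopening the file for every word.

```python
import csv

# Hypothetical negator list; "n't" is the token word_tokenize produces for "isn't".
NEGATORS = {"not", "no", "never", "n't"}


def load_lexicon(lines):
    """Read rows like `nice,pos` or `bad,neg` into a dict mapping word -> +1 / -1.

    `lines` can be an open file or any iterable of strings.
    """
    polarity = {"pos": 1, "neg": -1}
    lexicon = {}
    for row in csv.reader(lines, delimiter=","):
        if len(row) >= 2 and row[1].strip() in polarity:
            lexicon[row[0].strip().lower()] = polarity[row[1].strip()]
    return lexicon


def score_tokens(tokens, lexicon):
    """Sum word polarities; a negator flips only the token right after it."""
    rating = 0
    negate = False
    for token in tokens:
        word = token.lower()
        if word in NEGATORS:
            negate = True
            continue
        if word in lexicon:
            value = lexicon[word]
            rating += -value if negate else value
        # The negation applies to the immediately following token only.
        negate = False
    return rating


# Small in-memory lexicon standing in for words.txt (assumed format).
lexicon = load_lexicon(["nice,pos", "bad,neg", "uninspired,neg", "expensive,neg"])
tokens = "The staff is nice and the eggplant is not bad".split()
print(score_tokens(tokens, lexicon))  # prints 2: "nice" is +1, "not bad" is +1
```

With the real file, the lexicon could be loaded with `with open('words.txt') as f: lexicon = load_lexicon(f)` and the tokens taken from `word_tokenize(para)`. This is a deliberately crude rule: it misses cases such as "not very good", where another word sits between the negator and the opinion word.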