How to extract features in a Hindi Word Sense Disambiguation task

274 Views Asked by Lubna Khan At 20 November 2017 at 07:37

I am using the following dataset for Hindi WSD:

एक बार वीरगढ़ राज्य की महारानी का हार कहीं खो गया । महारानी को हार बहुत प्रिय था । उन्होंने हार ढूंढने की बहुत कोशिश की पर वह नहीं मिला । हार के लिए महारानी को बहुत परेशान देखकर राजा ने घोषणा करवा दी कि जिस व्यक्ति को भी हार मिला हो, वह तीन दिनों के भीतर उसे वापस कर दे अन्यथा उसे मृत्युदंड का भागी होना पड़ेगा । यह संयोग था कि हार एक संन्यासी को मिला था । उसके मन में हार के प्रति कोई आकर्षण नहीं था, फिर भी उसने यह सोचकर रख लिया कि कोई ढूंढता हुआ आएगा तो उसे दे देगा । उसने अगले दिन राजा की घोषणा सुनी, पर वह हार देने नहीं गया । वह अपनी साधना में लीन रहा । तीन दिन बीत गए । चौथे दिन संन्यासी हार लेकर राजा के पास पहुंचा । राजा को जब पता चला कि तीन दिनों से हार उसके पास था, तो उसने क्रोधित होकर पूछा, ‘क्या तुमने मेरी घोषणा नहीं सुनी थी?’ संन्यासी ने जवाब दिया ‘सुनी थी, पर यदि मैं कल हार लौटाने आ जाता तो लोग कहते कि एक संन्यासी होकर मृत्यु से भयभीत हो गया ।’ इस पर राजा ने पूछा, ‘तो आज चौथे दिन क्यों लाए?’ इस पर संन्यासी ने कहा, ‘मुझे मौत का भय नहीं है । पर मैं किसी दूसरे की संपत्ति को अपने पास रखना पाप समझता हूं । हार जैसी तुच्छ चीज से मुझे कोई लगाव नहीं ।’ यह उत्तर सुनकर राजा लज्जित हो गया । महारानी को भी अपनी गलती का अहसास हुआ । उसने हार बेचकर वह राशि गरीबों में बंटवा दी ।

न्यूयॉर्क । हीरे का हार पहनी एक बार्बी गुडिया न्यूयॉर्क में रेकॉर्ड कीमत में नीलाम हुई है । अपनी तरह की ये अकेली बार्बी डॉल काला लिबास पहने हुई है और उसके गले में एक कैरेट का चौकोर गुलाबी हीरे का हार है । ये गुडिया में बनाया गया था और तबसे लेकर आज तक इसका रूप कई बार बदला है । सबसे बडी नीलामी का रेकॉर्ड बनाने वाली बार्बी गुडिया को ऑस्ट्रेलिया के एक गहनों के डिजायनर स्टीफानो कैन्टुरी ने बनाया है ।

My question is how to extract features from this sample dataset using "local context and collocation context". The ambiguous word here is हार (necklace). How do I get the two words to the left and the two words to the right of the ambiguous word? In the Hindi WordNet there are 2 senses of the word हार. I am using the Anaconda Python distribution in a Jupyter environment.

Here is my code:

# Word Sense Disambiguation in Hindi: tokenization and stop-word removal
import nltk  # word_tokenize relies on NLTK's Punkt tokenizer data being installed

# Read the context file for the ambiguous word (the file is UTF-16 encoded)
filename = "C:/Users/Lubna Khan/My-WSD/हार/ContextSenses002.txt"
with open(filename, "r", encoding="utf-16") as file:
    DisplayTextF = file.read()
tokens = nltk.word_tokenize(DisplayTextF)

# Read the Hindi stop-word file (Devanagari script)
filename = "C:/Users/Lubna Khan/My-WSD/HindiStopWords.txt"
with open(filename, "r", encoding="utf-16") as file:
    sw = file.read()
stop_words = set(nltk.word_tokenize(sw))  # a set makes membership tests fast

# Keep only the tokens that are not stop words
filtered_sentence = [w for w in tokens if w not in stop_words]
print(filtered_sentence)

# Feature extraction (not implemented yet)

Please help. Thanks in advance.
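
A minimal sketch of the windowing step asked about above. It assumes that "local context" means the words at positions -2, -1, +1 and +2 relative to the ambiguous word, and that "collocation" means the ordered adjacent word pairs around it; other definitions exist, so adjust as needed. The function name `window_features`, the feature keys and the `<PAD>` marker are hypothetical and not taken from any library. It works on any list of tokens, for example `tokens` or `filtered_sentence` from the code above.

```python
def window_features(tokens, target, size=2, pad="<PAD>"):
    """Return one feature dict per occurrence of `target` in `tokens`.

    Hypothetical helper: the feature names are illustrative only.
    """
    all_features = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        # Up to `size` tokens on each side, padded at the text boundaries.
        left = tokens[max(0, i - size):i]
        right = tokens[i + 1:i + 1 + size]
        left = [pad] * (size - len(left)) + left
        right = right + [pad] * (size - len(right))

        feats = {}
        # Local context: the word at each relative position.
        for k, w in enumerate(left):
            feats[f"w[{k - size}]"] = w       # w[-2], w[-1] when size=2
        for k, w in enumerate(right):
            feats[f"w[+{k + 1}]"] = w         # w[+1], w[+2] when size=2
        # Collocations (assumed definition): ordered pairs around the target.
        feats["c[-2,-1]"] = " ".join(left[-2:])
        feats["c[-1,+1]"] = left[-1] + " " + right[0]
        feats["c[+1,+2]"] = " ".join(right[:2])
        all_features.append(feats)
    return all_features


# Usage on one sentence from the sample data, split on whitespace.
sentence = "महारानी को हार बहुत प्रिय था ।"
features = window_features(sentence.split(), "हार")
print(features[0])
```

Whether to run this on the raw tokens or on the stop-word-filtered list is a design choice: filtering first makes the window span content words only, while the raw tokens preserve the true neighbours such as को. Each resulting dict can then be paired with a sense label and turned into a numeric vector, for instance with scikit-learn's `DictVectorizer`.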

