Obtaining METEOR scores for Japanese text

296 Views Asked by phil At 21 July 2021 at 19:41

I wish to produce METEOR scores for several Japanese strings. I have imported nltk, wordnet and omw, but the results do not convince me that it is working correctly.

import nltk
from nltk.corpus import wordnet
from nltk.translate.meteor_score import single_meteor_score

nltk.download('wordnet')
nltk.download('omw')

reference = "チップは含まれていません。"
hypothesis = "チップは含まれていません。"

print(single_meteor_score(reference, hypothesis))

This outputs 0.5, but surely it should be much closer to 1.0 given that the reference and hypothesis are identical?

Do I somehow need to specify which wordnet language I want to use in the call to single_meteor_score(), for example:

single_meteor_score(reference, hypothesis, wordnet=wordnetJapanese)


There is 1 best solution below

phil On 22 July 2021 at 15:40

Pending review by a qualified linguist, I appear to have found a solution. I found an open-source tokenizer for Japanese, pre-processed all of my reference and hypothesis strings to insert spaces between the Japanese tokens, and then ran single_meteor_score() over the files.
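A rough sketch of why the spaces seem to matter, under some assumptions. As far as I can tell, the NLTK version from the question splits its string inputs on whitespace, so an unsegmented Japanese sentence counts as a single token. METEOR's fragmentation penalty is gamma * (chunks / matches) ** beta, which with NLTK's defaults (alpha=0.9, beta=3, gamma=0.5) comes to 0.5 for one matched token in one chunk. The code below is not NLTK itself: toy_tokenize and TOY_VOCAB are hypothetical stand-ins for a real Japanese tokenizer, and meteor_from_counts is my own restatement of the formula that takes the match counts as given.

```python
from typing import List

# Hypothetical stand-in for a real Japanese tokenizer: a tiny hand-written
# vocabulary used for greedy longest-match segmentation. Illustration only.
TOY_VOCAB = {"チップ", "は", "含まれて", "いません", "。"}


def toy_tokenize(text: str) -> List[str]:
    """Split text into the longest vocabulary entries, else single characters."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in TOY_VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


def insert_spaces(text: str) -> str:
    """Return text with a space between tokens, ready for whitespace splitting."""
    return " ".join(toy_tokenize(text))


def meteor_from_counts(matches, hyp_len, ref_len, chunks,
                       alpha=0.9, beta=3.0, gamma=0.5):
    """METEOR score from precomputed counts (assumes matches > 0)."""
    precision = matches / hyp_len
    recall = matches / ref_len
    fmean = precision * recall / (alpha * precision + (1 - alpha) * recall)
    penalty = gamma * (chunks / matches) ** beta
    return (1 - penalty) * fmean


sentence = "チップは含まれていません。"
spaced = insert_spaces(sentence)
n = len(spaced.split())

print(spaced)
# Unsegmented: the whole sentence is one token, matched in one chunk.
print(meteor_from_counts(1, 1, 1, 1))
# Segmented: n tokens, all matched in order, still one chunk.
print(meteor_from_counts(n, n, n, 1))
```

With identical reference and hypothesis, the unsegmented case gives 0.5 and the segmented case comes out close to 1.0 (about 0.996 for five tokens). The spaced strings can then be passed to single_meteor_score() as in the question. I believe more recent NLTK releases expect lists of tokens rather than strings, in which case the token lists would be passed directly instead.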
