If I have a Chinese word list, e.g. reference = ['我', '是', '好', '人'] and hypothesis = ['我', '是', '善良的', '人'], could I use nltk.translate.bleu_score.sentence_bleu(references, hypothesis) for Chinese translation? Is it the same as for English? How about Japanese? I mean the case where I already have word lists (Chinese and Japanese), as I would for English. Thanks!
BLEU scores: could I use nltk.translate.bleu_score.sentence_bleu to calculate BLEU scores for Chinese?
11.3k Views Asked by tktktk0711 At
TL;DR
Yes.
In long
BLEU measures n-gram overlap and is agnostic to language, but it depends on the sentences being split into tokens. So yes, it can compare Chinese/Japanese, as long as the text has already been segmented into word lists like the ones in your example.
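To make the "n-grams over tokens" point concrete, here is a minimal sketch of the clipped n-gram counts that BLEU is built on, applied to the word lists from the question. The helper names `ngrams` and `modified_precision` are made up for this illustration (they are not NLTK functions), and the sketch assumes a single reference; real BLEU also combines the orders geometrically and applies a brevity penalty.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(reference, hypothesis, n):
    """Clipped n-gram matches and total hypothesis n-grams (single reference only)."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    # Each hypothesis n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    total = sum(hyp_counts.values())
    return clipped, total

reference = ['我', '是', '好', '人']
hypothesis = ['我', '是', '善良的', '人']

for n in range(1, 5):
    matches, total = modified_precision(reference, hypothesis, n)
    print(n, matches, total)
```

Nothing in this code knows or cares which language the tokens come from; it only compares token tuples. For this pair there are no matching 3-grams or 4-grams, which is exactly the situation where sentence-level BLEU becomes fragile.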
Note the caveats of using BLEU at the sentence level. BLEU was designed as a corpus-level metric and was never created with sentence-level comparison in mind; here's a nice discussion: https://github.com/nltk/nltk/issues/1838
Most probably, you'll see the warning when you have really short sentences, e.g. when a hypothesis of only a few tokens shares no higher-order n-grams with the reference.
You can use the smoothing functions in https://github.com/alvations/nltk/blob/develop/nltk/translate/bleu_score.py#L425 to overcome the problem with short sentences.
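As a rough sketch of how that might look with the word lists from the question (assuming NLTK is installed; the choice of `method1` is only an example, and the other methods on `SmoothingFunction` may suit your data better):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ['我', '是', '好', '人']
hypothesis = ['我', '是', '善良的', '人']

# sentence_bleu expects a list of references, each of which is a token list,
# so a single reference has to be wrapped in an outer list.
references = [reference]

# method1 is one of several smoothing methods; picking it here is an assumption
# made for illustration, not a recommendation.
smoothing = SmoothingFunction().method1

score = sentence_bleu(references, hypothesis, smoothing_function=smoothing)
print(score)
```

Note that the reference is passed as `[reference]`, not `reference`: passing the flat token list would make each single word be treated as a separate reference.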