Converting Language Detection Score of CLD2 to CLD3 Accuracy


My CLD2 language detection model (langID) returns the following values for the input sentence I want to classify:

{ reliable: true,
  textBytes: 181,
  languages: 
   [ { name: 'ITALIAN', code: 'it', percent: 61, score: 774 },
     { name: 'ENGLISH', code: 'en', percent: 38, score: 1573 } ],
  chunks: 
   [ { name: 'ITALIAN', code: 'it', offset: 0, bytes: 116 },
     { name: 'ENGLISH', code: 'en', offset: 116, bytes: 71 } ] }

Here textBytes is the size of the input text, percent is the share of the text attributed to each language code, and score is an indicator of the quality of the detection (the smaller, the better). In the newer CLD3 neural network, by contrast, the result of the classification is just the accuracy (a probability value between 0 and 1), for example:

 println(ld.getCode(0))
 println(ld.getScore(0))

en
0.99

I would like to figure out how to convert the CLD2 score to a probability value so that I can compare the results with those of the new CLD3 model.
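To make the comparison concrete, here is a minimal sketch of one possible normalization, written in Python for illustration. It assumes the CLD2 result is available as a dictionary shaped like the output above, and it rescales the percent field so the shares sum to 1. The helper name cld2_shares is hypothetical, and the resulting number is only the share of text attributed to each language, not a calibrated probability, so it is not directly equivalent to the CLD3 value.

```python
# The CLD2 result from the question, copied into a plain dictionary.
cld2_result = {
    "reliable": True,
    "textBytes": 181,
    "languages": [
        {"name": "ITALIAN", "code": "it", "percent": 61, "score": 774},
        {"name": "ENGLISH", "code": "en", "percent": 38, "score": 1573},
    ],
}


def cld2_shares(result):
    """Hypothetical helper: map each language code to its share of the text.

    Assumption: `percent` is used as a rough stand-in for confidence.
    The shares are renormalized so that they sum to 1.
    """
    total = sum(lang["percent"] for lang in result["languages"])
    if total == 0:
        return {}
    return {lang["code"]: lang["percent"] / total for lang in result["languages"]}


shares = cld2_shares(cld2_result)
top_code = max(shares, key=shares.get)  # language with the largest share
print(top_code, round(shares[top_code], 2))
```

This ignores the score field entirely. Whether score can be mapped to a probability at all is exactly the open part of the question.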
