I am working on an OCR task and, for evaluation purposes, want to calculate a character-level confusion matrix for my model. It should show how often each character is predicted correctly, how often it is predicted as a different character, and which characters it gets confused with.
My current problem is that a simple position-by-position comparison breaks down when the predicted string and the ground truth differ in length because of extra or missing characters (mainly whitespace). I was thinking about using the Levenshtein distance algorithm to align the two strings first, and then recording how often each character had to be inserted or deleted alongside the substitutions, but I'm still unsure how to handle that.
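To make the idea concrete, here is a rough sketch of what I have in mind, in plain Python with only the standard library. It fills the usual Levenshtein table with unit costs, backtraces one optimal alignment, and counts (true character, predicted character) pairs. The names `align`, `confusion_counts`, and the `GAP` placeholder are my own inventions for illustration, not part of any library, and the unit costs are an assumption.

```python
from collections import Counter

# Hypothetical placeholder meaning "no character on this side of the alignment".
GAP = "<eps>"


def align(truth, pred):
    """Return one optimal list of (true_char, pred_char) pairs (unit-cost Levenshtein)."""
    n, m = len(truth), len(pred)
    # dist[i][j] = edit distance between truth[:i] and pred[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if truth[i - 1] == pred[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # true char missing from prediction
                dist[i][j - 1] + 1,         # extra char in prediction
            )

    # Walk back from the bottom-right corner, preferring the diagonal step.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if truth[i - 1] == pred[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                pairs.append((truth[i - 1], pred[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((truth[i - 1], GAP))  # deletion
            i -= 1
        else:
            pairs.append((GAP, pred[j - 1]))   # insertion
            j -= 1
    pairs.reverse()
    return pairs


def confusion_counts(samples):
    """Accumulate (true_char, pred_char) counts over (truth, prediction) string pairs."""
    counts = Counter()
    for truth, pred in samples:
        counts.update(align(truth, pred))
    return counts


counts = confusion_counts([("cat", "cot"), ("a b", "ab"), ("ab", "a b")])
print(counts[("a", "o")])    # 'a' read as 'o'
print(counts[(" ", GAP)])    # space dropped by the model
print(counts[(GAP, " ")])    # space hallucinated by the model
```

With this, insertions and deletions would simply become an extra row and column (`GAP`) in the confusion matrix. One thing I am aware of is that the optimal alignment is not always unique, so the counts depend on the tie-breaking order in the backtrace (here: diagonal first, then deletion, then insertion).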
Are there any established approaches that are commonly used for this? I did some research but couldn't find anything significant.