iOS UTF7 encoding/decoding

I have an issue with UTF-7 decoding. I was able to isolate the problem with the following sample code:

NSStringEncoding stringEncoding = myFunctionForTranslateCodepageToEncoding(codePage);
// see the end of the string, it's important
const char * testBuffer ="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa+ADw-";

NSString * testString = [[NSString alloc] initWithBytes:testBuffer length:strlen(testBuffer) encoding:stringEncoding];

Where:

strlen(testBuffer) is 508,

'codePage' is 65000,

'stringEncoding' is 2214592768 (probably UTF-7, as expected, but I can't find clear confirmation…).

'+ADw-' is the UTF-7 sequence for '<'.
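As a sanity check on the 'stringEncoding' value, here is a minimal sketch in plain C (it also compiles as Objective-C). It rests on two assumptions that should be verified against the SDK headers: that kCFStringEncodingUTF7 is 0x04000100, and that Foundation sets the high bit (0x80000000) on NSStringEncoding values that have no dedicated NS...StringEncoding constant. The macro and function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: value of kCFStringEncodingUTF7 as listed in CFStringEncodingExt.h. */
#define ASSUMED_CF_UTF7 0x04000100u
/* Assumption: bit set on NSStringEncoding values that have no dedicated NS constant. */
#define ASSUMED_NS_EXTERNAL_BIT 0x80000000u

/* Returns 1 if `observed` equals the NSStringEncoding expected for UTF-7
   under the two assumptions above, 0 otherwise. */
static int looks_like_ns_utf7(uint32_t observed) {
    return observed == (ASSUMED_NS_EXTERNAL_BIT | ASSUMED_CF_UTF7);
}

/* Hypothetical self-check using the value reported in this question. */
static void check_reported_value(void) {
    assert(looks_like_ns_utf7(2214592768u)); /* 2214592768 == 0x84000100 */
}
```

If those assumptions hold, 2214592768 is exactly 0x80000000 | 0x04000100, which would confirm UTF-7. On a device, comparing 'stringEncoding' against CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingUTF7) should give the same answer without hard-coding any constants.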

In this example the testString is always nil, so the conversion fails. But here are the strange things:

  1. When I remove just one 'a' from the testBuffer, the conversion works, the testString is created properly. When I add one or more 'a', it doesn't work.
  2. When I 'damage' the utf7 encoded symbol at the end (the only one in this example, '+ADw-'), it works fine. I can change it to '.ADw-' or '+ADw.' and the buffer is converted properly. Of course, the 'damaged' symbol is not decoded, it's just written literally but the conversion works. It produces "…aaaaa.ADw-" in NSString. I can also cut the buffer by 1, so I'll have "…aaaaa+ADw" and it will also be converted properly (as the UTF7 symbol is incomplete).
  3. When I add any ASCII character at the end of the buffer, after the UTF7 symbol, it works. For example, "…aaaaa+ADw-a" is converted into the NSString "…aaaaa<a".
  4. When the buffer contains more UTF7 symbols, the length at which it starts failing changes. So it's not always 508 or more characters.
  5. I can use any other UTF7 symbol at the end. It doesn't matter.
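For comparison with the cases above, here is a minimal sketch of a hand-written UTF-7 decoder in plain C (it also compiles as Objective-C). It shows what a successful conversion of the test buffers should look like. The helpers b64_value, utf7_to_utf16 and utf7_cstr_to_utf16 are hypothetical names, not Foundation API. The sketch handles only directly encoded ASCII, the "+-" escape and base64 runs, and it does no validation of malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of a base64 digit, or -1 if c is not one. */
static int b64_value(char c) {
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Decodes UTF-7 bytes into UTF-16 code units.
   Returns the number of units written, or (size_t)-1 if `out` is too small. */
static size_t utf7_to_utf16(const char *in, size_t in_len,
                            uint16_t *out, size_t out_cap) {
    size_t i = 0, n = 0;
    assert(in != NULL && out != NULL);
    while (i < in_len) {
        if (in[i] != '+') {                 /* directly encoded character */
            if (n == out_cap) return (size_t)-1;
            out[n++] = (unsigned char)in[i++];
            continue;
        }
        i++;                                /* skip the '+' shift character */
        if (i < in_len && in[i] == '-') {   /* "+-" stands for a literal '+' */
            if (n == out_cap) return (size_t)-1;
            out[n++] = '+';
            i++;
            continue;
        }
        uint32_t bits = 0;                  /* pending bits of the base64 run */
        int bit_count = 0;
        int v;
        while (i < in_len && (v = b64_value(in[i])) >= 0) {
            bits = (bits << 6) | (uint32_t)v;
            bit_count += 6;
            if (bit_count >= 16) {          /* one full UTF-16 unit is ready */
                if (n == out_cap) return (size_t)-1;
                bit_count -= 16;
                out[n++] = (uint16_t)(bits >> bit_count);
                bits &= (1u << bit_count) - 1u;
            }
            i++;
        }
        if (i < in_len && in[i] == '-') i++; /* absorb the '-' terminator */
    }
    return n;
}

/* Convenience wrapper for NUL-terminated input. */
static size_t utf7_cstr_to_utf16(const char *in, uint16_t *out, size_t out_cap) {
    return utf7_to_utf16(in, strlen(in), out, out_cap);
}
```

With this sketch, a buffer of 503 'a' characters followed by "+ADw-" (508 bytes in total) decodes to 504 code units, the last one being 0x003C ('<'), and a 'damaged' sequence such as ".ADw-" comes through literally. The resulting units could then be passed to something like initWithCharacters:length:, although I have not checked whether that sidesteps the failure on iOS 6.0.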

I've also tried replacing the initWithBytes: method with initWithCString. I didn't check every possible case, but in all the cases I tested it behaves the same as initWithBytes:. I ran my tests on iOS 6.0.

Do you have any ideas how to properly deal with UTF7 encoded strings?
