I am using Ruby 2.3:
I have the following string: "\xFF\xFE"
I do a File.binread() on a file containing it, so the encoding of this string is ASCII-8BIT. However, in my code, I check whether this string was indeed read by comparing it to the literal string "\xFF\xFE" (which has encoding UTF-8, as Ruby string literals do by default).
The comparison returns false, even though both strings contain the same bytes. The only difference is that one has encoding ASCII-8BIT and the other UTF-8.
I have two questions: (1) why does it return false? and (2) what is the best way to achieve what I want? I just want to check whether the string I read matches "\xFF\xFE".
When comparing strings, they either have to be in the same encoding or their characters must be encodable in US-ASCII.
Comparison works as expected if the string only contains byte values 0 to 127 (0b0xxxxxxx), and fails if it contains any byte values 128 to 255 (0b1xxxxxxx).
Your string can't be represented in US-ASCII, because both of its bytes (255 and 254) are outside that range.
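A minimal sketch of both cases, assuming a UTF-8 source file. The variable names are only illustrative, and String#b stands in for what File.binread would return:

```ruby
# Only bytes 0..127: the encodings differ, but the comparison still succeeds.
ascii_binary = "abc".b            # ASCII-8BIT copy
ascii_utf8   = "abc"              # UTF-8 literal
ascii_equal  = (ascii_binary == ascii_utf8)

# Bytes 128..255: same bytes, different encodings, so the comparison fails.
high_binary = "\xFF\xFE".b        # ASCII-8BIT, like the result of File.binread
high_utf8   = "\xFF\xFE"          # UTF-8 literal (not even valid UTF-8)
high_equal  = (high_binary == high_utf8)

# The raw bytes are nevertheless identical.
same_bytes = (high_binary.bytes == high_utf8.bytes)

puts ascii_equal   # true
puts high_equal    # false
puts same_bytes    # true
p high_binary.bytes # [255, 254]
```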
Attempting to convert it to US-ASCII doesn't produce any meaningful result.
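As a rough illustration of that conversion (the variable names are made up for this sketch): a plain encode raises, and asking Ruby to replace undefined bytes only yields placeholder characters.

```ruby
raw = "\xFF\xFE".b  # ASCII-8BIT, like the result of File.binread

# Without options, the conversion is rejected outright.
error_class =
  begin
    raw.encode("US-ASCII", "ASCII-8BIT")
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

# With undef: :replace, each unconvertible byte becomes "?".
replaced = raw.encode("US-ASCII", "ASCII-8BIT", undef: :replace)

p error_class  # Encoding::UndefinedConversionError
p replaced     # "??"
```

Either way, the original byte values are lost, so this is no basis for a comparison.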
The comparison will therefore return false whenever the other string is in a different encoding, regardless of its content.
You could compare your string to a string with the same encoding.
binread returns a string in ASCII-8BIT encoding, so you could use String#b to create a compatible one, or you could compare its bytes.
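Both options could look roughly like this. The temp file is only a hypothetical stand-in for your real file, and the variable names are assumptions made for the sketch:

```ruby
require "tempfile"

# Hypothetical stand-in for the real file: it starts with the bytes FF FE.
file = Tempfile.new("binread_example")
file.binmode
file.write("\xFF\xFE".b + "rest".b)
file.close

# Read only the first two bytes; the result has ASCII-8BIT encoding.
header = File.binread(file.path, 2)

# Option 1: compare against a literal converted to the same encoding.
same_encoding_match = (header == "\xFF\xFE".b)

# Option 2: compare the raw byte values, which ignores encodings entirely.
bytes_match = (header.bytes == [0xFF, 0xFE])

# For contrast, the comparison against the plain UTF-8 literal.
naive_match = (header == "\xFF\xFE")

file.unlink

puts same_encoding_match  # true
puts bytes_match          # true
puts naive_match          # false
```

The first option keeps the comparison readable as a string literal. The second makes it explicit that you are checking raw bytes rather than text.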