I am writing an Android app to write NFC tags, and I keep seeing examples like this:
    private NdefRecord createTextRecord(String content){
        try {
            byte[] language;
            language = Locale.getDefault().getLanguage().getBytes("UTF-8");
            final byte[] text = content.getBytes("UTF-8");
            final int languageSize = language.length;
            final int textLength = text.length;
            final ByteArrayOutputStream payload = new ByteArrayOutputStream(1 + languageSize + textLength);
            payload.write((byte) (languageSize & 0x1F)); // <----- LOOK HERE
            payload.write(language, 0, languageSize);
            payload.write(text, 0, textLength);
            return new NdefRecord(NdefRecord.TNF_WELL_KNOWN, NdefRecord.RTD_TEXT, new byte[0], payload.toByteArray());
        }
        catch (UnsupportedEncodingException e){
            Log.e("createNdefMessage",e.getMessage());
        }
        return null;
    }
Note the payload.write((byte) (languageSize & 0x1F)); part. What's up with that 0x1F bitmask? At first I thought the specification would only allow 5 bits to describe the length of the language code, but that doesn't make sense because we're writing a whole byte anyway.
See here and here for examples of the NDEF spec. And see here, and here for more examples of this mysterious 0x1F mask being used.
Am I missing something?
EDIT: Since I have answered my own question, and I'm not entirely sure if I am correct, if anyone else can provide a better explanation, or more insight, I will select your answer instead.
An NDEF Text Record is a version of the generic NDEF Record structure, characterized by the Type-Name-Format (TNF field) code 1 (well-known record type names assigned by the NFC Forum) and Type-Name (TYPE field) "T" (0x54).
For the NFC Forum Well-Known Type Name "T" the structure of the NDEF Record PAYLOAD is given by the "NFC Forum Text Record Type Definition" specification.
The Text Record payload consists of a status byte, followed by a variable-length language code and the actual text content, encoded as UTF-8 or UTF-16. The most significant bit of the status byte is 0 for UTF-8 and 1 for UTF-16 encoding. The next bit is reserved. The 6 least significant bits indicate the number of bytes occupied by the language code. The bit mask 0x1F selects only the 5 least significant bits of a byte and does not match the specification text; a mask covering all 6 length bits would be 0x3F. Furthermore, the subsequent line writes languageSize bytes without applying the same mask, so a language code longer than the masked length would produce an incorrect NDEF Text Record in which the tail of the language code becomes part of the text content.
As an example payload, the byte sequence 02656e48656c6c6f20576f726c64 starts with status byte 0x02 for the 2-byte language code "en" (0x65, 0x6e), followed by the UTF-8 encoded text "Hello World".
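To make the status byte layout concrete, here is a minimal plain-Java sketch that builds and parses a Text Record payload using a 6-bit length mask (0x3F). It leaves out the Android NdefRecord wrapper on purpose. The class name TextRecordPayload and its methods encode, decode and toHex are hypothetical names chosen for this illustration, not part of any Android or NFC Forum API. Instead of silently masking the length, it rejects language codes that do not fit in 6 bits, which is one possible way to avoid the mismatch described above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not an Android or NFC Forum API.
final class TextRecordPayload {
    // Bits 5..0 of the status byte carry the language code length.
    static final int LANGUAGE_LENGTH_MASK = 0x3F;
    // Bit 7 of the status byte: 0 = UTF-8, 1 = UTF-16.
    static final int UTF16_FLAG = 0x80;

    // Builds a UTF-8 Text Record payload: status byte, language code, text.
    static byte[] encode(String languageCode, String text) {
        byte[] language = languageCode.getBytes(StandardCharsets.US_ASCII);
        if (language.length > LANGUAGE_LENGTH_MASK) {
            // Fail loudly rather than truncate the length with a mask.
            throw new IllegalArgumentException("language code longer than 63 bytes");
        }
        byte[] textBytes = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream payload =
                new ByteArrayOutputStream(1 + language.length + textBytes.length);
        payload.write(language.length); // bit 7 = 0 (UTF-8), bit 6 = 0 (reserved)
        payload.write(language, 0, language.length);
        payload.write(textBytes, 0, textBytes.length);
        return payload.toByteArray();
    }

    // Parses a payload and returns { languageCode, text }.
    static String[] decode(byte[] payload) {
        if (payload.length < 1) {
            throw new IllegalArgumentException("empty payload");
        }
        int status = payload[0] & 0xFF;
        int languageLength = status & LANGUAGE_LENGTH_MASK;
        if (1 + languageLength > payload.length) {
            throw new IllegalArgumentException("language length exceeds payload");
        }
        Charset charset = (status & UTF16_FLAG) == 0
                ? StandardCharsets.UTF_8
                : StandardCharsets.UTF_16;
        String language = new String(payload, 1, languageLength, StandardCharsets.US_ASCII);
        int textOffset = 1 + languageLength;
        String text = new String(payload, textOffset, payload.length - textOffset, charset);
        return new String[] { language, text };
    }

    // Renders bytes as lowercase hex, for comparison with the example sequence.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }
}
```

With this sketch, TextRecordPayload.toHex(TextRecordPayload.encode("en", "Hello World")) yields 02656e48656c6c6f20576f726c64, the same byte sequence as the example, and decode on those bytes gives back "en" and "Hello World".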