I want to calculate the size of the Message objects in my project. I use protobuf to serialize and receive records, and I have run into a problem: I want to calculate the size of the Strings after deserialization. protobuf uses this method to read a String:
    /**
     * Read a {@code string} field value from the stream.
     * If the stream contains malformed UTF-8,
     * replace the offending bytes with the standard UTF-8 replacement character.
     */
    public String readString() throws IOException {
      final int size = readRawVarint32();
      if (size <= (bufferSize - bufferPos) && size > 0) {
        // Fast path:  We already have the bytes in a contiguous buffer, so
        //   just copy directly from it.
        final String result = new String(buffer, bufferPos, size, "UTF-8");
        bufferPos += size;
        return result;
      } else if (size == 0) {
        return "";
      } else {
        // Slow path:  Build a byte array first then copy it.
        return new String(readRawBytesSlowPath(size), "UTF-8");
      }
    }
After reading about UTF-8, char, and String in Java, I am confused. My content is pure ASCII, for example digits and standard Latin characters. Given that, how many bytes of memory does each character of my Strings consume?
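
To make the question concrete, here is a minimal sketch of the two numbers I am comparing. The class and method names (StringSizeSketch, utf8Length, utf16CharDataBytes) are my own and are not part of protobuf. It assumes each char is stored as one 2-byte UTF-16 code unit, which may not hold on JVMs with compact strings (Java 9+), where Latin-1 content can be stored as one byte per character, and it ignores object and array header overhead.

```java
import java.nio.charset.StandardCharsets;

final class StringSizeSketch {
    // Bytes the string occupies on the wire when encoded as UTF-8.
    // For pure ASCII content this is one byte per character.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes of character data ASSUMING every char is kept as a
    // 2-byte UTF-16 code unit (headers and JVM optimizations ignored).
    static int utf16CharDataBytes(String s) {
        return s.length() * Character.BYTES;
    }

    public static void main(String[] args) {
        String ascii = "abc123";
        System.out.println(utf8Length(ascii));         // prints 6
        System.out.println(utf16CharDataBytes(ascii)); // prints 12
    }
}
```

So the size read from the stream (the varint `size` in readString) counts UTF-8 bytes, while the in-memory figure depends on how the JVM stores the characters.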