I tried serializing instances of Byte and Integer and was shocked by how much space they took up when they were received on the other end. Why does an Integer, which holds only 4 bytes of data, take up over 10 times that many bytes when serialized? In C++, a final class has a 64-bit class identifier plus its contents. Going by that logic, I would expect a serialized Integer to take up 64 + 32 = 96 bits.
    import java.io.*;

    public class Test {
        public static void main(String[] ar) throws Exception {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutput out = new ObjectOutputStream(bos);
            // Integer.valueOf replaces the deprecated new Integer(32); the serialized size is the same
            out.writeObject(Integer.valueOf(32));
            byte[] yourBytes = bos.toByteArray();
            System.out.println("length: " + yourBytes.length + " bytes");
        }
    }
Output:
length: 81 bytes
Update:
    import java.io.*;
    import java.util.*;

    public class Test {
        public static void main(String[] args) throws IOException {
            {
                // A single Boolean in its own stream
                ByteArrayOutputStream bos1 = new ByteArrayOutputStream();
                ObjectOutput out1 = new ObjectOutputStream(bos1);
                out1.writeObject(new Boolean(false));
                byte[] yourBytes = bos1.toByteArray();
                System.out.println("1 Boolean length: " + yourBytes.length);
            }

            // 1000 distinct Boolean instances written to one stream
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutput out = new ObjectOutputStream(bos);
            for (int i = 0; i < 1000; ++i) {
                // new Boolean(...) is deprecated, but it is used on purpose here:
                // it creates a distinct instance on every call
                out.writeObject(new Boolean(true)); // first write: 47 bytes, each later one: 7 bytes
            }
            byte[] yourBytes = bos.toByteArray();
            System.out.println("1000 Booleans length: " + yourBytes.length); // 7040 bytes

            final int count = 1000;

            ArrayList<Boolean> listBoolean = new ArrayList<>(count);
            listBoolean.addAll(Collections.nCopies(count, Boolean.TRUE));
            System.out.printf("ArrayList: %d%n", sizeOf(listBoolean)); // 5096 bytes

            Boolean[] arrayBoolean = new Boolean[count];
            Arrays.fill(arrayBoolean, true);
            System.out.printf("Boolean[]: %d%n", sizeOf(arrayBoolean)); // 5083 bytes

            boolean[] array = new boolean[count];
            Arrays.fill(array, true);
            System.out.printf("boolean[]: %d%n", sizeOf(array)); // 1027 bytes

            BitSet bits = new BitSet(count);
            bits.set(0, count);
            System.out.printf("BitSet: %d%n", sizeOf(bits)); // 201 bytes
        }

        // Serializes obj into a fresh stream and returns the number of bytes written
        static int sizeOf(Serializable obj) throws IOException {
            ByteArrayOutputStream bytesOut = new ByteArrayOutputStream();
            ObjectOutputStream objsOut = new ObjectOutputStream(bytesOut);
            objsOut.writeObject(obj);
            return bytesOut.toByteArray().length;
        }
    }
Output:
1 Boolean length: 47 (47 bytes per Boolean)
1000 Booleans length: 7040 (7 bytes per Boolean)
ArrayList: 5096 (5 bytes per Boolean)
Boolean[]: 5083 (5 bytes per Boolean)
boolean[]: 1027 (1 byte per boolean)
BitSet: 201 (1/5 of 1 byte per boolean)
Though Radiodef has clarified why the serialized object is so large, I would like to add one more point so we don't forget an optimization built into Java's serialization algorithm (and into almost all such algorithms).
When you write another Integer object (or any object whose class has already been written) to the same stream, the size does not grow by the same amount again. In other words, two Integers do not take 81 * 2 = 162 bytes.
The way it works is that the first time an instance of a class is serialized, the stream writes information about the whole class: the class name plus the name and type of each field in the class. That is why the byte count is so high. This is done mainly so that class evolution can be handled properly.
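To see that class information in the stream, here is a minimal sketch that serializes one Integer and prints the bytes with the non-printable ones replaced by dots. The class name `DescriptorPeek` and its helper methods are made up for this illustration; only the `java.io` calls are real API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class DescriptorPeek {
    /** Serializes one object into a fresh stream and returns the raw bytes. */
    static byte[] serialize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Renders the bytes as text, replacing every non-printable byte with '.'. */
    static String printable(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length);
        for (byte b : data) {
            sb.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = serialize(Integer.valueOf(32));
        System.out.println(data.length + " bytes");
        // The class name, the field name and the superclass name are readable in the dump
        System.out.println(printable(data));
    }
}
```

In the dump you should be able to spot `java.lang.Integer`, the field name `value`, and the superclass `java.lang.Number`. Those names, together with the serialVersionUID of each class, account for most of the bytes; the int itself is only the last 4.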
While it writes the class metadata for the first time, it also caches that information in a local table (called the value-cache or indirection table). So the next time another instance of the same class is serialized (remember that the cache applies only per stream, and only until reset() is called), it writes just a marker (a 4-byte handle pointing back to the metadata already written) in place of the metadata, so the size is much smaller.
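A minimal sketch of that effect, assuming three distinct Integer values written to a single stream with reset() called before the third one. The class name `DescriptorReuse` and the method `sizesAfterEachWrite` are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

public class DescriptorReuse {
    /**
     * Writes three distinct Integers to one stream, calling reset() before
     * the third, and returns the cumulative stream size after each write.
     */
    static int[] sizesAfterEachWrite() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int[] sizes = new int[3];
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(Integer.valueOf(32)); // full class metadata is written here
            out.flush();
            sizes[0] = bytes.size();

            out.writeObject(Integer.valueOf(33)); // metadata is replaced by a back-reference
            out.flush();
            sizes[1] = bytes.size();

            out.reset();                          // the stream forgets what it has written
            out.writeObject(Integer.valueOf(34)); // so the metadata is written again
            out.flush();
            sizes[2] = bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sizes;
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterEachWrite();
        System.out.println("first Integer:  " + sizes[0] + " bytes");
        System.out.println("second Integer: +" + (sizes[1] - sizes[0]) + " bytes");
        System.out.println("after reset():  +" + (sizes[2] - sizes[1]) + " bytes");
    }
}
```

The second write should add only a handful of bytes, while the write after reset() should cost nearly as much as the first one. The same effect explains the 1000 Booleans in the question: 47 bytes for the first one plus 7 bytes for each of the other 999 gives 7040.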