I have recently taken on a task where it was suggested to use Protobuf to serialize an object to be written out as a base64 string. At the moment this means protobuf-net, the .NET port. The prior method of storing this data was a series of bit masks, but that has been outgrown, and this was the suggested route. Unfortunately, the data written out by this approach is simply too large for my purposes.
In code, the object that I am serializing looks like this. I've tried both decorated POCOs and classes generated by ProtoGen; the ProtoGen-generated classes actually serialized to even larger output.
Obj
- Time
- List of pairs; each pair that must be recorded is:
Pair = [Key | Time]
Looking at the output, and at the way the size grows as the list length grows, I suspect some of the size comes from storing type information about the class. I tried storing the pairs in parallel arrays and marking them as "Packed" (rough sketch below), but I'm not seeing much of a size improvement, maybe 10%-15%. As it stands this is an order of magnitude larger than the previous storage method; however, the old method won't work any more because we are running out of key space.
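For reference, the parallel-array variant I tried looks roughly like this (the class and member names here are placeholders, not my real types):

[ProtoContract]
public class FooPacked
{
    public FooPacked()
    {
        Ids = new List<uint>();
        Times = new List<uint>();
    }
    [ProtoMember(1)]
    public uint Time { get; set; }
    // IsPacked = true writes each list as one length-prefixed run of varints
    // instead of repeating the field tag for every element.
    [ProtoMember(2, IsPacked = true)]
    public List<uint> Ids { get; set; }
    [ProtoMember(3, IsPacked = true)]
    public List<uint> Times { get; set; }
}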
My question is: besides simply making the key space larger by adding a few more bits to the old method, is there a way to optimize Protobuf for size that I may be missing? Or perhaps an alternative way to serialize rather simple objects that is optimized for size?
I haven't tried it yet, but from what I'm reading, even GZipping the current data would only yield marginal improvements. I'll be benchmarking that as an option next.
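The benchmark itself will be nothing fancy, something along these lines (using System.IO, System.IO.Compression, and the protobuf-net Serializer; foo stands in for a populated instance of the sample class below):

byte[] raw;
using (var ms = new MemoryStream())
{
    Serializer.Serialize(ms, foo);
    raw = ms.ToArray();
}
byte[] zipped;
using (var ms = new MemoryStream())
{
    using (var gz = new GZipStream(ms, CompressionMode.Compress))
    {
        gz.Write(raw, 0, raw.Length);
    }
    zipped = ms.ToArray();
}
Console.WriteLine("raw: {0} bytes, gzipped: {1} bytes", raw.Length, zipped.Length);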
Sample classes:
[ProtoContract]
public class Foo : BaseOfFoo
{
    public Foo()
    {
        // Initialized in the constructor because the setter is private.
        KeywordValues = new List<ValuePair>();
    }
    [ProtoMember(1)]
    public UInt32 Time { get; set; }
    [ProtoMember(2)]
    public List<ValuePair> KeywordValues { get; private set; }
}
[Serializable]
[ProtoContract]
public class ValuePair
{
    [ProtoMember(1)]
    public UInt32 Id { get; set; }
    [ProtoMember(2)]
    public UInt32 Time { get; set; }
}
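And for completeness, the object currently ends up as a base64 string roughly like this (simplified from the real code):

public static string ToBase64(Foo foo)
{
    using (var ms = new MemoryStream())
    {
        Serializer.Serialize(ms, foo);
        return Convert.ToBase64String(ms.ToArray());
    }
}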