The current design
I am refactoring some existing API code that returns a feed of events for a user. The API is a normal RESTful API, and the current implementation simply queries a DB and returns a feed.
The code is long and cumbersome, so I've decided to move the feed generation to a microservice that will be called from the API server.
The new design
For the sake of decoupling, I thought that the data may move back and forth from the API server to the microservice as Protobuf objects. This way, I can change the programming language on either end and still enjoy the type safety and slim size of protobuf.
The problem
The feed contains multiple types (e.g. likes, images and voice messages). In the future, new types can be added. They all share a few properties (timestamp and title, for instance), but other than that they might be completely different.
In classic OOP, the solution is simple - a base FeedItem class from which all feed items inherit, and a Feed class which contains a sequence of FeedItem classes.
How do I express the notion of Polymorphism in Protocol Buffers 3, or at least enable different types of messages in a list?
What I have checked
Oneof: "A oneof cannot be repeated".
Any: Too broad (like Java's List<Object>).
The answer for serialization protocols is to use discriminator-based polymorphism. Traditional object-oriented inheritance is a form of that with some very bad characteristics. In newer specifications like OpenAPI the concept is a bit cleaner.
Let me explain how this works with proto3.
First you need to declare your polymorphic types. Suppose we go for the classic animal species problem, where different species have different properties. We first need to define a root type for all animals that will identify the species. Then we declare Cat and Dog messages that extend the base type. Note that the discriminator species is present in all three:
Here is a simple Java test to demonstrate how things work in practice:
The whole trick is that proto3 bindings preserve fields they do not understand and write them back out on serialization (this holds from protobuf release 3.5 onward; earlier proto3 releases discarded unknown fields). In this way one can implement a proto3 cast (convert) that changes the type of an object without losing data.
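To make the mechanism concrete, here is a minimal sketch of unknown-field preservation. It is a toy model written from scratch and does not use the protobuf library: it handles varint fields only, and the names UnknownFieldDemo, Msg, parse and the field numbers are hypothetical, chosen just for illustration. The "cast" is simply serializing as one type and parsing the bytes as another.

```java
import java.io.ByteArrayOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: varint fields, no protobuf library involved.
final class UnknownFieldDemo {

    /** A message as seen by one type: fields it declares, plus fields it merely carries along. */
    static final class Msg {
        final Map<Integer, Long> known = new LinkedHashMap<>();
        final Map<Integer, Long> unknown = new LinkedHashMap<>();

        /** Serializes declared fields first, then the preserved unknown ones. */
        byte[] toBytes() {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            writeFields(out, known);
            writeFields(out, unknown);
            return out.toByteArray();
        }
    }

    /** Parses bytes as the type declaring {@code declared}; other fields are kept, not dropped. */
    static Msg parse(byte[] data, Set<Integer> declared) {
        Msg msg = new Msg();
        int[] pos = {0};
        while (pos[0] < data.length) {
            int fieldNumber = (int) (readVarint(data, pos) >>> 3); // wire type 0 assumed
            long value = readVarint(data, pos);
            (declared.contains(fieldNumber) ? msg.known : msg.unknown).put(fieldNumber, value);
        }
        return msg;
    }

    static void writeFields(ByteArrayOutputStream out, Map<Integer, Long> fields) {
        fields.forEach((number, value) -> {
            writeVarint(out, number.longValue() << 3); // key = (field number << 3) | wire type 0
            writeVarint(out, value);
        });
    }

    /** Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte. */
    static void writeVarint(ByteArrayOutputStream out, long v) {
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.write((int) v);
    }

    static long readVarint(byte[] data, int[] pos) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = data[pos[0]++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalArgumentException("malformed varint");
    }

    public static void main(String[] args) {
        Set<Integer> catFields = Set.of(1, 2); // hypothetical: 1 = species, 2 = Cat-only field
        Set<Integer> dogFields = Set.of(1, 3); // hypothetical: 1 = species, 3 = Dog-only field

        Msg cat = new Msg();
        cat.known.put(1, 1L);
        cat.known.put(2, 9L);

        Msg dog = parse(cat.toBytes(), dogFields);          // the "cast"
        System.out.println(dog.known + " " + dog.unknown);  // {1=1} {2=9}

        Msg catAgain = parse(dog.toBytes(), catFields);     // cast back: nothing was lost
        System.out.println(catAgain.known.equals(cat.known)); // true
    }
}
```

With generated protobuf classes the equivalent step would be parsing one message's serialized bytes with the other message's parser; the sketch only shows why the Cat-only field survives the detour through Dog.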
Note that the "proto3 cast" is a very unsafe operation and should only be applied after proper checks of the discriminator are made. You can cast a cat to a dog without a problem in my example. The code below fails:
Even when the property types at the same index match, semantic errors are possible. In my example, index 10 is an int64 in Dog and a string in Cat; proto3 treats them as different fields because their wire types differ. In other cases, for instance where one type is a string and the other a nested message, proto3 may actually throw exceptions or produce complete garbage.
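The wire-level difference can be worked out by hand. In the protobuf encoding each field is preceded by a key computed as (field number << 3) | wire type, where int64 uses wire type 0 (varint) and string uses wire type 2 (length-delimited). A small sketch of that arithmetic for index 10 (the class and method names here are hypothetical):

```java
// Sketch of how a protobuf field key is derived; not part of any library.
final class WireKeys {
    static final int WIRETYPE_VARINT = 0;            // int32, int64, bool, enum
    static final int WIRETYPE_LENGTH_DELIMITED = 2;  // string, bytes, nested messages

    /** Key written before each field value: (field number << 3) | wire type. */
    static int key(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    public static void main(String[] args) {
        System.out.printf("index 10 as int64:  0x%02X%n", key(10, WIRETYPE_VARINT));           // 0x50
        System.out.printf("index 10 as string: 0x%02X%n", key(10, WIRETYPE_LENGTH_DELIMITED)); // 0x52
    }
}
```

The two keys differ, so a parser expecting one will not mistake it for the other. A string and a nested message, however, both use wire type 2 and therefore produce the same key, which is why that combination can end in exceptions or garbage.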