I am using Protobuf (v3.5.1) in a Python project I'm working on. My situation can be simplified to the following:
// Proto file
syntax = "proto3";
message Foo {
  Bar bar = 1;
}
message Bar {
  bytes lotta_bytes_here = 1;
}
# Python excerpt
def MakeFooUsingBar(bar):
    foo = Foo()
    foo.bar.CopyFrom(bar)  # copies bar's contents into foo.bar
    return foo
I am worried about the memory cost of .CopyFrom() (if I am correct, it copies the contents rather than the reference). In C++, I could use something like:
Foo foo;
Bar* bar = new Bar();
bar->set_lotta_bytes_here("abcd");
foo.set_allocated_bar(bar);
Judging by the generated source, this does not need to copy anything:
inline void Foo::set_allocated_bar(::Bar* bar) {
  ::google::protobuf::Arena* message_arena = GetArenaNoVirtual();
  if (message_arena == NULL) {
    delete bar_;
  }
  if (bar) {
    ::google::protobuf::Arena* submessage_arena = NULL;
    if (message_arena != submessage_arena) {
      bar = ::google::protobuf::internal::GetOwnedMessage(
          message_arena, bar, submessage_arena);
    }
  } else {
  }
  bar_ = bar;
  // @@protoc_insertion_point(field_set_allocated:Foo.bar)
}
Is there something similar available in Python? I have looked through the Python generated sources, but found nothing applicable.
When it comes to large string or bytes objects, Protobuf seems to handle the situation fairly well. My test shows that while a new Bar object is created, the byte array is copied by reference (Python bytes are immutable, so this makes sense).

This solves my issue with a large bytes object. However, if someone's problem lies in nested or repeated fields, this will not help: such fields are copied field by field. That does make sense: whoever copies a message wants the two to be independent. If they were not, changes to the original message would modify the copy (and vice versa).

If there were anything akin to the C++ move semantics (https://github.com/google/protobuf/issues/2791) or set_allocated_...() in Python protobuf, that would solve it; however, I am not aware of such a feature.
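To illustrate the reasoning about immutable versus mutable fields, here is a plain-Python analogy using only the standard library. It is a sketch, not protobuf code: the Bar class, its items field, and copy_from() are hypothetical stand-ins I made up for the example, and it says nothing about how a particular protobuf implementation stores its fields. It only shows why sharing an immutable bytes object between two copies is safe, while a nested mutable container has to be duplicated to keep the copies independent.

```python
import copy


class Bar:
    """Hypothetical stand-in for a message; NOT the protobuf-generated class."""

    def __init__(self, lotta_bytes_here=b"", items=None):
        self.lotta_bytes_here = lotta_bytes_here
        self.items = list(items or [])  # assumed "repeated"-like field

    def copy_from(self, other):
        # Deep-copy each field so that the two objects end up independent.
        # copy.deepcopy returns immutable bytes objects unchanged, so the
        # large payload is shared; the nested lists are really duplicated.
        self.lotta_bytes_here = copy.deepcopy(other.lotta_bytes_here)
        self.items = copy.deepcopy(other.items)


src = Bar(b"x" * 1_000_000, [[1, 2], [3]])
dst = Bar()
dst.copy_from(src)

# The big bytes payload is the very same object in both instances.
shares_bytes = dst.lotta_bytes_here is src.lotta_bytes_here
# The mutable container is a distinct object in the copy.
shares_items = dst.items is src.items

# Mutating the source afterwards must not leak into the copy.
src.items[0].append(99)

print(shares_bytes, shares_items, dst.items)
```

Sharing the bytes costs nothing in terms of independence, because nobody can change them in place. The list of lists is different: had it been shared, the append on src would have shown up in dst as well.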