google.protobuf validation - python

I am trying to implement a validation for protobuf files in Python. I don't want to use an external package.

I tried:

from google.protobuf import descriptor_pb2, descriptor_pool
from google.protobuf.message_factory import MessageFactory

# `descriptor` is assumed to hold the serialized FileDescriptorSet as a str
desc_set = descriptor_pb2.FileDescriptorSet()
descriptor_bytes = str.encode(descriptor)
desc_set.ParseFromString(descriptor_bytes)

# Register every file in the set with a fresh pool
pool = descriptor_pool.DescriptorPool()
for fd in desc_set.file:
    pool.Add(fd)

# Look the message up by its fully qualified name (package.Message)
proto_msg = MessageFactory(pool).GetPrototype(
    pool.FindMessageTypeByName("employees.Employees"))
proto_msg.FromString(bytearray(b'\n\x05Adnan'))

The proto file for the message parsed in the last line is:

syntax = "proto2";

package employees;

message Employees {
    required string Name = 1;
    required int32 age = 2;
}

I pass bytearray(<msg>) to the FromString() method.

I expect a parse error, because the age field is missing from the message and the proto file marks it as required.

How can I get a parse error in this case?

There is 1 solution below

Your code wasn't reproducible and didn't work for me as-is. My environment:

Python 3.10.6

grpcio==1.50.0
grpcio-tools==1.50.0
protobuf==4.21.9

However:

Renaming the field Name to name (the Protobuf style guide recommends lower_snake_case field names):

syntax = "proto2";

package employees;

message Employees {
    required string name = 1;
    required int32 age = 2;
}

And:

# Generate the Python module employees_pb2.py
python3 \
  -m grpc_tools.protoc \
  --proto_path=${PWD} \
  --python_out=${PWD} \
  ${PWD}/employees.proto

# Generate the descriptor set employees.pb
python3 \
  -m grpc_tools.protoc \
  --include_imports \
  --include_source_info \
  --proto_path=${PWD} \
  --descriptor_set_out=${PWD}/employees.pb \
  ${PWD}/employees.proto

And:

import google.protobuf.descriptor_pb2 as descriptor_pb2
import google.protobuf.descriptor_pool as descriptor_pool
from google.protobuf.message_factory import MessageFactory

import employees_pb2

# Create a new Employees
e1 = employees_pb2.Employees(
    name="Adnan",
    age=21,
)

# Serialize
s = e1.SerializeToString()
print(s)

# Create by parsing serialized string
e2 = employees_pb2.Employees()
e2.ParseFromString(s)
print(e2)

# Create by parsing example serialized string
# No error is thrown
e3 = employees_pb2.Employees()
e3.ParseFromString(b'\n\x05Adnan')
print(e3)

Yields:

b'\n\x05Adnan\x10\x15'

name: "Adnan"
age: 21

name: "Adnan"

NOTE: ParseFromString raises no error here, even though the required age field is missing.
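To see why the second payload carries no age, the bytes can be decoded by hand. This is only a minimal sketch of the wire format using the standard library. The helper names read_varint and decode_fields are hypothetical (they are not part of the protobuf API), and only the two wire types used by this message are handled: 0 (varint) and 2 (length-delimited).

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def decode_fields(buf):
    """Return {field_number: raw_value}, handling wire types 0 and 2 only."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint, e.g. int32
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited, e.g. string
            length, pos = read_varint(buf, pos)
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields[number] = value
    return fields


print(decode_fields(b"\n\x05Adnan\x10\x15"))  # fields 1 (name) and 2 (age)
print(decode_fields(b"\n\x05Adnan"))          # field 1 only, age is absent
```

The trailing \x10\x15 in the full payload is the key for field 2 with wire type 0, followed by the varint 21. The shorter payload simply stops after the name.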

I'm unfamiliar with using Python to work with descriptors, but loading the descriptor set generated above into a pool works like this:

# Load the descriptor set generated by protoc
with open("employees.pb", mode="rb") as file:
    descriptor = file.read()

desc_set = descriptor_pb2.FileDescriptorSet()
desc_set.ParseFromString(descriptor)

# Register every file in the set with a fresh pool
pool = descriptor_pool.DescriptorPool()
for fd in desc_set.file:
    pool.Add(fd)

# Build the message class from its fully qualified name
employees = MessageFactory(pool).GetPrototype(
    pool.FindMessageTypeByName("employees.Employees"))

# Parse the example bytes
e4 = employees()
e4.ParseFromString(b'\n\x05Adnan')
print(e4)

Yields:

name: "Adnan"

NOTE: ParseFromString raises no error here either.

It seems that ParseFromString doesn't check that required fields are present.
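If the goal is an error for missing required fields, one option is to check the message right after parsing. Below is a minimal sketch, assuming the message's IsInitialized() and FindInitializationErrors() methods behave as documented for proto2 required fields. The helper name parse_strict is hypothetical. The descriptor is built in code only to keep the sketch self-contained; it stands in for loading employees.pb. The fallback between GetMessageClass and GetPrototype is there because the available call depends on the protobuf release.

```python
from google.protobuf import descriptor_pb2, descriptor_pool, message_factory
from google.protobuf.message import DecodeError

# Describe employees.proto in code so the sketch needs no protoc step;
# loading employees.pb into a FileDescriptorSet would serve the same purpose.
F = descriptor_pb2.FieldDescriptorProto
file_proto = descriptor_pb2.FileDescriptorProto(
    name="employees.proto", package="employees", syntax="proto2")
message_proto = file_proto.message_type.add(name="Employees")
message_proto.field.add(
    name="name", number=1, type=F.TYPE_STRING, label=F.LABEL_REQUIRED)
message_proto.field.add(
    name="age", number=2, type=F.TYPE_INT32, label=F.LABEL_REQUIRED)

pool = descriptor_pool.DescriptorPool()
pool.Add(file_proto)
descriptor = pool.FindMessageTypeByName("employees.Employees")

# GetMessageClass exists in newer protobuf releases; older ones (such as
# 4.21.9) go through MessageFactory.GetPrototype instead.
if hasattr(message_factory, "GetMessageClass"):
    Employees = message_factory.GetMessageClass(descriptor)
else:
    Employees = message_factory.MessageFactory(pool).GetPrototype(descriptor)


def parse_strict(message_class, data):
    """Parse data, raising DecodeError if any required field is unset."""
    message = message_class()
    message.ParseFromString(data)
    if not message.IsInitialized():
        missing = ", ".join(message.FindInitializationErrors())
        raise DecodeError(f"missing required fields: {missing}")
    return message


# Complete payload: parses normally
print(parse_strict(Employees, b"\n\x05Adnan\x10\x15"))

# Payload without age: rejected by the explicit check
try:
    parse_strict(Employees, b"\n\x05Adnan")
except DecodeError as err:
    print("rejected:", err)
```

This uses only the protobuf package itself, so it stays within the "no external package" constraint from the question. Raising DecodeError is a design choice of the sketch, so that callers can handle malformed and incomplete payloads in one except clause.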