I'm trying to get the raw protobuf messages out of a file created by a program that uses protobuf. I don't own the source program, but I'd be happy with the output from protoc --decode_raw. Unfortunately this doesn't work: I get the "Failed to parse input" error. I believe this is because there is a header before the protobuf data in the file. The source program is DOTA2, and the files start like this:
50 42 44 45 4d 53 32 00 f0 54 0e 03 f7 bf 0d 03
01 ff ff ff ff 0f 7b 0a 08 50 42 44 45 4d 53 32
00 10 2c 1a 2e 56 61 6c 76 65 20 44 6f 74 61 20
32 20 45 55 20 4e 6f 72 74 68 20 53 65 72 76 65
72 20 28 73 72 63 64 73 31 32 35 2e 31 38 35 2e
36 34 29 22 0d 53 6f 75 72 63 65 54 56 20 44 65
6d 6f 2a 05 73 74 61 72 74 32 1f 2f 6f 70 74 2f
73 72 63 64 73 2f 64 6f 74 61 2f 64 6f 74 61 5f
76 31 38 33 31 2f 64 6f 74 61 38 02 40 01 48 01
52 00 08 ff ff ff ff 0f 11 08 01 10 01 1a 0b 44
02 82 e8 01 08 00 0a 00 0c 00 08 ff ff ff ff 0f
16 08 02 10 02 1a 10 d3 34 28 14 cc d1 85 c9 d1
41 e0 b3 46 47 06 20 08 ff ff ff ff 0f e7 03 08
03 10 03 1a e0 03 98 70 0f 20 d4 26 40 10 60 04
80 04 c0 b0 f5 e0 59 d1 48 40 01 61 91 17 80 01
b4 25 22 22 f4 c8 7d bc bc c1 d1 bd cc c9 8d 91
cd bd 90 bd d1 85 bd 90 bd d1 85 7d d9 c5 e0 cc
c4 bc 90 bd d1 85 e9 15 cc d1 85 c9 d1 29 06 20
4c bd d5 c9 8d 95 51 59 49 06 00 68 06 ac 06 20
I had no knowledge of or experience with protobuf before trying to decode this file, so I'm a bit unsure where the header ends and the protobuf message begins. I'm pretty confident that everything up to "..v1831/dota8" is part of the header, but deleting it still gives me the "Failed to parse input" error.
I've looked all over the net for specifics about this type of file (it's a DOTA2 demo download). Other people have written programs that do this sort of task, but I can't find clear-cut information on the header length. I'm doing this partly to learn about protobuf, so using one of those applications isn't really what I'm looking for.
For reference, I intend to eventually get this working in vb.net, and hence I'm using protobuf-net (I don't think it's relevant to the question/answer, but it's here just in case).
The first 33 bytes are the header. The header starts and ends with the 8-byte sequence "PBDEMS2\0" (including the \0, aka NUL byte). The protobuf data starts immediately after the second "PBDEMS2\0" (the byte after the NUL byte).
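A minimal sketch of that split in Python, assuming the header really is everything up to and including the second "PBDEMS2\0". The function name `split_header` is made up for illustration, and the sample bytes are the first three rows of the dump in the question.

```python
MAGIC = b"PBDEMS2\x00"  # 8 bytes, including the trailing NUL


def split_header(data):
    """Split data into (header, rest) right after the second MAGIC marker."""
    if not data.startswith(MAGIC):
        raise ValueError("data does not start with PBDEMS2\\0")
    # Look for the second marker, starting just past the first one.
    second = data.find(MAGIC, len(MAGIC))
    if second < 0:
        raise ValueError("second PBDEMS2\\0 marker not found")
    end = second + len(MAGIC)
    return data[:end], data[end:]


# First three rows of the hex dump from the question.
sample = bytes.fromhex(
    "50 42 44 45 4d 53 32 00 f0 54 0e 03 f7 bf 0d 03 "
    "01 ff ff ff ff 0f 7b 0a 08 50 42 44 45 4d 53 32 "
    "00 10 2c 1a 2e 56 61 6c 76 65 20 44 6f 74 61 20"
)

header, rest = split_header(sample)
print(len(header))     # 33
print(rest[:2].hex())  # 102c
```

Searching for the marker rather than hard-coding 33 is a design choice for the sketch: it keeps working if the bytes between the two markers turn out to vary in length between files.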
You only provided the beginning of the data, so when I try to feed it to protoc --decode_raw it still errors out because the data ends prematurely. But decoding the data manually, it seems to start out as a run of fields numbered 2 through 10, followed by a field 1 and another field 2.
However, this is suspicious: notice that the field numbers were increasing sequentially and then suddenly started back from 1. Also notice that field 2 has a different type the second time. I think what we're seeing here is actually multiple messages, probably of different types, back-to-back.
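The manual decoding can be reproduced with a few lines of Python. This is only a sketch of the generic protobuf wire format (each key is a varint holding field number << 3 | wire type), assuming the data starts right after the 33-byte header. `read_varint` and `walk_fields` are names made up here, and the sample is the bytes from the question, from offset 33 up to the end of the second field 2.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]  # raises IndexError if the data is cut off mid-varint
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def walk_fields(buf):
    """Yield (field_number, wire_type, value) until the data runs out."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (strings, bytes, submessages)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field, wire_type, value


# Bytes from the question's dump, starting right after the 33-byte header.
sample = bytes.fromhex(
    "10 2c 1a 2e 56 61 6c 76 65 20 44 6f 74 61 20 "
    "32 20 45 55 20 4e 6f 72 74 68 20 53 65 72 76 65 "
    "72 20 28 73 72 63 64 73 31 32 35 2e 31 38 35 2e "
    "36 34 29 22 0d 53 6f 75 72 63 65 54 56 20 44 65 "
    "6d 6f 2a 05 73 74 61 72 74 32 1f 2f 6f 70 74 2f "
    "73 72 63 64 73 2f 64 6f 74 61 2f 64 6f 74 61 5f "
    "76 31 38 33 31 2f 64 6f 74 61 38 02 40 01 48 01 "
    "52 00 08 ff ff ff ff 0f 11 08 01 10 01 1a 0b 44 "
    "02"
)

fields = list(walk_fields(sample))
for field, wire_type, value in fields:
    print(field, wire_type, value)
```

On these bytes the field numbers run 2 through 10 and then restart at 1, and the second field 2 has wire type 1 (fixed 64-bit) where the first had wire type 0 (varint).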
Unfortunately it's hard to say how you're supposed to tell where one message ends and the next begins. Probably that header tells you, but I don't know how to decode it. The header doesn't seem to be protobuf-format.