Why doesn't bincode detect an error if I deserialize data into the wrong type?

Why don't I get an error from bincode when I try to deserialize binary data into the wrong type?

use bincode; // 1.3.1
use serde::{Deserialize, Serialize}; // { version = "1.0", features = ["derive"] }

#[derive(Serialize, Deserialize, Copy, Clone, Debug)]
pub struct Ping {
    pub pinger_id: usize,
    pub useless_field: usize,
    pub i_need_one_more: usize,
}

#[derive(Serialize, Deserialize, Copy, Clone, Debug)]
pub struct Heartbeat {
    pub term: usize,
    pub node_id: usize,
}

#[derive(Serialize, Deserialize, Clone, Debug)]
pub enum Message {
    Heartbeat(Heartbeat),
    Ping(Ping),
}

fn main() {
    let rpc_message_bin = bincode::serialize(&Ping {
        pinger_id: 0,
        useless_field: 1,
        i_need_one_more: 2,
    })
    .unwrap();
    let m: Message = bincode::deserialize(&rpc_message_bin).unwrap();

    println!("{:#?}", m);
}

I was expecting to get a Message::Ping but I get:

Heartbeat(
    Heartbeat {
        term: 4294967296,
        node_id: 8589934592,
    },
)

There is 1 best solution below

bincode trusts the user to deserialize into the expected type. What you are doing gives a "random" result: it is safe, but what you get depends on implementation details.

The following is only an illustration and the details COULD be off, but the logic holds. How an enum is represented is an implementation detail; bincode assumes every variant can be identified by an unsigned integer index, and it chooses to encode that index as a u32: "enums variants are encoded as a u32 instead of a usize. u32 is enough for all practical uses." This does not matter from a user's point of view (apart from the "limitation" of at most 2**32 variants per enum).

So here is what bincode does. In your code, you ask bincode to encode a Ping structure, NOT the variant Message::Ping.

This means the encoded buffer contains 3 usize values, matching the Ping structure (bincode writes each usize as 8 bytes). Then you ask bincode to interpret this data as a Message enum. bincode first reads a u32 from the buffer, which here is 0, and 0 happens to be the index of the first variant of the Message enum. So bincode thinks "OK, I'm reading a Message::Heartbeat" and reads 2 more usize values to fill the Heartbeat structure. Because the u32 consumed only 4 bytes, every following 8-byte read is offset by 4 bytes, so bincode does not read 1 and 2 but 1 << 32 and 2 << 32.

This means the encoded buffer looks something like this:

[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0]
 ^ first usize        $  ^ second usize       $  ^ last usize         $
 ^ u32    $  ^ first usize        $  ^ second usize       $  ^ unread $

From bincode's point of view this is perfectly valid. bincode is meant to be used with a reader, so the reader's cursor would simply still have 4 bytes left to read.
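The misread can be reproduced without bincode at all. The following is a minimal sketch using only the standard library. It assumes the layout described above (little-endian, each usize written as 8 bytes, the variant index read as a 4-byte u32); the helper names encode_ping, read_u32, read_u64 and read_as_message are made up for this illustration and are not part of bincode.

```rust
/// Lay out three 64-bit little-endian integers back to back, the way the
/// encoded `Ping` is described above (assumption: `usize` is written as u64).
fn encode_ping(pinger_id: u64, useless_field: u64, i_need_one_more: u64) -> Vec<u8> {
    let mut buf = Vec::with_capacity(24);
    for field in [pinger_id, useless_field, i_need_one_more] {
        buf.extend_from_slice(&field.to_le_bytes());
    }
    buf
}

/// Read 4 bytes starting at `at` as a little-endian u32.
fn read_u32(buf: &[u8], at: usize) -> u32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&buf[at..at + 4]);
    u32::from_le_bytes(raw)
}

/// Read 8 bytes starting at `at` as a little-endian u64.
fn read_u64(buf: &[u8], at: usize) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&buf[at..at + 8]);
    u64::from_le_bytes(raw)
}

/// Interpret the buffer as "u32 variant index, u64, u64" and also report
/// how many bytes were never read. Hypothetical helper, not bincode's code.
fn read_as_message(buf: &[u8]) -> (u32, u64, u64, usize) {
    let tag = read_u32(buf, 0);
    let term = read_u64(buf, 4);
    let node_id = read_u64(buf, 12);
    (tag, term, node_id, buf.len() - 20)
}

fn main() {
    let buf = encode_ping(0, 1, 2);
    let (tag, term, node_id, unread) = read_as_message(&buf);
    // prints: tag=0 term=4294967296 node_id=8589934592 unread=4
    println!("tag={} term={} node_id={} unread={}", tag, term, node_id, unread);
}
```

The two large numbers are exactly 1 << 32 and 2 << 32, the values from the question's output, and the last 4 bytes are simply never looked at.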

We can play with this a bit. If you change the encoded value slightly, to pinger_id: usize::MAX, you get an error message:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Custom("invalid value: integer `4294967295`, expected variant index 0 <= i < 2")', src\main.rs:31:61

We could also change the first field of Ping from usize to u32:

#[derive(Serialize, Deserialize, Copy, Clone, Debug)]
pub struct Ping {
    pub pinger_id: u32,
    pub useless_field: usize,
    pub i_need_one_more: usize,
}

and now encoding with these values:

    let rpc_message_bin = bincode::serialize(&Ping {
        pinger_id: 0,
        useless_field: 1,
        i_need_one_more: 2,
    })
    .unwrap();

results in 1 and 2 being read:

Heartbeat(
    Heartbeat {
        term: 1,
        node_id: 2,
    },
)

If the Ping structure is too small:

#[derive(Serialize, Deserialize, Copy, Clone, Debug)]
pub struct Ping {
    pub pinger_id: usize,
}

bincode returns an error saying data is missing:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Kind(UnexpectedEof))', src\main.rs:27:61
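The three experiments above can be modelled with a small hand-rolled decoder. This is only a sketch of the logic under the same layout assumptions (little-endian, u32 variant index, 8-byte integers); DecodeError, take and decode_message are hypothetical names for this example, not bincode's types or functions, and like the default bincode call it ignores trailing bytes.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    UnexpectedEof,
    InvalidVariant(u32),
}

#[derive(Debug, PartialEq)]
enum Message {
    Heartbeat { term: u64, node_id: u64 },
    Ping { pinger_id: u64, useless_field: u64, i_need_one_more: u64 },
}

/// Copy the next N bytes out of `buf` and advance `pos`,
/// or fail if the buffer is too short.
fn take<const N: usize>(buf: &[u8], pos: &mut usize) -> Result<[u8; N], DecodeError> {
    let end = *pos + N;
    let chunk = buf.get(*pos..end).ok_or(DecodeError::UnexpectedEof)?;
    let mut out = [0u8; N];
    out.copy_from_slice(chunk);
    *pos = end;
    Ok(out)
}

/// Model of decoding `Message`: a u32 variant index, then that variant's fields.
fn decode_message(buf: &[u8]) -> Result<Message, DecodeError> {
    let mut pos = 0;
    match u32::from_le_bytes(take(buf, &mut pos)?) {
        0 => Ok(Message::Heartbeat {
            term: u64::from_le_bytes(take(buf, &mut pos)?),
            node_id: u64::from_le_bytes(take(buf, &mut pos)?),
        }),
        1 => Ok(Message::Ping {
            pinger_id: u64::from_le_bytes(take(buf, &mut pos)?),
            useless_field: u64::from_le_bytes(take(buf, &mut pos)?),
            i_need_one_more: u64::from_le_bytes(take(buf, &mut pos)?),
        }),
        other => Err(DecodeError::InvalidVariant(other)),
    }
}

fn main() {
    // pinger_id = usize::MAX: the first four bytes are all 0xFF.
    let mut all_ones = u64::MAX.to_le_bytes().to_vec();
    all_ones.extend_from_slice(&1u64.to_le_bytes());
    all_ones.extend_from_slice(&2u64.to_le_bytes());
    // prints: Err(InvalidVariant(4294967295))
    println!("{:?}", decode_message(&all_ones));

    // pinger_id as u32: 4 + 8 + 8 bytes, which lines up exactly with Heartbeat.
    let mut aligned = 0u32.to_le_bytes().to_vec();
    aligned.extend_from_slice(&1u64.to_le_bytes());
    aligned.extend_from_slice(&2u64.to_le_bytes());
    // prints: Ok(Heartbeat { term: 1, node_id: 2 })
    println!("{:?}", decode_message(&aligned));

    // One-field Ping: only 8 bytes, so reading `term` runs out of data.
    // prints: Err(UnexpectedEof)
    println!("{:?}", decode_message(&0u64.to_le_bytes()));
}
```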

So, in summary, you must not send the "bare" value of a variant if you deserialize it into an enum type. When using bincode or any other serialization tool, the type you decode must always match the type you encoded, so you MUST serialize a Message::Ping(Ping { .. }), not a Ping { .. } directly.