How to deserialize an array of JSON objects into a HashMap<key, struct> where key is one of the fields

I am struggling to get serde to deserialize a JSON array of objects into a HashMap whose key is taken from one field of each object and whose value is a struct built from the object's remaining fields. Here's a simplified example:

[
  {"name" : "John", "age" : 11, "registry" : true},
  {"name" : "Clyde", "age" : 14, "registry" : false},
  {"name" : "Bob", "age" : 12, "registry" : true}
]

Now I need to deserialize the whole JSON into AllPeople with each object getting deserialized into Person:

use std::collections::HashMap;

struct AllPeople(HashMap<String, Person>);

struct Person {
    age: u32,
    registry: bool,
}

I'd like for the value associated with the "name" field of the JSON object to serve as the key for the AllPeople map.

  1. Is there a way to achieve this using any macros provided by serde?
  2. If not, how do I modify the behavior of deserialize to handle this?

Accepted answer

You can do this by specifying a custom deserialization function in your wrapper via #[serde(deserialize_with = "mappify")]:

use serde::Deserializer;
use serde::Deserialize;
use std::collections::HashMap;

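// The value type stored in the map: everything except the name.
#[derive(Debug)]
struct Person {
    age: u32,
    registry: bool,
}

// Mirrors one JSON object as it appears in the input, including the name
// that will become the map key.
#[derive(Deserialize)]
struct JsonPerson {
    age: u32,
    registry: bool,
    name: String,
}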

#[derive(Deserialize, Debug)]
struct AllPeople(#[serde(deserialize_with = "mappify")] HashMap<String, Person>);

fn mappify<'de, D>(de: D) -> Result<HashMap<String, Person>, D::Error>
where
    D: Deserializer<'de>,
{
    use serde::de::*;
    struct ItemsVisitor;
    impl<'de> Visitor<'de> for ItemsVisitor {
        type Value = HashMap<String, Person>;

        fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
            formatter.write_str("a sequence of items")
        }

        fn visit_seq<V>(self, mut seq: V) -> Result<HashMap<String, Person>, V::Error>
        where
            V: SeqAccess<'de>,
        {
            let mut map = HashMap::with_capacity(seq.size_hint().unwrap_or(0));

            while let Some(item) = seq.next_element::<JsonPerson>()? {
                let JsonPerson {
                    age,
                    registry,
                    name,
                } = item;
                match map.entry(name) {
                    std::collections::hash_map::Entry::Occupied(entry) => {
                        return Err(serde::de::Error::custom(format!(
                            "Duplicate entry {}",
                            entry.key()
                        )))
                    }
                    std::collections::hash_map::Entry::Vacant(entry) => {
                        entry.insert(Person { age, registry })
                    }
                };
            }
            Ok(map)
        }
    }

    de.deserialize_seq(ItemsVisitor)
}

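Note that, as mentioned in the issue linked by @ChayimFriedman, the performance benefit of this visitor over first deserializing into a Vec is questionable and probably not worth the verbosity. Going through a Vec may even be faster: once the Vec is complete its length is known, so the HashMap can be allocated at the right size once instead of being resized repeatedly while elements are added.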
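
To make the resizing point concrete, here is a rough std-only sketch (not a benchmark, and not part of the serde code above). The helper count_growths is a hypothetical name introduced only for this illustration; it counts how often the map's capacity changes during insertion, as a proxy for reallocations. A visitor's size_hint() returns an Option and may be None depending on the data format, in which case the map in mappify starts empty like the first case below.

```rust
use std::collections::HashMap;

/// Inserts the keys 0..n and counts how many times the map's capacity changed.
/// (Hypothetical helper, used only to illustrate pre-sizing.)
fn count_growths(mut map: HashMap<u32, u32>, n: u32) -> usize {
    let mut growths = 0;
    let mut cap = map.capacity();
    for i in 0..n {
        map.insert(i, i);
        if map.capacity() != cap {
            // The table was reallocated to make room for more entries.
            growths += 1;
            cap = map.capacity();
        }
    }
    growths
}

fn main() {
    let n = 1_000;
    // Starts with no allocation, so it has to grow while filling up.
    println!("unsized:   {} growths", count_growths(HashMap::new(), n));
    // Sized up front, as is possible once a Vec's length is known.
    println!("pre-sized: {} growths", count_growths(HashMap::with_capacity(n as usize), n));
}
```

This only shows how many times the table grows; whether that translates into a measurable difference for real inputs would need an actual benchmark.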

Personally, I'd probably go with

#[derive(Deserialize, Debug)]
#[serde(try_from = "Vec<JsonPerson>")]
struct AllPeople(HashMap<String, Person>);

impl TryFrom<Vec<JsonPerson>> for AllPeople {
    type Error = String;

    fn try_from(value: Vec<JsonPerson>) -> Result<Self, Self::Error> {
        // This could be value.into_iter().map(…).collect(), but I care about duplicates.
        let mut map = HashMap::with_capacity(value.len());
        for item in value {
            let JsonPerson {
                age,
                registry,
                name,
            } = item;
            match map.entry(name) {
                std::collections::hash_map::Entry::Occupied(entry) => {
                    return Err(format!("Duplicate entry {}", entry.key()))
                }
                std::collections::hash_map::Entry::Vacant(entry) => {
                    entry.insert(Person { age, registry })
                }
            };
        }
        Ok(AllPeople(map))
    }
}
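
The comment about duplicates in try_from is worth a small demonstration. The following std-only sketch uses simplified (name, age) pairs instead of JsonPerson, and the function names collect_last_wins and collect_strict are made up for this example. It shows that collect() silently keeps the last value for a repeated key, while the entry-based loop reports the duplicate.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Lossy variant: `collect()` keeps the later value for a repeated key.
fn collect_last_wins(items: Vec<(String, u32)>) -> HashMap<String, u32> {
    items.into_iter().collect()
}

/// Strict variant: returns an error naming the first repeated key.
fn collect_strict(items: Vec<(String, u32)>) -> Result<HashMap<String, u32>, String> {
    let mut map = HashMap::with_capacity(items.len());
    for (name, age) in items {
        match map.entry(name) {
            Entry::Occupied(entry) => return Err(format!("Duplicate entry {}", entry.key())),
            Entry::Vacant(entry) => {
                entry.insert(age);
            }
        }
    }
    Ok(map)
}

fn main() {
    let items = vec![
        ("John".to_string(), 11),
        ("Bob".to_string(), 12),
        ("John".to_string(), 99), // repeated key
    ];
    let lossy = collect_last_wins(items.clone());
    println!("{:?}", lossy.get("John"));
    println!("{:?}", collect_strict(items));
}
```

Which behavior is right depends on the data: if repeated names are impossible or harmless, the collect() version is shorter.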


Side note: I feel like serde_with::serde_as should be able to do that with something like

struct AllPeople(#[serde_as(as = "Seq<JsonPerson>")] HashMap<String, Person>);

but I've only been able to get it to produce a Vec<(String, Person)>. If somebody can demonstrate serde_as, that'd be neat.

Another answer

I'm not sure that it's possible to deserialize the data directly into the HashMap, but you should be able to do something like this:

use std::collections::HashMap;
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug)]
struct JsonPerson {
    name: String,
    age: u32,
    registry: bool,
}

#[derive(Debug)]
struct Person {
    age: u32,
    registry: bool,
}

fn main() {
    // Deserialize into a plain Vec first, then build the map by hand.
    let people: Vec<JsonPerson> = serde_json::from_str("<your json string>").unwrap();
    let mut all_people: HashMap<String, Person> = HashMap::new();
    for person in people {
        let key = person.name;
        let value = Person { age: person.age, registry: person.registry };
        // Note: insert replaces any earlier entry that has the same name.
        all_people.insert(key, value);
    }
    println!("{:#?}", all_people);
}