How to Serialize/Deserialize a List in Pydantic While Maintaining an Internal Dict Representation?

I am using Pydantic in my project to define data models and am facing a challenge with custom serialization and deserialization. I have a model where I want to internally represent an attribute as a dictionary for easier access by keys, but I need to serialize it as a list when outputting to JSON and deserialize it back from a list into a dictionary when reading JSON.

Moreover, the schema should also specify this field as a list rather than a dict.

Here's a simplified version of what I'm trying to achieve:

from pydantic import BaseModel, BeforeValidator, Field, PlainSerializer
from typing import List, Dict, Union
from typing_extensions import Annotated

class MyItem(BaseModel):
    name: str = Field(...)
    data: str = Field(...)

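def validate_items(items: Union[Dict[str, MyItem], List[MyItem]]) -> Dict[str, MyItem]:
    # A "before" validator sees raw input, so list entries may be plain dicts
    # (when parsing JSON) or MyItem instances (when constructing in Python).
    if isinstance(items, list):
        parsed = [MyItem.model_validate(item) for item in items]
        return {item.name: item for item in parsed}
    elif isinstance(items, dict):
        return items
    else:
        raise ValueError("Input must be a list or a dictionary")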

ItemsDict = Annotated[
    dict[str, MyItem],
    PlainSerializer(
        lambda items_dict: list(items_dict.values()),
        return_type=list[MyItem],
    ),
    BeforeValidator(validate_items),
]

class MyObject(BaseModel):
    items: ItemsDict = Field(...)

# Example instantiation
obj = MyObject(items=[MyItem(name='item1', data='data1'), MyItem(name='item2', data='data2')])

# To generate and print the schema
print(MyObject.schema_json(indent=2))

The output schema still says that items is an object (i.e. a dict) rather than a list.
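One thing that may explain the object-typed schema: in Pydantic v2, JSON schema generation has two modes, and the default ("validation") describes the annotated type, while "serialization" follows the serializer's return_type. The sketch below assumes Pydantic v2 and Python 3.9+; the helper name to_dict is hypothetical, and this is an illustration of the mode difference rather than a complete solution.

```python
from typing import Annotated, Any

from pydantic import BaseModel, BeforeValidator, PlainSerializer


class MyItem(BaseModel):
    name: str
    data: str


def to_dict(value: Any) -> Any:
    # Hypothetical helper: key a list of items by name; pass anything else through.
    if isinstance(value, list):
        items = [MyItem.model_validate(v) for v in value]
        return {item.name: item for item in items}
    return value


class MyObject(BaseModel):
    items: Annotated[
        dict[str, MyItem],
        PlainSerializer(lambda d: list(d.values()), return_type=list[MyItem]),
        BeforeValidator(to_dict),
    ]


# Default mode is "validation": it describes the annotated dict type.
validation_schema = MyObject.model_json_schema()
# "serialization" mode follows the serializer's declared return_type.
serialization_schema = MyObject.model_json_schema(mode="serialization")

print(validation_schema["properties"]["items"]["type"])
print(serialization_schema["properties"]["items"]["type"])
```

If this holds for your Pydantic version, the serialization-mode schema should report items as an array, while the validation-mode schema still reports an object because the declared field type is a dict.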

I attempted to use Pydantic's Annotated type with custom serialization and validation to convert between the list and dict representations, but I'm unsure how to properly define the serialization/deserialization logic so that:

  1. The internal representation of items is a dictionary for easy access.
  2. When serializing MyObject to JSON, items is output as a list of MyItem instances.
  3. When deserializing from JSON, a list of MyItem instances is converted back into a dictionary, keyed by MyItem.name.

Additionally, when I generate the schema with MyObject.schema_json(indent=2), the items field is still shown as an object (dict) rather than a list, which does not reflect the desired external representation.
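The round-trip behavior described above can be sketched as follows, assuming Pydantic v2 and Python 3.9+. The helper name list_to_dict and the sample JSON string are illustrative assumptions; the key point is that a before-validator receives raw dicts when parsing JSON, so each entry is validated into a MyItem before being keyed by name.

```python
import json
from typing import Annotated, Any

from pydantic import BaseModel, BeforeValidator, PlainSerializer


class MyItem(BaseModel):
    name: str
    data: str


def list_to_dict(value: Any) -> Any:
    # Hypothetical helper. List entries may be raw dicts (from JSON) or
    # MyItem instances (from Python code); model_validate accepts both.
    if isinstance(value, list):
        items = [MyItem.model_validate(v) for v in value]
        return {item.name: item for item in items}
    # Anything else is left for the normal dict[str, MyItem] validation.
    return value


ItemsDict = Annotated[
    dict[str, MyItem],
    PlainSerializer(lambda d: list(d.values()), return_type=list[MyItem]),
    BeforeValidator(list_to_dict),
]


class MyObject(BaseModel):
    items: ItemsDict


raw = '{"items": [{"name": "item1", "data": "data1"}, {"name": "item2", "data": "data2"}]}'

# Deserialize: the JSON list becomes a dict keyed by MyItem.name.
obj = MyObject.model_validate_json(raw)
item2_data = obj.items["item2"].data

# Serialize: the internal dict is written back out as a list.
dumped = obj.model_dump()
round_tripped = json.loads(obj.model_dump_json())
```

Note that keying by name means two list entries with the same name would collapse into one; whether that should raise an error instead is a design decision this sketch does not make.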

Questions:

  1. How can I customize the serialization/deserialization in Pydantic to achieve this behavior?
  2. Is there a way to adjust the JSON schema generation in Pydantic to reflect items as a list for external interfaces, while keeping it as a dict internally?
  3. Any guidance or examples on how to implement this in Pydantic would be greatly appreciated.