What should be a type annotation for dataclass descriptor fields?

I'm working on a class whose fields the user should be able to set in the most convenient way, which includes assigning strings to any of the fields. Values assigned by the user should be automatically converted to the actual data type (so, for example, "2022-01-02" assigned to a field date should be converted to a datetime.date object).

For this I chose the descriptor-typed fields approach from Python's dataclasses module. To avoid unnecessary and/or unsupported conversions, I inspect __annotations__ to determine whether it's OK to assign the user-provided value without a conversion.

from typing import Optional

import datetime
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


class Conversion:
    def __init__(self, *, conv, default=None):
        self._conv = conv
        self._default = default
        self._name = None
        self._prop = None

    def __set_name__(self, owner, name):
        self._prop = name
        self._name = "_" + name

    def __get__(self, obj, tp):
        # dataclasses determines default value by calling
        # descriptor.__get__(obj=None, tp=cls)
        if obj is None:
            return self._default

        return getattr(obj, self._name, self._default)

    def __set__(self, obj, value):
        tp = obj.__annotations__.get(self._prop)

        # Don't convert values which already match desired type
        if tp and isinstance(value, tp):
            setattr(obj, self._name, value)
        else:
            try:
                val = self._conv(value)
            except Exception as exc:
                # Chain the original error so the real cause stays visible
                raise ValueError(
                    f"Conversion error for '{self._prop}': {value}"
                ) from exc

            setattr(obj, self._name, val)


@dataclass
class Entry:
    date: datetime.date = Conversion(conv=date.fromisoformat, default=date.today())
    amount: Optional[Decimal] = Conversion(conv=Decimal, default=None)


e = Entry()
print(e)
e.date = "2022-02-05"
e.amount = "11.02"
print(e)

And the output is, as expected:

Entry(date=datetime.date(2024, 3, 7), amount=None)
Entry(date=datetime.date(2022, 2, 5), amount=Decimal('11.02'))

This works beautifully, and I feel that it is a very clean and elegant solution, but I noticed that the documentation always annotates descriptor-typed fields with the type of the descriptor, not the underlying data type. For me this would be e.g. date: Conversion = Conversion(...). Is there a reason why the dataclasses authors chose to do it this way, and am I wrong to annotate fields with data types?

There is 1 best solution below

Oskar Hofmann On

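Purely programmatically, in your example, the value assigned to date in the class body is a Conversion instance, i.e. the annotation Conversion is correct. A static type checker like mypy would complain about your type annotation, because a Conversion object is being assigned to a field declared as datetime.date.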

Then again, type annotations are not enforced at runtime, so you can do what you want here. Personally, I would avoid having a descriptor with the very unspecific name "Conversion" that does different things for different data fields, especially when the underlying data types differ.
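
To illustrate both points, here is a minimal sketch of one possible layout, not the only way to do it. It assumes a hypothetical base class ConvertingField with two hypothetical subclasses, DateField and DecimalField (none of these names come from the question or from the standard library). Each subclass knows its own target type, so the fields can be annotated with the descriptor type, and the descriptor no longer has to look anything up in __annotations__. The fixed default date is also just an assumption for the example.

```python
import datetime
from dataclasses import dataclass
from decimal import Decimal


class ConvertingField:
    """Hypothetical base descriptor: stores a value unchanged if it is None
    or already has the target type, otherwise converts it first."""

    target_type: type = object  # overridden by each subclass

    def __init__(self, *, default=None):
        self._default = default

    def __set_name__(self, owner, name):
        self._prop = name          # public field name, used in error messages
        self._name = "_" + name    # private attribute that holds the value

    def __get__(self, obj, owner=None):
        # dataclasses reads the field default via descriptor.__get__(None, cls)
        if obj is None:
            return self._default
        return getattr(obj, self._name, self._default)

    def __set__(self, obj, value):
        # No conversion needed for None or values of the right type
        if value is None or isinstance(value, self.target_type):
            setattr(obj, self._name, value)
            return
        try:
            converted = self.convert(value)
        except Exception as exc:
            raise ValueError(
                f"Conversion error for '{self._prop}': {value!r}"
            ) from exc
        setattr(obj, self._name, converted)

    def convert(self, value):
        raise NotImplementedError


class DateField(ConvertingField):
    target_type = datetime.date

    def convert(self, value):
        return datetime.date.fromisoformat(value)


class DecimalField(ConvertingField):
    target_type = Decimal

    def convert(self, value):
        return Decimal(value)


@dataclass
class Entry:
    # Annotated with the descriptor type, as the documentation does
    date: DateField = DateField(default=datetime.date(2022, 1, 2))
    amount: DecimalField = DecimalField(default=None)


e = Entry()
e.date = "2022-02-05"   # converted with date.fromisoformat
e.amount = "11.02"      # converted with Decimal
print(e)
```

The trade-off is that the annotation now says nothing about the stored data type; that information lives in target_type on each descriptor class instead.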