Questions
- How can I serialize an instance of a data class in a way that saves its state when the state is generated by its meta class?
- See description and sample code for more clarification.
- How do I share context in python in a way that replicates the
working-case
while achieving the modularity of thesample-issue
case? - Is there an alternative way of populating data classes in an abstract way that doesn't use meta classes?
- Needs to be abstracted because I have multiple data classes that are representative of request information that are all populated in a similar fashion.
Description
I am currently writing a flask web application that passes processing of tasks off to rq workers via a redis queue (using rq
), and I am having trouble with the serialization of the data classes that contain the request information. The issue I am running into happens when rq
tries to serialize a data class that uses a meta class to populate all of its fields.
When debugging and diagnosing the issue I have noticed that dill
, and cloudpickle
are able to serialize the instance of the object correctly when the instance is created and defined in the same file/module, but once I move the definition of the object to a different file/module they are unable to serialize the object in a way that maintains the instances state.
I have added a simplified sample below to replicate my problem.
Environment
Python Version: python 3.7.3
OS: Windows
# File structure
\sample-issue
--> dataclass.py
--> serialization.py
--> deserialization.py
--> __init__.py
\working-case
--> sample.py
--> __init__.py
Sample Issue
# dataclass.py
from dataclasses import dataclass, field
from typing import List, Dict, Any
import json
from collections import defaultdict
class JSONRequest(type):
def __call__(cls, *args, **kwargs):
"""
This is a metaclass used to autonomously populate dataclasses
NOTE: This metaclass only works with dataclasses
Optional Parameter:
inp: class attribute that is a callable that produces a dictionary
"""
inp = cls.__dict__["inp"]()
cls.__initialize(inp)
return cls
def __initialize(cls, inp: dict) -> None:
"""
Initializes all of the dataclasses fields
If the field is missing in the JSON request and it does not have a default value in the data class a
ValueError error will be raised. Additionally if the Json value is [], {}, "" will default to the default
value, and if the default value is missing an InvalidRequest error will also be raised.
Parameters:
inp: Json input
"""
_json = defaultdict(lambda: None)
_json.update(inp)
for name, _ in cls.__dataclass_fields__.items():
if (not _json[name]) and (name not in cls.__dict__.keys()):
raise ValueError(f"Request is missing the {name} field")
value = _json[name] or cls.__dict__[name]
setattr(cls, name, value)
def __str__(cls):
rep = {name: getattr(cls, name) for name, _ in cls.__dataclass_fields__.items()}
return json.dumps(rep, indent=4)
def generate_input():
"""
Stub method for generating input
"""
return {
"email_list": [f"{name}@yahoo.com" for name in ["peter", "mark", "alysa"]],
"message": "Foo bar fizzbuzz",
"subject": "Sample Issue",
"info": {
"time": 1619628747.9166002,
"count": 3,
}
}
@dataclass
class EmailChain(metaclass=JSONRequest):
email_list: List[str] = field(init=False)
message: str = field(init=False)
subject: str = field(init=False)
info: Dict[str, Any] = field(init=False)
inp = generate_input
# serialization.py
import dill
from sample_issue.dataclass import EmailChain
obj = EmailChain()
data_stream = dill.dumps(obj)
print(data_stream)
# output: b'\x80\x03csrc.Test\nEmailChain\nq\x00.'
# deserialization
import dill
from sample_issue.dataclass import EmailChain
input = b'\x80\x03csrc.Test\nEmailChain\nq\x00.'
obj = dill.loads(input)
print(obj)
# Results in error since obj is missing data class fields for __str__ method
Working Case
# Working-case
import dill
from dataclasses import dataclass, field
from typing import List, Dict, Any
import json
from collections import defaultdict
class JSONRequest(type):
def __call__(cls, *args, **kwargs):
"""
This is a metaclass used to autonomously populate dataclasses
NOTE: This metaclass only works with dataclasses
Optional Parameter:
inp: class attribute that is a callable that produces a dictionary
"""
inp = cls.__dict__["inp"]()
cls.__initialize(inp)
return cls
def __initialize(cls, inp: dict) -> None:
"""
Initializes all of the dataclasses fields
If the field is missing in the JSON request and it does not have a default value in the data class a
ValueError error will be raised. Additionally if the Json value is [], {}, "" will default to the default
value, and if the default value is missing an InvalidRequest error will also be raised.
Parameters:
inp: Json input
"""
_json = defaultdict(lambda: None)
_json.update(inp)
for name, _ in cls.__dataclass_fields__.items():
if (not _json[name]) and (name not in cls.__dict__.keys()):
raise ValueError(f"Request is missing the {name} field")
value = _json[name] or cls.__dict__[name]
setattr(cls, name, value)
def __str__(cls):
rep = {name: getattr(cls, name) for name, _ in cls.__dataclass_fields__.items()}
return json.dumps(rep, indent=4)
def generate_input():
"""
Stub method for generating input
"""
return {
"email_list": [f"{name}@yahoo.com" for name in ["peter", "mark", "alysa"]],
"message": "Foo bar fizzbuzz",
"subject": "Sample Issue",
"info": {
"time": 1619628747.9166002,
"count": 3,
}
}
@dataclass
class EmailChain(metaclass=JSONRequest):
email_list: List[str] = field(init=False)
message: str = field(init=False)
subject: str = field(init=False)
info: Dict[str, Any] = field(init=False)
inp = generate_input
obj = EmailChain()
data_stream = dill.dumps(obj)
print(data_stream)
# output: b'\x80\x03cdill._dill\n_create_type\nq\x00(h\x00(cdill._dill\n_load_type\nq\x01X\x04\x00\x00\x00typeq\x02\x85q\x03Rq\x04X\x0b\x00\x00\x00JSONRequestq\x05h\x04\x85q\x06}q\x07(X\n\x00\x00\x00__module__q\x08X\x08\x00\x00\x00__main__q\tX\x08\x00\x00\x00__call__q\ncdill._dill\n_create_function\nq\x0b(cdill._dill\n_create_code\nq\x0c(K\x01K\x00K\x04K\x03KOC\x1a|\x00j\x00d\x01\x19\x00\x83\x00}\x03|\x00\xa0\x01|\x03\xa1\x01\x01\x00|\x00S\x00q\rX\xf5\x00\x00\x00\n This is a metaclass used to autonomously populate dataclasses\n\n NOTE: This metaclass only works with dataclasses\n\n Optional Parameter:\n inp: class attribute that is a callable that produces a dictionary\n q\x0eX\x03\x00\x00\x00inpq\x0f\x86q\x10X\x08\x00\x00\x00__dict__q\x11X\x18\x00\x00\x00_JSONRequest__initializeq\x12\x86q\x13(X\x03\x00\x00\x00clsq\x14X\x04\x00\x00\x00argsq\x15X\x06\x00\x00\x00kwargsq\x16h\x0ftq\x17XL\x00\x00\x00C:/Users/739908/Work/systems_automation/report_automation_engine/src/temp.pyq\x18h\nK\nC\x06\x00\t\x0c\x01\n\x01q\x19))tq\x1aRq\x1bc__builtin__\n__main__\nh\nNN}q\x1cNtq\x1dRq\x1eh\x12h\x0b(h\x0c(K\x02K\x00K\x06K\x05KCCvt\x00d\x01d\x02\x84\x00\x83\x01}\x02|\x02\xa0\x01|\x01\xa1\x01\x01\x00xZ|\x00j\x02\xa0\x03\xa1\x00D\x00]L\\\x02}\x03}\x04|\x02|\x03\x19\x00sP|\x03|\x00j\x04\xa0\x05\xa1\x00k\x07rPt\x06d\x03|\x03\x9b\x00d\x04\x9d\x03\x83\x01\x82\x01|\x02|\x03\x19\x00p`|\x00j\x04|\x03\x19\x00}\x05t\x07|\x00|\x03|\x05\x83\x03\x01\x00q"W\x00d\x05S\x00q\x1f(X\xac\x01\x00\x00\n Initializes all of the dataclasses fields\n\n If the field is missing in the JSON request and it does not have a default value in the data class a\n ValueError error will be raised. Additionally if the Json value is [], {}, "" will default to the default\n value, and if the default value is missing an InvalidRequest error will also be raised.\n\n Parameters:\n inp: Json input\n q h\x0c(K\x00K\x00K\x00K\x01KSC\x04d\x00S\x00q!N\x85q"))h\x18X\x08\x00\x00\x00<lambda>q#K"C\x00q$))tq%Rq&X*\x00\x00\x00JSONRequest.__initialize.<locals>.<lambda>q\'X\x17\x00\x00\x00Request is missing the q(X\x06\x00\x00\x00 fieldq)Ntq*(X\x0b\x00\x00\x00defaultdictq+X\x06\x00\x00\x00updateq,X\x14\x00\x00\x00__dataclass_fields__q-X\x05\x00\x00\x00itemsq.h\x11X\x04\x00\x00\x00keysq/X\n\x00\x00\x00ValueErrorq0X\x07\x00\x00\x00setattrq1tq2(h\x14h\x0fX\x05\x00\x00\x00_jsonq3X\x04\x00\x00\x00nameq4X\x01\x00\x00\x00_q5X\x05\x00\x00\x00valueq6tq7h\x18X\x0c\x00\x00\x00__initializeq8K\x17C\x0e\x00\x0b\x0c\x01\n\x02\x14\x01\x16\x01\x10\x02\x12\x01q9))tq:Rq;c__builtin__\n__main__\nh8NN}q<Ntq=Rq>X\x07\x00\x00\x00__str__q?h\x0b(h\x0c(K\x01K\x00K\x02K\x04K\x03C&\x87\x00f\x01d\x01d\x02\x84\x08\x88\x00j\x00\xa0\x01\xa1\x00D\x00\x83\x01}\x01t\x02j\x03|\x01d\x03d\x04\x8d\x02S\x00q@(Nh\x0c(K\x01K\x00K\x03K\x05K\x13C\x1ci\x00|\x00]\x14\\\x02}\x01}\x02t\x00\x88\x00|\x01\x83\x02|\x01\x93\x02q\x04S\x00qA)X\x07\x00\x00\x00getattrqB\x85qCX\x02\x00\x00\x00.0qDh4h5\x87qEh\x18X\n\x00\x00\x00<dictcomp>qFK-C\x02\x06\x00qGh\x14\x85qH)tqIRqJX\'\x00\x00\x00JSONRequest.__str__.<locals>.<dictcomp>qKK\x04X\x06\x00\x00\x00indentqL\x85qMtqN(h-h.X\x04\x00\x00\x00jsonqOX\x05\x00\x00\x00dumpsqPtqQh\x14X\x03\x00\x00\x00repqR\x86qSh\x18h?K,C\x04\x00\x01\x18\x01qT)h\x14\x85qUtqVRqWc__builtin__\n__main__\nh?NN}qXNtqYRqZX\x07\x00\x00\x00__doc__q[Nutq\\Rq]X\n\x00\x00\x00EmailChainq^h\x01X\x06\x00\x00\x00objectq_\x85q`Rqa\x85qb}qc(h\x08h\tX\x0f\x00\x00\x00__annotations__qd}qe(X\n\x00\x00\x00email_listqfcdill._dill\n_get_attr\nqgcdill._dill\n_import_module\nqhX\t\x00\x00\x00_operatorqi\x85qjRqkX\x07\x00\x00\x00getitemql\x86qmRqnctyping\nList\nqoh\x01X\x03\x00\x00\x00strqp\x85qqRqr\x86qsRqtX\x07\x00\x00\x00messagequhrX\x07\x00\x00\x00subjectqvhrX\x04\x00\x00\x00infoqwhnctyping\nDict\nqxhrctyping\nAny\nqy\x86qz\x86q{Rq|uh\x0fh\x0b(h\x0c(K\x00K\x00K\x00K\x06KCC\x1ed\x01d\x02\x84\x00d\x03D\x00\x83\x01d\x04d\x05d\x06d\x07d\x08\x9c\x02d\t\x9c\x04S\x00q}(X*\x00\x00\x00\n Stub method for generating input\n q~h\x0c(K\x01K\x00K\x02K\x04KSC\x16g\x00|\x00]\x0e}\x01|\x01\x9b\x00d\x00\x9d\x02\x91\x02q\x04S\x00q\x7fX\n\x00\x00\[email protected]\x80\x85q\x81)hDh4\x86q\x82h\x18X\n\x00\x00\x00<listcomp>q\x83K6C\x02\x06\x00q\x84))tq\x85Rq\x86X"\x00\x00\x00generate_input.<locals>.<listcomp>q\x87X\x05\x00\x00\x00peterq\x88X\x04\x00\x00\x00markq\x89X\x05\x00\x00\x00alysaq\x8a\x87q\x8bX\x10\x00\x00\x00Foo bar fizzbuzzq\x8cX\x0c\x00\x00\x00Sample Issueq\x8dGA\xd8"d\xb2\xfa\xa9\x94K\x03X\x04\x00\x00\x00timeq\x8eX\x05\x00\x00\x00countq\x8f\x86q\x90(hfhuhvhwtq\x91tq\x92))h\x18X\x0e\x00\x00\x00generate_inputq\x93K1C\n\x00\x05\x0c\x01\x02\x01\x02\x02\x02\x01q\x94))tq\x95Rq\x96c__builtin__\n__main__\nh\x93NN}q\x97Ntq\x98Rq\x99h[X\x1b\x00\x00\x00EmailChain(*args, **kwargs)q\x9aX\x14\x00\x00\x00__dataclass_params__q\x9bcdataclasses\n_DataclassParams\nq\x9c)\x81q\x9dN}q\x9e(X\x04\x00\x00\x00initq\x9f\x88X\x04\x00\x00\x00reprq\xa0\x88X\x02\x00\x00\x00eqq\xa1\x88X\x05\x00\x00\x00orderq\xa2\x89X\x0b\x00\x00\x00unsafe_hashq\xa3\x89X\x06\x00\x00\x00frozenq\xa4\x89u\x86q\xa5bh-}q\xa6(hfcdataclasses\nField\nq\xa7)\x81q\xa8N}q\xa9(h4hfh\x02htX\x07\x00\x00\x00defaultq\xaacdataclasses\n_MISSING_TYPE\nq\xab)\x81q\xacX\x0f\x00\x00\x00default_factoryq\xadh\xach\xa0\x88X\x04\x00\x00\x00hashq\xaeNh\x9f\x89X\x07\x00\x00\x00compareq\xaf\x88X\x08\x00\x00\x00metadataq\xb0h\x01X\x10\x00\x00\x00MappingProxyTypeq\xb1\x85q\xb2Rq\xb3}q\xb4\x85q\xb5Rq\xb6X\x0b\x00\x00\x00_field_typeq\xb7cdataclasses\n_FIELD_BASE\nq\xb8)\x81q\xb9}q\xbah4X\x06\x00\x00\x00_FIELDq\xbbsbu\x86q\xbcbhuh\xa7)\x81q\xbdN}q\xbe(h4huh\x02hrh\xaah\xach\xadh\xach\xa0\x88h\xaeNh\x9f\x89h\xaf\x88h\xb0h\xb6h\xb7h\xb9u\x86q\xbfbhvh\xa7)\x81q\xc0N}q\xc1(h4hvh\x02hrh\xaah\xach\xadh\xach\xa0\x88h\xaeNh\x9f\x89h\xaf\x88h\xb0h\xb6h\xb7h\xb9u\x86q\xc2bhwh\xa7)\x81q\xc3N}q\xc4(h4hwh\x02h|h\xaah\xach\xadh\xach\xa0\x88h\xaeNh\x9f\x89h\xaf\x88h\xb0h\xb6h\xb7h\xb9u\x86q\xc5buX\x08\x00\x00\x00__init__q\xc6h\x0b(h\x0c(K\x01K\x00K\x01K\x01KCC\x04d\x00S\x00q\xc7N\x85q\xc8)X\x04\x00\x00\x00selfq\xc9\x85q\xcaX\x08\x00\x00\x00<string>q\xcbh\xc6K\x01C\x02\x00\x01q\xcc))tq\xcdRq\xce}q\xcf(X\x07\x00\x00\x00MISSINGq\xd0h\xacX\x14\x00\x00\x00_HAS_DEFAULT_FACTORYq\xd1cdataclasses\n_HAS_DEFAULT_FACTORY_CLASS\nq\xd2)\x81q\xd3X\x0c\x00\x00\x00__builtins__q\xd4hhX\x08\x00\x00\x00builtinsq\xd5\x85q\xd6Rq\xd7uh\xc6NN}q\xd8Ntq\xd9Rq\xdaX\x08\x00\x00\x00__repr__q\xdbh\x0b(h\x0c(K\x01K\x00K\x03K\tK\x13CDt\x00|\x00\x83\x01t\x01\xa0\x02\xa1\x00f\x02}\x01|\x01\x88\x00k\x06r\x1cd\x01S\x00\x88\x00\xa0\x03|\x01\xa1\x01\x01\x00z\x0c\x88\x01|\x00\x83\x01}\x02W\x00d\x00\x88\x00\xa0\x04|\x01\xa1\x01\x01\x00X\x00|\x02S\x00q\xdcNX\x03\x00\x00\x00...q\xdd\x86q\xde(X\x02\x00\x00\x00idq\xdfX\x07\x00\x00\x00_threadq\xe0X\t\x00\x00\x00get_identq\xe1X\x03\x00\x00\x00addq\xe2X\x07\x00\x00\x00discardq\xe3tq\xe4h\xc9X\x03\x00\x00\x00keyq\xe5X\x06\x00\x00\x00resultq\xe6\x87q\xe7X\'\x00\x00\x00C:\\DevApps\\Python3.7\\lib\\dataclasses.pyq\xe8X\x07\x00\x00\x00wrapperq\xe9M^\x01C\x10\x00\x02\x10\x01\x08\x01\x04\x01\n\x01\x02\x01\x0c\x02\x0c\x01q\xeaX\x0c\x00\x00\x00repr_runningq\xebX\r\x00\x00\x00user_functionq\xec\x86q\xed)tq\xeeRq\xefcdataclasses\n__dict__\nh\xdbNcdill._dill\n_create_cell\nq\xf0h\x01X\x03\x00\x00\x00setq\xf1\x85q\xf2Rq\xf3]q\xf4\x85q\xf5Rq\xf6\x85q\xf7Rq\xf8h\xf0h\x0b(h\x0c(K\x01K\x00K\x01K\nKCC.|\x00j\x00j\x01d\x01|\x00j\x02\x9b\x02d\x02|\x00j\x03\x9b\x02d\x03|\x00j\x04\x9b\x02d\x04|\x00j\x05\x9b\x02d\x05\x9d\t\x17\x00S\x00q\xf9(NX\x0c\x00\x00\x00(email_list=q\xfaX\n\x00\x00\x00, message=q\xfbX\n\x00\x00\x00, subject=q\xfcX\x07\x00\x00\x00, info=q\xfdX\x01\x00\x00\x00)q\xfetq\xff(X\t\x00\x00\x00__class__r\x00\x01\x00\x00X\x0c\x00\x00\x00__qualname__r\x01\x01\x00\x00hfhuhvhwtr\x02\x01\x00\x00h\xc9\x85r\x03\x01\x00\x00h\xcbh\xdbK\x01C\x02\x00\x01r\x04\x01\x00\x00))tr\x05\x01\x00\x00Rr\x06\x01\x00\x00cdataclasses\n__dict__\nh\xdbNN}r\x07\x01\x00\x00Ntr\x08\x01\x00\x00Rr\t\x01\x00\x00\x85r\n\x01\x00\x00Rr\x0b\x01\x00\x00\x86r\x0c\x01\x00\x00}r\r\x01\x00\x00X\x0b\x00\x00\x00__wrapped__r\x0e\x01\x00\x00j\t\x01\x00\x00sNtr\x0f\x01\x00\x00Rr\x10\x01\x00\x00X\x06\x00\x00\x00__eq__r\x11\x01\x00\x00h\x0b(h\x0c(K\x02K\x00K\x02K\x05KCC8|\x01j\x00|\x00j\x00k\x08r4|\x00j\x01|\x00j\x02|\x00j\x03|\x00j\x04f\x04|\x01j\x01|\x01j\x02|\x01j\x03|\x01j\x04f\x04k\x02S\x00t\x05S\x00r\x12\x01\x00\x00N\x85r\x13\x01\x00\x00(j\x00\x01\x00\x00hfhuhvhwX\x0e\x00\x00\x00NotImplementedr\x14\x01\x00\x00tr\x15\x01\x00\x00h\xc9X\x05\x00\x00\x00otherr\x16\x01\x00\x00\x86r\x17\x01\x00\x00h\xcbj\x11\x01\x00\x00K\x01C\x06\x00\x01\x0c\x01(\x01r\x18\x01\x00\x00))tr\x19\x01\x00\x00Rr\x1a\x01\x00\x00cdataclasses\n__dict__\nj\x11\x01\x00\x00NN}r\x1b\x01\x00\x00Ntr\x1c\x01\x00\x00Rr\x1d\x01\x00\x00X\x08\x00\x00\x00__hash__r\x1e\x01\x00\x00Nhf]r\x1f\x01\x00\x00(X\x0f\x00\x00\[email protected] \x01\x00\x00X\x0e\x00\x00\[email protected]!\x01\x00\x00X\x0f\x00\x00\[email protected]"\x01\x00\x00ehuh\x8chvh\x8dhw}r#\x01\x00\x00(h\x8eGA\xd8"d\xb2\xfa\xa9\x94h\x8fK\x03uutr$\x01\x00\x00Rr%\x01\x00\x00.'
c = dill.loads(data_stream)
print(c)
"""
output:
{
"email_list": [
"[email protected]",
"[email protected]",
"[email protected]"
],
"message": "Foo bar fizzbuzz",
"subject": "Sample Issue",
"info": {
"time": 1619628747.9166002,
"count": 3
}
}
"""
I have figured out the issue
dill
needed the metaclass type register for marshalling. Below is the updated metaclass and type registration for dill that solved the issue.Updated Metaclass
Dill Type Registration