Marshmallow schema for AWS SimpleDB

I'm trying to set up some marshmallow classes to simplify putting and getting records from the AWS SimpleDB service.

AWS requires a list of name/value pairs in the put_record call, as explained in the interface here. I don't need the optional Replace or Expected attributes for my use case.

So I'm looking to build some marshmallow schemas that simplify the loading (_deserialize) and dumping (_serialize). Since SDB is a string-only database, I've already built several field encoders that serialize/deserialize integers, datetime fields and booleans to strings that can be sorted lexicographically. As an example, here is my integer encoder/decoder:

from marshmallow import ValidationError, fields


class SDBNumberField(fields.Field):
    """
    Converts a python number into a padded string that is suitable for storage
    in Amazon SimpleDB and can be sorted lexicographically.

    Numbers are shifted by an offset so that negative numbers sort correctly. Once
    shifted, they are converted to zero padded strings.
    """

    def __init__(self, padding=10, precision=2, offset=100, **kwargs):
        super().__init__(**kwargs)
        self.padding = padding
        self.precision = precision
        self.offset = offset

    def _serialize(self, value, attr, obj, **kwargs):
        padding = self.padding
        if self.precision > 0 and self.padding > 0:
            # Padding shouldn't include decimal digits or the decimal point.
            padding += self.precision + 1
        # Build a format such as '%013.2f', then apply it to the shifted value.
        return ('%%0%d.%df' % (padding, self.precision)) % (value + self.offset)

    def _deserialize(self, value, attr, data, **kwargs):
        """
        Decoding converts a string into a numerical type then shifts it by the
        offset.
        """
        try:
            return float(value) - self.offset
        except (TypeError, ValueError) as error:
            raise ValidationError("Number fields must contain only numbers: {}".format(value)) from error

The serialization/deserialization works when I have a known schema of name/value pairs.
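To make the encoding concrete, here is a minimal standalone sketch of the same shift-and-pad logic without marshmallow. The function names `encode_number` and `decode_number` are hypothetical and not part of the field above; the defaults mirror the field's `padding=10`, `precision=2`, `offset=100`.

```python
def encode_number(value, padding=10, precision=2, offset=100):
    """Shift by offset, then zero-pad so string order matches numeric order."""
    width = padding
    if precision > 0 and padding > 0:
        # Reserve room for the decimal point and the fractional digits.
        width += precision + 1
    return f"{value + offset:0{width}.{precision}f}"


def decode_number(text, offset=100):
    """Reverse the encoding: parse the string, then undo the shift."""
    return float(text) - offset


values = [-5, 3.5, 42]
encoded = [encode_number(v) for v in values]
print(encoded)  # ['0000000095.00', '0000000103.50', '0000000142.00']
```

Note that the ordering only holds while `value + offset` stays non-negative. With the default offset of 100, a value below -100 would be formatted with a leading minus sign and would no longer sort correctly as a string.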

What I'm getting hung up on is a generic class that converts a standard dictionary into a list of key/value pairs on dump, because sometimes my key/value pairs are unknown until runtime.

What I'd like to see is something like this:

record_to_dump = {'Name': 'test', 'Attributes': {'key1': 'value1', 'key2': 'value2'}}
schema.dump(record_to_dump)
# desired result:
# {"Name": "test",
#  "Attributes": [{"Name": "key1", "Value" : "value1"}, {"Name": "key2", "Value": "value2"}]}
assert schema.load({"Name": "test",
 "Attributes": [{"Name": "key1", "Value" : "value1"}, {"Name": "key2", "Value": "value2"}]}) == record_to_dump

But I'm unsure how to set up the classes to handle the conversion from a standard dictionary to the name/value pair list required by the interface.

I've tried several approaches, including fields.Method with separate methods for serialize and deserialize. I've also tried the post_load and pre_dump decorators. What I seem to be missing is how to handle the fact that the field is one type on serialize (a dictionary) and another on deserialize (a list of dictionaries).
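Stripped of marshmallow, the asymmetry described above comes down to two inverse conversions. A minimal sketch, using the hypothetical helper names `dict_to_pairs` and `pairs_to_dict`:

```python
def dict_to_pairs(attributes):
    """Dump direction: plain dict -> list of Name/Value dictionaries."""
    return [{"Name": name, "Value": value} for name, value in attributes.items()]


def pairs_to_dict(pairs):
    """Load direction: list of Name/Value dictionaries -> plain dict.

    If a name appears more than once, the last value wins. SimpleDB allows
    multi-valued attributes, so this is a simplifying assumption.
    """
    return {pair["Name"]: pair["Value"] for pair in pairs}


attributes = {"key1": "value1", "key2": "value2"}
pairs = dict_to_pairs(attributes)
print(pairs)
print(pairs_to_dict(pairs) == attributes)
```

Whatever marshmallow construct ends up wrapping these, it has to call the first function on dump and the second on load, for the same field.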

This approach got pretty close:

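from marshmallow import Schema, fields


class SDBAttributesSchema(Schema):
    Attributes = fields.Method(serialize="_put_attributes", deserialize="_get_attributes")

    def _put_attributes(self, obj, **kwargs):
        # obj is the whole object being dumped, not just its "Attributes" entry
        return [{"Name": k, "Value": v} for k, v in obj.items()]

    def _get_attributes(self, obj, many=True):
        # obj is the raw value being loaded: a list of name/value dictionaries
        return {d['Name']: d["Value"] for d in obj}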

But when I nest this schema inside another class, it falls apart:

class SDBItemSchema(Schema):
    Name = fields.Str()
    Attributes = fields.Nested(SDBAttributesSchema)

I think I'm confused on:

  • How marshmallow unpacks a standard fields.Dict() into the data record; it doesn't seem to serialize that attribute.
  • How to validate that everything eventually gets serialized as a string (validating both the keys and the values of a dictionary).

How would you structure the schema classes to produce the desired dump and load records above?
