When I served media locally, getting a file's contents inside a task was straightforward. I have just switched to django-storages, and it is not a drop-in replacement. Can someone show me a way to pull the document off S3 so I can process it?
Old way:
filename = settings.MEDIA_ROOT + "/" + document.name
xlsx = XLSXParser(filename = filename, uniq_header_column='XYX')
With django-storages this (obviously) will not work, because the file no longer lives under MEDIA_ROOT. How do you pull a local copy of the file from S3 to process it? I thought I could simply do this:
New (failing) way:
filename = settings.MEDIA_ROOT + "/" + document.name
if not os.path.isfile(filename):
    new_filename = tempfile.NamedTemporaryFile(delete=False)
    new_filename.write(document.read())
    filename = new_filename
xlsx = XLSXParser(filename = filename, uniq_header_column='XYX')
But calling read() on the document fails:
Traceback (most recent call last):
  File ".../celery/task/trace.py", line 212, in trace_task
    R = retval = fun(*args, **kwargs)
  File ".../tasks.py", line 63, in process_homes
    process_homes_non_task(**kwargs)
  File ".../tasks.py", line 33, in process_homes_non_task
    new_filename.write(document.read())
  File ".../django/core/files/utils.py", line 16, in <lambda>
    read = property(lambda self: self.file.read)
  File ".../django/db/models/fields/files.py", line 46, in _get_file
    self._file = self.storage.open(self.name, 'rb')
AttributeError: 'FieldFile' object has no attribute 'storage'
In the end I need it to work with both the old way and the new way. Clearly I am overthinking this a bit.
Update:
Following the docs and opening the file through default_storage didn't help either:
filename = settings.MEDIA_ROOT + "/" + document.name
if not os.path.isfile(filename):
    from django.core.files.storage import default_storage
    s3_file = default_storage.open(document.name, 'rb')
    new_filename = tempfile.NamedTemporaryFile(delete=False)
    new_filename.write(s3_file.read())
    filename = new_filename
xlsx = XLSXParser(filename = filename, uniq_header_column='Lot_Number')
xlsx.load_workbook_and_sheet()
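For reference, here is a minimal sketch of the "works both ways" fallback I am after, kept free of Django imports so it runs on its own. The names `copy_to_local_path`, `resolve_local_path`, and `open_remote` are hypothetical, not part of Django or django-storages. It assumes XLSXParser wants a path string rather than a file object, so it returns `tmp.name` and closes the temp file before handing the path back. Both details differ from my attempts above, which pass the `NamedTemporaryFile` object itself and never close it.

```python
import os
import shutil
import tempfile


def copy_to_local_path(source, suffix=".xlsx"):
    """Copy a readable binary file object to a temp file and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so the caller is responsible for removing it later.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        # Stream in chunks instead of loading the whole file with read().
        shutil.copyfileobj(source, tmp)
    # Leaving the with-block flushes and closes the temp file.
    return tmp.name


def resolve_local_path(local_path, open_remote):
    """Return (path, is_temporary).

    If local_path exists it is used as-is (the old way). Otherwise
    open_remote() is called to get a readable binary file object, which is
    copied to a temporary file (the new way).
    """
    if os.path.isfile(local_path):
        return local_path, False
    with open_remote() as remote:
        return copy_to_local_path(remote), True
```

With django-storages, `open_remote` would presumably be something like `lambda: default_storage.open(document.name, 'rb')`. The returned path then goes to `XLSXParser(filename=path, ...)`, and the file is removed with `os.remove(path)` afterwards when `is_temporary` is true.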
Thanks for the help.