Verify image in Google Cloud Storage Bucket


I have a system that creates upload links for a Google Cloud Storage bucket. The user then uploads the file directly to the bucket from the frontend.

Is there a way to verify the image file in the bucket without downloading it to the backend app and verifying it there (e.g. using PIL for Python)?

Verification for:

  • is it an image at all;
  • is it fully uploaded;
  • is it not broken;
  • etc.

P.S. is there anything similar for PDF?

There is 1 solution below

Cloud Storage doesn't offer built-in support for any particular file format, be it JPEG, PDF, or anything else. To fully validate what's in a file, you need to download it and check.

You can, however, get part of the way there.

First, you can have your client validate the file, then capture the size and/or a checksum (either MD5 or CRC32c) of the original file, and specify them as part of the upload so that the upload only succeeds if the content arrives exactly as intended. If your server knows the intended file size or checksum, it can ask Cloud Storage for just the object's metadata, without downloading the object, and verify that the stored values match.
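As a rough sketch of the metadata check, assuming the google-cloud-storage Python client and assuming the frontend has reported the expected size and a base64-encoded MD5 to your backend: the helper names (md5_base64, metadata_matches, check_uploaded_object) are made up for illustration. Only the object's metadata is fetched here, never its content.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Return the MD5 digest of data, base64-encoded (the form Cloud Storage reports)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def metadata_matches(size, md5_hash, expected_size, expected_md5):
    """Compare the stored object's metadata with the values the client reported."""
    return size == expected_size and md5_hash == expected_md5


def check_uploaded_object(bucket_name, object_name, expected_size, expected_md5):
    """Fetch only the object's metadata and compare it with the expected values."""
    # Imported lazily so the helpers above work without the library installed.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).get_blob(object_name)
    if blob is None:
        # No such object: the upload did not complete (or used another name).
        return False
    return metadata_matches(blob.size, blob.md5_hash, expected_size, expected_md5)
```

A matching size and checksum tells you the object is byte-for-byte what the client intended to send. It says nothing about whether those bytes are a valid image, so this only covers the "is it fully uploaded" part of the question.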

Second, many files, including JPEG, have particular headers or footers that describe their contents. Instead of downloading what is potentially a very large image, you could download only the first few bytes from Cloud Storage. If the first two bytes aren't 0xFF and 0xD8, then it's not a JPEG file. Similar magic numbers exist for many other formats.
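Here is a minimal sketch of the magic-number check, again assuming the google-cloud-storage Python client. The function names and the small signature table are illustrative, not a complete list; the PDF entry relies on PDF files starting with "%PDF-", which gets you part of the way for the P.S. as well.

```python
# Leading byte signatures for a few common formats (illustrative, not exhaustive).
SIGNATURES = {
    "jpeg": [b"\xff\xd8"],
    "png": [b"\x89PNG\r\n\x1a\n"],
    "gif": [b"GIF87a", b"GIF89a"],
    "pdf": [b"%PDF-"],
}


def sniff_format(header: bytes):
    """Return the format whose signature the header starts with, or None."""
    for name, prefixes in SIGNATURES.items():
        if any(header.startswith(prefix) for prefix in prefixes):
            return name
    return None


def sniff_object(bucket_name, object_name):
    """Download only the first 8 bytes of the object and sniff its format."""
    # Imported lazily so sniff_format works without the library installed.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(object_name)
    # start and end are inclusive byte offsets, so this requests bytes 0..7.
    header = blob.download_as_bytes(start=0, end=7)
    return sniff_format(header)
```

This only rules out files that are obviously the wrong type. A file with a valid header can still be truncated or corrupt further in, so a full decode (for example with PIL on the downloaded bytes) is still needed to confirm that it isn't broken.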