Convert image to a fixed format to throw away all the extra annotations


I am implementing attachments in my application, and users can upload image files (png, jpg, jpeg). I have read the OWASP recommendations for image uploads, and one of the tips was to convert the input image to a bitmap (keeping only the bitmap data and throwing away all the extra annotations), then convert the bitmap to the desired output format. One reasonable way to do this is to convert to PBM format, then convert to PNG.

The image is stored as a byte array.

I am trying to rewrite the uploaded image using ImageTranscoder from the ImageIO library, but I am not really sure what it does, or whether all possibly malicious code is removed from the image, because it seems that only the metadata is rewritten.

Are there any suggestions or best practices for achieving this goal, that is, removing all possibly malicious code from an image file?

Best answer

You do not need an intermediate file format like PBM, as BufferedImage (which is the standard way of representing an in-memory bitmap in Java) is just plain pixel data. You can just go from encoded "anything" to decoded bitmap to encoded PNG.

The simplest way you could possibly do what you describe is:

ImageIO.write(ImageIO.read(input), "PNG", output);

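This is rather naive code, and it will break for many real-world files, or possibly just silently not output anything (ImageIO.read returns null when no registered reader can decode the input, and ImageIO.write returns false when no suitable writer is found). You probably want to handle at least the most common error cases, with something like the code below: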

BufferedImage image = ImageIO.read(input);
if (image == null) {
   // TODO: Handle image not read (decoded)
}
else if (!ImageIO.write(image, "PNG", output)) {
   // TODO: Handle image not written (could not be encoded as PNG)
}
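Since the question mentions that the image is held as a byte array, the same idea can be sketched as a byte[]-to-byte[] helper. This is only a minimal sketch: the class name ImageSanitizer and the method name reencodeAsPng are made up for illustration, and throwing IOException is just one possible way to fill in the two TODOs above.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper; the class and method names are illustrative only.
final class ImageSanitizer {

    // Decodes the upload to plain pixels, then re-encodes only those pixels as PNG.
    static byte[] reencodeAsPng(byte[] upload) throws IOException {
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(upload));
        if (image == null) {
            // ImageIO.read returns null when no registered reader recognizes the data.
            throw new IOException("Upload is not a supported image format");
        }
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "PNG", output)) {
            // ImageIO.write returns false when no suitable writer is found.
            throw new IOException("Image could not be encoded as PNG");
        }
        return output.toByteArray();
    }
}
```

A caller would then store the returned bytes instead of the original upload, for example: byte[] safe = ImageSanitizer.reencodeAsPng(uploadedBytes);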

Other things to consider: The above will remove malicious code in the metadata. However, there may be images specially crafted for DoS (small files that decode to huge in-memory representations, TIFF IFD loops, and much more). These problems need to be addressed in the image decoders for the various input formats. But at least your output files should be safe from this.
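One partial mitigation for the "small file, huge bitmap" case is to ask the reader for the image dimensions before decoding any pixels, and to refuse images above a limit you choose. The following is only a sketch of that idea: the class name BoundedImageDecoder, the method name decode, and the maxPixels parameter are assumptions made for illustration, and it does nothing about decoder bugs such as IFD loops.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper; names and the limit parameter are illustrative only.
final class BoundedImageDecoder {

    // Decodes the first image in the upload, but only if width * height <= maxPixels.
    static BufferedImage decode(byte[] upload, long maxPixels) throws IOException {
        try (ImageInputStream stream =
                 ImageIO.createImageInputStream(new ByteArrayInputStream(upload))) {
            if (stream == null) {
                throw new IOException("Could not open image stream");
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(stream);
            if (!readers.hasNext()) {
                throw new IOException("Upload is not a supported image format");
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(stream);
                // The dimensions are queried from the reader before reader.read(0)
                // allocates the full raster.
                long pixels = (long) reader.getWidth(0) * reader.getHeight(0);
                if (pixels > maxPixels) {
                    throw new IOException("Image too large: " + pixels + " pixels");
                }
                return reader.read(0);
            } finally {
                reader.dispose();
            }
        }
    }
}
```

The limit itself is a judgment call; a value derived from your memory budget per request (for example a few tens of megapixels) is a reasonable starting point.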

In addition, malicious code could be stored in the ICC profile, which might be carried over to the output image. You can probably avoid this by forcibly converting all images to the built-in sRGB color space, or by writing the images without ICC profiles.
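One way to force the conversion is to draw the decoded image into a fresh BufferedImage of type TYPE_INT_RGB or TYPE_INT_ARGB, both of which use the default sRGB color model, and to write that copy instead of the original. This is a minimal sketch; the class name SrgbConverter and the method name toSrgb are made up for illustration, and whether the PNG writer would otherwise embed the source profile has not been verified here.

```java
import java.awt.AlphaComposite;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper; the class and method names are illustrative only.
final class SrgbConverter {

    // Copies the pixels into a new image backed by the built-in sRGB color model,
    // so the copy no longer references the source image's color space or profile.
    static BufferedImage toSrgb(BufferedImage source) {
        int type = source.getColorModel().hasAlpha()
                ? BufferedImage.TYPE_INT_ARGB
                : BufferedImage.TYPE_INT_RGB;
        BufferedImage copy = new BufferedImage(source.getWidth(), source.getHeight(), type);
        Graphics2D g = copy.createGraphics();
        try {
            // Src replaces destination pixels outright, which keeps the alpha values intact.
            g.setComposite(AlphaComposite.Src);
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return copy;
    }
}
```

The result can be passed straight to ImageIO.write(copy, "PNG", output). Note that colors from wide-gamut sources are converted to sRGB in the process, so this trades color fidelity for a predictable output.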


PS: The ImageTranscoder interface is intended for situations where you want to keep as much metadata as possible (that is why it only has methods for metadata), and it allows transformation of metadata from one file format to another (one could argue the name should have been MetadataTranscoder).