Multi-threaded loading of thousands of images causes IOException

I am having an issue loading a large number of images via a ForkJoinPool. I am testing on a 4-core Intel CPU with hyper-threading, so 8 logical threads, but I limit the pool to only 4 threads. Even so, I receive errors from ImageIO as if it cannot find the image.

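import java.awt.color.CMMException;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.concurrent.RecursiveAction;
import javax.imageio.ImageIO;

public class LoadImages extends RecursiveAction {
    private static final long serialVersionUID = 1L;

    // this is an example; totalImages and totalThreads are defined elsewhere
    private static int threadThreshold = totalImages/totalThreads + 2;

    private String[] imgArr;
    private int arrStart = 0;
    private int arrSize = 0;

    public LoadImages(String[] imgs, int start, int size) {
        imgArr = imgs;
        arrSize = size;
        arrStart = start;
    }

    // Loads every image in this task's slice [arrStart, arrStart + arrSize)
    protected void processImages() {
        BufferedImage img = null;
        for (int i = arrStart; i < arrStart + arrSize; i++) {
            try {
                img = ImageIO.read(new File(imgArr[i]));
            } catch (IOException | CMMException | NullPointerException e) {
                // Print the path of the failing file, then the stack trace
                System.out.println(imgArr[i]);
                e.printStackTrace();
                img = null;
            }

            ...

        }
    }

    @Override
    protected void compute() {
        // Check the number of files: a small enough slice is processed directly
        if (arrSize <= threadThreshold) {
            processImages();
            return;
        } else {
            // Otherwise split the slice in half and run both halves as subtasks
            int split = arrSize / 2;

            invokeAll(new LoadImages(imgArr, arrStart, split), new LoadImages(imgArr, arrStart + split, arrSize - split));
        }
    }
}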

Any insight into what I am doing wrong would be great. I notice it really only breaks when I have more than 1700 images and all of the images are 5 MB or larger.

Here is the error I am receiving from Java:

javax.imageio.IIOException: Can't create an ImageInputStream!
at javax.imageio.ImageIO.read(Unknown Source)

This happens even though I know the file is there. I used this code as a guide: https://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html

There are 3 best solutions below

This seems kind of random. My guess is that it could just be a hardware or OS error. Assuming this is a scaling issue, my advice with your 1700+ images is that you would probably be better off setting this up in the cloud somewhere; it could save a lot of time and headaches.

If you inspect the source for ImageIO.read(File) and ImageIO.read(ImageInputStream), you can see that ImageIO reuses instances of ImageReader, and this article says that ImageReader is not thread-safe. You will probably have to create your own ImageReader instances for use in the separate threads.
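As a rough sketch of that suggestion, the helper below obtains a fresh ImageReader for every call, so no reader instance is ever shared between worker threads. The class and method names (PerTaskImageReader, readWithOwnReader) are made up for illustration, and whether this removes the error in the question is an assumption that would need testing.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

class PerTaskImageReader {
    /** Decodes one file with an ImageReader that is created, used and disposed inside this call. */
    static BufferedImage readWithOwnReader(File file) throws IOException {
        // createImageInputStream returns null when no stream provider can open the input
        ImageInputStream iis = ImageIO.createImageInputStream(file);
        if (iis == null) {
            throw new IOException("No ImageInputStream available for " + file);
        }
        try {
            // Ask ImageIO for readers that claim to understand this stream's format
            Iterator<ImageReader> readers = ImageIO.getImageReaders(iis);
            if (!readers.hasNext()) {
                throw new IOException("No ImageReader found for " + file);
            }
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly = true, ignoreMetadata = true: only the pixels are needed
                reader.setInput(iis, true, true);
                return reader.read(0);
            } finally {
                // Release any resources held by the reader
                reader.dispose();
            }
        } finally {
            // Always release the file handle, even if decoding failed
            iis.close();
        }
    }
}
```

Inside processImages() from the question, the call would replace ImageIO.read(new File(imgArr[i])) with PerTaskImageReader.readWithOwnReader(new File(imgArr[i])).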

Also, you should measure how much this multithreaded I/O strategy really gains you. If you are trying to pull gigabytes of image data off a spinning hard drive, your process will probably be I/O bound, and parallelizing the loading won't give you much.
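One way to check this is to time the same list of files once sequentially and once in parallel. The sketch below is only illustrative: LoadTiming and its methods are hypothetical names, the parallel run uses parallelStream() on the common pool instead of the 4-thread ForkJoinPool from the question, and a single run is affected by the OS file cache, so each mode should be repeated several times before drawing conclusions.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.stream.Stream;
import javax.imageio.ImageIO;

class LoadTiming {
    /** Returns true if the file could be decoded into an image. */
    static boolean loads(File file) {
        try {
            return ImageIO.read(file) != null;
        } catch (IOException e) {
            return false;
        }
    }

    /**
     * Loads every file, sequentially or in parallel.
     * Returns a two-element array: {number of decoded images, elapsed nanoseconds}.
     */
    static long[] timeLoad(List<File> files, boolean parallel) {
        long start = System.nanoTime();
        Stream<File> stream = parallel ? files.parallelStream() : files.stream();
        long decoded = stream.filter(LoadTiming::loads).count();
        long elapsed = System.nanoTime() - start;
        System.out.printf("%s: %d/%d images in %.1f ms%n",
                parallel ? "parallel" : "sequential", decoded, files.size(), elapsed / 1e6);
        return new long[] {decoded, elapsed};
    }
}
```

If the parallel time is not clearly lower than the sequential one on the real data set, the disk is likely the bottleneck and the extra threads add little.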

This looks to me like an ImageIO error that occurs when it internally creates the ImageInputStream. Have you tried reading the images through an ImageInputStream that you create yourself? For example:

try (InputStream is = new FileInputStream("path")) {
    ImageInputStream iis = ImageIO.createImageInputStream(is);
    // ImageIO.read closes iis itself unless it returns null
    BufferedImage bufImage = ImageIO.read(iis);
}
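A slightly fuller version of the same idea might look like the sketch below. It assumes the goal is to see exactly which step fails and to make sure the file handle is always released; StreamImageLoader and readViaStream are hypothetical names, not part of ImageIO.

```java
import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;
import javax.imageio.stream.ImageInputStream;

class StreamImageLoader {
    /** Reads an image through an explicitly created ImageInputStream. */
    static BufferedImage readViaStream(String path) throws IOException {
        // try-with-resources guarantees the underlying file handle is released
        try (InputStream in = new FileInputStream(path)) {
            ImageInputStream iis = ImageIO.createImageInputStream(in);
            if (iis == null) {
                throw new IOException("Could not create an ImageInputStream for " + path);
            }
            // ImageIO.read(ImageInputStream) closes iis itself, except when it returns null
            BufferedImage image = ImageIO.read(iis);
            if (image == null) {
                iis.close();
                throw new IOException("No registered ImageReader can decode " + path);
            }
            return image;
        }
    }
}
```

With this helper, a failure to open the file surfaces as an exception from FileInputStream, while a failure inside ImageIO is reported separately, which may help narrow down where the problem in the question comes from.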