Java: How to efficiently read a zip file and create a byte[] using multithreading and async


I am developing a service-layer method that receives a .zip file (up to roughly 600-700 MB) as a MultipartFile. Of all the files in that archive, only 4-5 JSON files are of interest to me. I read them from the zip using ZipInputStream and store them as String values for further use.

Service class:

@Async("taskExecutor")
public CompletableFuture<ResponseEntity<?>> methodname(MultipartFile file) throws IOException {

    // Declared outside the loop so the values are still in scope after the zip has been read.
    String value1 = null;
    String value2 = null;
    String value3 = null;

    try (ZipInputStream zipFileStream = new ZipInputStream(file.getInputStream())) {
        ZipEntry entry;
        while ((entry = zipFileStream.getNextEntry()) != null) {
            String entryName = entry.getName();

            // ZipInputStream reports end-of-stream at the end of the current entry, so this
            // reader only consumes the current entry. It is deliberately not closed, because
            // closing it would also close the underlying zip stream.
            // UTF-8 is assumed here for the JSON content.
            BufferedReader br = new BufferedReader(new InputStreamReader(zipFileStream, StandardCharsets.UTF_8));

            if (entryName.contains("<file1name>")) {
                value1 = br.lines().collect(Collectors.joining("\n"));
            }

            if (entryName.contains("<file2name>")) {
                value2 = br.lines().collect(Collectors.joining("\n"));
            }

            if (entryName.contains("<file3name>")) {
                value3 = br.lines().collect(Collectors.joining("\n"));
            }

            zipFileStream.closeEntry();
        }
    }

    // value1 and value2 are merged based on some condition to produce the final value1.
    // some logic to prepare a file

    if (fileExists) {
        // create byte[] and HttpHeaders with content disposition and media type,
        // and return them wrapped in a CompletableFuture<ResponseEntity<?>>
    }
}

I have annotated the method with @Async (I have created an Executor bean in a config class), but I still have not been able to figure out how to run the different steps of this method asynchronously or on multiple threads to make the processing faster. The entire process still runs on a single thread from that executor's pool.

Can anyone please advise how I can introduce asynchronous or multi-threaded processing in the method above, so that steps like

  • Reading the Zip file
  • Creating the final byte[]

can be done a little bit faster to reduce the overall response time.


There is 1 best solution below

Marc Stroebel

Store the MultipartFile in a temp file and try ZipFile, which supports streams out of the box:

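// "dummy.zip" is a placeholder for the temp file the upload was written to.
// try-with-resources closes the ZipFile once the stream has been fully consumed.
try (ZipFile zipFile = new ZipFile("dummy.zip")) {
  zipFile
    .stream()
    .parallel()
    .filter(entry -> entry.getName().matches("regexFile1")
      || entry.getName().matches("regexFile2")
      || entry.getName().matches("regexFile3")
    )
    .map(entry -> {
      // Close each entry stream after reading; readAllBytes() requires Java 9+.
      try (InputStream in = zipFile.getInputStream(entry)) {
        return new EntryDto(entry.getName(), new String(in.readAllBytes(), StandardCharsets.UTF_8));
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    })
    .map(dto -> {
      // custom logic
      return ...;
    })
    .collect(Collectors.toList());
}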
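The snippet above opens "dummy.zip" from disk, so the upload has to be written to a file first. A minimal sketch of that step, using only the JDK: TempZipStore and saveToTempFile are hypothetical names chosen for illustration, and in the Spring service the argument would presumably be file.getInputStream() from the MultipartFile.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
final class TempZipStore {

    // Copies the uploaded stream to a new temp file and returns its path.
    // The caller is responsible for deleting the file when it is no longer needed.
    static Path saveToTempFile(InputStream upload) throws IOException {
        Path tmp = Files.createTempFile("upload-", ".zip");
        try (InputStream in = upload) {
            // createTempFile already created an empty file, so it must be overwritten.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave a partial file behind if the copy fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
        return tmp;
    }
}
```

The returned path can then be passed to new ZipFile(path.toFile()) in place of the "dummy.zip" placeholder.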

DTO class:

class EntryDto {
    private String name;
    private String json;

    public EntryDto(String name, String json) {
        this.name = name;
        this.json = json;
    }

    public String getName() {
        return name;
    }

    public String getJson() {
        return json;
    }
}
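The question needs each JSON as its own String (value1, value2, ...), so it may be more convenient to collect the entries into a map keyed by entry name. The following is a sketch under assumptions, not the answer's own code: ZipJsonReader and readMatching are hypothetical names, the match uses contains() as in the question rather than regexes, the content is assumed to be UTF-8, and readAllBytes() requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original answer.
final class ZipJsonReader {

    // Returns entry name -> content for every non-directory entry whose name
    // contains at least one of the given fragments.
    static Map<String, String> readMatching(Path zipPath, List<String> nameFragments) throws IOException {
        try (ZipFile zipFile = new ZipFile(zipPath.toFile())) {
            return zipFile.stream()
                    .filter(e -> !e.isDirectory())
                    .filter(e -> nameFragments.stream().anyMatch(e.getName()::contains))
                    .collect(Collectors.toMap(
                            ZipEntry::getName,
                            e -> readUtf8(zipFile, e),
                            (first, second) -> first)); // keep the first entry if a name repeats
        }
    }

    // Reads one entry fully and closes its stream afterwards.
    private static String readUtf8(ZipFile zipFile, ZipEntry entry) {
        try (InputStream in = zipFile.getInputStream(entry)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Because ZipFile looks entries up through the archive's central directory, it can go straight to the wanted entries instead of reading through the whole archive as ZipInputStream does. For only 4-5 small JSON files that is likely where most of the gain comes from, with or without .parallel().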