Can't get both the CSV filename and its content with an async read inside a zip


I'm trying to write a function that loads a zip file containing multiple other zip files, iterates through them, and extracts all the CSV files it finds so they can be merged into one aggregated CSV. The problem I ran into is that I can't map the content of each CSV to its filename or directory inside the async callback. Here's a snippet where myzip2 is a JSZip object, files is a dictionary of the filenames, and csvfiles is my output array.

    for (var key in files) {
        if ( files[key].name.includes('.csv') ) {
            myzip2.file( files[key].name ).async("string").then( function(result) {
                csvfiles.push( $.csv.toArrays(result) ); // contains the csv content
                csvfiles.push( files[key].name ); // undefined in async
            });
        }
    }

I would like to push both the CSV content and the filename at the same time, but the filename is undefined inside the callback. How can I get it? I haven't had much luck googling my issue; I probably lack the correct wording.

Thanks


If anyone runs into the same problem: matching the filename to the content is a closure issue, which I fixed with arrow functions and per-iteration variables. Matching the correct zip file name is a separate issue, caused by JSZip appending to the current object. Use new JSZip() if you need to clear the old one.

Full code, where zipfiles is the list of zip file names contained in mainzip:

var csvfiles = []
for (var i = 0; i < zipfiles.length; i++) {

    let thiszipname = zipfiles[i];
    let thiszip = mainzip.file( thiszipname ).async("blob");

    var newzip = new JSZip();
    newzip.loadAsync( thiszip ).then(
        subzip => {
            let subfiles = subzip.files;
            Object.keys(subfiles).forEach( filename => {

                if ( filename.includes('.csv') ) {
                    subzip.file( filename ).async("string").then(
                        readData => {
                            var obj = $.csv.toObjects(readData); // works great
                            obj['csvfile'] = filename            // correct file
                            obj['zipfile'] = thiszipname;        // incorrect zipfile...
                            csvfiles.push( obj );
                        }
                    );
                }
            })

        }
    )
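}
// Note: this runs before the promises above have resolved,
// so csvfiles may still be empty at this point.
console.log(csvfiles)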
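One way to make sure the result is only read after every nested read has finished is to put the loops in an `async` function and `await` each step. The following is a minimal sketch, not the code from the post: `collectCsv`, `openZip` and `makeArchive` are hypothetical names, and `makeArchive` is a stand-in that mimics only the `files` and `file(name).async(type)` members used here so the example runs without the library. With JSZip, `openZip` would presumably be `data => new JSZip().loadAsync(data)`.

```javascript
// Hypothetical helper: walks every inner zip and collects its CSV entries.
// `openZip` turns the data of one inner zip into a freshly loaded archive.
async function collectCsv(mainzip, zipfiles, openZip) {
  const collected = [];
  for (const zipname of zipfiles) {
    const data = await mainzip.file(zipname).async("blob");
    const subzip = await openZip(data); // a new archive per inner zip
    for (const filename of Object.keys(subzip.files)) {
      if (!filename.endsWith(".csv")) continue;
      const text = await subzip.file(filename).async("string");
      // zipname and filename are block-scoped, so each record gets its own pair.
      collected.push({ zipfile: zipname, csvfile: filename, text });
    }
  }
  return collected;
}

// Stand-in archive (assumption): just enough surface for the sketch above.
const makeArchive = entries => ({
  files: Object.fromEntries(Object.keys(entries).map(n => [n, { name: n }])),
  file: n => ({ async: () => Promise.resolve(entries[n]) }),
});

const innerArchives = {
  "one.zip": makeArchive({ "a.csv": "x\n1" }),
  "two.zip": makeArchive({ "b.csv": "x\n2", "readme.txt": "skip" }),
};
// In this stand-in, the "blob" of each inner zip is the inner archive itself.
const fakeMain = makeArchive(innerArchives);

const nestedResult = collectCsv(
  fakeMain,
  Object.keys(innerArchives),
  data => Promise.resolve(data)
);

// Logging inside .then() guarantees that all reads have completed.
nestedResult.then(rows => console.log(rows));
```

The reads run one after another here, which keeps the order predictable; parsing each `text` (for example with `$.csv.toObjects`) would go where the record is pushed.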

Best answer

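The problem is the scope of the loop variable key, not the async call itself. Because key is declared with var, every callback shares the same binding, and by the time the callbacks run the loop has already finished.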

A closure is the combination of a function bundled together (enclosed) with references to its surrounding state (the lexical environment). In other words, a closure gives you access to an outer function’s scope from an inner function. In JavaScript, closures are created every time a function is created, at function creation time.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Closures


As you are doing this in a loop, you need to take care of the "closure in loop" problem.

Please check out the section "Creating closures in loops: A common mistake" following the same link: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Closures
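The effect can be reproduced without any zip library. The sketch below is an illustration only (the `files` object and the variable names are made up): it creates callbacks inside a loop and runs them after the loop has ended, which is what happens to the `.then()` callbacks.

```javascript
// Made-up data shaped like the `files` dictionary in the question.
const files = { a: { name: "a.csv" }, b: { name: "b.csv" } };

// Version 1: the callbacks read `files[key]` when they run.
// `var key` is a single variable shared by all iterations.
const withVar = [];
for (var key in files) {
  withVar.push(() => files[key].name);
}

// Version 2: the name is copied into a block-scoped `const`,
// so each iteration gets its own binding.
const withConst = [];
for (var key2 in files) {
  const name = files[key2].name;
  withConst.push(() => name);
}

// Run the callbacks only after both loops have finished.
const varResults = withVar.map(fn => fn());
const constResults = withConst.map(fn => fn());

console.log(varResults);   // [ 'b.csv', 'b.csv' ]
console.log(constResults); // [ 'a.csv', 'b.csv' ]
```

Declaring the loop variable itself with `let` or `const` (`for (const key in files)`) also gives each iteration a fresh binding.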


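To fix it, copy the file name into a block-scoped variable (const or let) inside the loop body, so that each callback captures its own value. The arrow function below is just a shorter way to write the callback; a regular function would capture the variable in the same way.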

for (var key in files) {
    // It's important to set the value of the `files[key]` to the separate variable.
    // Because otherwise the last element of the `files` will be captured in the closure for the expression `files[key]`.
    const fileName = files[key].name;
    myzip2
        .file(fileName)
        .async("string")
        // Arrow function `result => {}` creates a closure that captures `fileName` variable from the external scope
        .then(result => {
            csvfiles.push($.csv.toArrays(result));
            csvfiles.push(fileName);
        });
}
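If the goal is to keep each file name next to its content, one option is to resolve every read to a single record and gather the records with `Promise.all`. This is a sketch under assumptions, not part of the original answer: `readCsvEntries` and `fakeZip` are hypothetical names, and `fakeZip` only imitates the `files` and `file(name).async("string")` members so the example runs on its own. With the real library, the loaded JSZip object would be passed in instead.

```javascript
// Stand-in for a loaded zip object (assumption): only the members used below.
function fakeZip(entries) {
  const files = {};
  for (const name of Object.keys(entries)) files[name] = { name };
  return {
    files,
    file: name => ({ async: () => Promise.resolve(entries[name]) }),
  };
}

// Resolves to [{ name, text }, ...], one record per .csv entry.
function readCsvEntries(zip) {
  const names = Object.keys(zip.files).filter(n => n.endsWith(".csv"));
  return Promise.all(
    names.map(name =>
      // `name` is a parameter of this callback, so each read keeps its own name.
      zip.file(name).async("string").then(text => ({ name, text }))
    )
  );
}

const demoZip = fakeZip({
  "a.csv": "x,y\n1,2",
  "notes.txt": "not a csv",
  "b.csv": "x,y\n3,4",
});

const csvEntries = readCsvEntries(demoZip);
csvEntries.then(entries => console.log(entries.map(e => e.name)));
```

`Promise.all` keeps the results in the same order as `names`, regardless of which read finishes first. Parsing (for example `$.csv.toArrays(text)`) could replace `text` in the record.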