Disclaimer: this is a self-answered post, written to hopefully save others time.
Setup:
I've been using Chrome's implementation of the Native File System API [1] [2] [3].
This requires enabling the flag chrome://flags/#native-file-system-api.
For starters I want to recursively read a directory and obtain a list of files. This is simple enough:
let paths = [];
// path is an array of name segments; handle is a directory handle.
let recursiveRead = async (path, handle) => {
  let reads = [];
  // window.handle = handle;
  for await (let entry of await handle.getEntries()) { // <<< HANGING
    if (entry.isFile)
      paths.push(path.concat(entry.name));
    // Placeholder: replace `true` with some whitelist criteria to restrict which dirs are read.
    else if (true)
      reads.push(recursiveRead(path.concat(entry.name), entry));
  }
  // Wait for all sub-directory reads started above.
  await Promise.all(reads);
  console.log('done', path, paths.length);
};
chooseFileSystemEntries({type: 'openDirectory'}).then(handle => {
  recursiveRead([], handle).then(() => {
    console.log('COMPLETELY DONE', paths.length);
  });
});
I've also implemented a non-recursive version driven by a while loop over a queue, and lastly a Node fs.readdir version. All 3 solutions work fine for small directories.
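For comparison, here is a minimal sketch of what a Node fs.readdir version could look like. It is not the exact code used for the measurements in this post; the names listFiles and allowDir are made up for illustration, and allowDir stands in for the whitelist criteria mentioned above.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect file paths under `root`, relative to `root`.
// `allowDir` is a hypothetical whitelist predicate on directory names.
const listFiles = async (root, allowDir = () => true) => {
  const found = [];
  const walk = async relDir => {
    const entries = await fs.promises.readdir(
        path.join(root, relDir), {withFileTypes: true});
    const reads = [];
    for (const entry of entries) {
      const rel = path.join(relDir, entry.name);
      if (entry.isFile())
        found.push(rel);
      else if (entry.isDirectory() && allowDir(entry.name))
        reads.push(walk(rel)); // sub-directories are read concurrently
    }
    await Promise.all(reads);
  };
  await walk('');
  return found;
};
```

Calling listFiles('/some/dir').then(files => console.log(files.length)) would print the number of files found; the directory path here is only an example.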
The problem:
But then I tried running it on some sub-directories of the Chromium source code ('base', 'components', and 'chrome'); together the 3 sub-dirs contain ~63,000 files. The Node implementation worked fine (and, surprisingly, it made use of cached results between runs, so runs after the first were instantaneous), but both browser implementations hung.
Attempted debugging:
Sometimes they would return the full 63k files and print 'COMPLETELY DONE' as expected. But most often (90% of the time) they would read 10k-40k files before hanging.
I dug deeper into the hanging, and apparently the for await line was hanging. So I added the line window.handle = handle immediately before the for loop; when the function hung, I ran the for loop directly in the browser console, and it worked correctly! So now I'm stuck. I have seemingly working code that randomly hangs.
Solution:
I tried skipping over directories that would hang.
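One way to skip a hanging directory is to race the read against a timer. The following is a sketch under assumptions, not the exact code that was run: the helper names (withTimeout, readEntries, readEntriesOrSkip) and the 1000ms default are made up, and a read that has not finished by the deadline is simply treated as hung.

```javascript
// Sentinel returned when a read does not finish in time.
const TIMED_OUT = Symbol('timed out');

// Resolve with the promise's value, or with TIMED_OUT after `ms` milliseconds.
const withTimeout = (promise, ms) => {
  let timer;
  const timeout = new Promise(resolve => {
    timer = setTimeout(() => resolve(TIMED_OUT), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
};

// Drain a directory handle's entries into an array so the whole read can be raced.
const readEntries = async handle => {
  const entries = [];
  for await (const entry of await handle.getEntries())
    entries.push(entry);
  return entries;
};

// Return the entries, or an empty list if the read appears to hang.
// The 1000ms default is an arbitrary assumption.
const readEntriesOrSkip = async (handle, ms = 1000) => {
  const result = await withTimeout(readEntries(handle), ms);
  if (result === TIMED_OUT) {
    console.warn('skipping directory that did not respond:', handle.name);
    return [];
  }
  return result;
};
```

Note that the abandoned read is not cancelled; its promise is just ignored once the timer wins the race.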
And the results showed a pattern. Once a directory read hung and was skipped, the subsequent ~10 dir reads would likewise hang and be skipped. Then the following reads would resume functioning properly until the next similar incident.
So the issue seemed temporal. I added a simple retry wrapper with a 500ms wait between retries, and the reads began working fine.
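A retry wrapper along these lines might look like the sketch below. Only the 500ms wait comes from the description above; the name retryOnHang, the number of tries, and the 1000ms per-attempt timeout are assumptions. Since a hung read never rejects, each attempt is raced against a timer to detect the hang.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Call `attempt` (a function returning a promise). If an attempt does not settle
// within `timeoutMs`, wait `waitMs` and call it again, up to `tries` times.
// tries and timeoutMs are assumed defaults; waitMs matches the 500ms wait.
const retryOnHang = async (attempt, {tries = 5, timeoutMs = 1000, waitMs = 500} = {}) => {
  for (let i = 0; i < tries; i++) {
    let timer;
    const hung = new Promise(resolve => {
      timer = setTimeout(() => resolve({hung: true}), timeoutMs);
    });
    const outcome = await Promise.race([
      attempt().then(value => ({hung: false, value})),
      hung,
    ]).finally(() => clearTimeout(timer));
    if (!outcome.hung)
      return outcome.value;
    if (i < tries - 1)
      await sleep(waitMs); // give the API time to recover before retrying
  }
  throw new Error(`read still hanging after ${tries} tries`);
};
```

Inside recursiveRead, attempt could be a function that drains handle.getEntries() into an array, so that each retry starts a fresh read of the directory.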
Conclusion:
The non-standard Native File System API hangs when reading large directories. Simply retrying after waiting resolves the issue. Took me a good week to arrive at this solution, so thought it'd be worth sharing.