I'm trying to parse a specification website from saved HTML on my computer. I can post the file upon request.
I'm burnt out trying to figure out why it won't run synchronously. The comments should log the CCCC's first, then BBBB's, then finally one AAAA.
The code I'm running will not wait at the first hurdle (it prints AAAA... first). Am I using request-promise incorrectly? What is going on?
Is this due to the .each() method of cheerio (I'm assuming it's synchronous)?
const rp = require('request-promise');
const fs = require('fs');
const cheerio = require('cheerio');

async function parseAutodeskSpec(contentsHtmlFile) {
    const topics = [];
    const contentsPage = cheerio.load(fs.readFileSync(contentsHtmlFile).toString());
    const contentsSelector = '.content_htmlbody table td div div#divtreed0e338374 nobr .toc_entry a.treeitem';
    contentsPage(contentsSelector).each(async (idx, topicsAnchor) => {
        const topicsHtml = await rp(topicsAnchor.attribs['href']);
        console.log("topicsHtml.length: ", topicsHtml.length);
    });
    console.log("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA");
    return topics;
}
Try it this way:
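Now the await is outside of map or each, neither of which waits for an async callback the way you might expect: each callback returns a promise immediately, and the loop moves on without awaiting it.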
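As a rough sketch of what "await outside of map or each" can look like: the two helpers below are hypothetical names, not part of cheerio or request-promise. They assume the hrefs have already been collected into a plain array, and that `fetchPage` is any function returning a promise of the page HTML (for instance `url => rp(url)`).

```javascript
// Sketch only: fetchTopicsInOrder, fetchTopicsInParallel and fetchPage are
// hypothetical names introduced for illustration.

// Sequential: a plain for...of loop lives inside the async function itself,
// so each await really pauses the loop before the next request starts.
async function fetchTopicsInOrder(hrefs, fetchPage) {
    const topics = [];
    for (const href of hrefs) {
        const topicsHtml = await fetchPage(href);
        topics.push(topicsHtml);
    }
    return topics;
}

// Parallel: map starts every request at once and returns an array of
// promises; the single await on Promise.all sits outside the callback.
// Results come back in the same order as hrefs.
async function fetchTopicsInParallel(hrefs, fetchPage) {
    return Promise.all(hrefs.map(href => fetchPage(href)));
}
```

With cheerio, the href array could be built with something like `contentsPage(contentsSelector).map((idx, a) => a.attribs['href']).get()`, and the function would then do `const topics = await fetchTopicsInOrder(hrefs, url => rp(url));` before logging the AAAA line. The sequential version is the one that matches the ordering asked for in the question; the parallel one is usually faster when order of execution does not matter.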