I've been web scraping with Cheerio, and I can find the particular HTML element I'm looking for, but for some reason its text is missing.
For example, when I use Inspect Element in my web browser, I see:
<a href = "#" data-bind="text: MovieName, attr: { href: DetailsUrl }">Why Him?</a>
But when I print out the object while scraping, I see:
<a href = "#" data-bind="text: MovieName, attr: { href: DetailsUrl }"></a>
So when I call the .text() function, it doesn't return anything. Why does this happen?
Inspect Element is not a reliable way to check what Cheerio will be able to see. Use View Source instead.
Inspect Element is a live view of how the browser has rendered an element after applying everything that runs in a browser, including CSS and JavaScript. View Source, on the other hand, shows the raw markup that the server sent to the browser, which is generally what Cheerio will receive too, provided your request sends the same HTTP headers, particularly the ones relevant to content negotiation.
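As a rough sketch of that check in code: fetch the raw HTML with browser-like headers and see whether the text you want is in it at all. This assumes Node 18+ (for the built-in fetch). The header values, the URL in the usage note, and the helper names fetchRawHtml and appearsInSource are all made up for illustration. Copy the real header values from your browser's network tab.

```javascript
// Hypothetical header values: replace them with what your browser actually sends.
const BROWSER_LIKE_HEADERS = {
  'User-Agent': 'Mozilla/5.0 (compatible; example-scraper)',
  'Accept': 'text/html,application/xhtml+xml',
  'Accept-Language': 'en-US,en;q=0.9',
};

// Fetch the markup exactly as the server sends it (no JavaScript is executed).
async function fetchRawHtml(url, headers = BROWSER_LIKE_HEADERS) {
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Plain substring check: is the text present in the raw source at all?
function appearsInSource(html, text) {
  return html.includes(text);
}
```

For example, fetchRawHtml('https://example.com/movies').then(html => console.log(appearsInSource(html, 'Why Him?'))) would log false if the title is only filled in by JavaScript. In that case Cheerio will not see it either.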
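It is important to understand that while Cheerio is a DOM parser, it does not simulate a browser. If the text is added via JavaScript, it will not be there, because that JavaScript never runs. The data-bind attribute in your example suggests that is what is happening here: the syntax looks like a Knockout.js binding, which fills in the text on the client side after the page loads.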
If browser simulation is important to you, you should look into using PhantomJS. If you need a highly realistic browser rendering setup, then look into WebDriver and Leadfoot.