I don't know what's going on behind the scenes with these objects, but I use them all the time with great success.
dim ie As New SHDocVw.InternetExplorer
ie.navigate url
Set oDoc = ie.Document
iT = oDoc.body.innerText
AND
dim oX As New MSXML2.XMLHTTP
dim oDoc As HTMLDocument
oX.Open "GET", url, False
oX.send
rT = oX.responseText
But for one particular webpage the ie object hangs, creates errors, makes my hard-drive thrash and frustrates me completely.
I like the MSXML2 object because it's always fast and trouble-free. The problem on this particular webpage is it is not returning the right info. It seems like it is getting the wrong page or it's not getting the whole thing.
My minimum requirement is getting the entire html for the page -- I can parse from there.
Here is an example of the pages I'm trying to get: http://www.nfl.com/gamecenter/2011090800/2011/REG1/saints@packers?icampaign=GC_schedule_rr#menu=highlights&tab=analyze&analyze=playbyplay
My ambition is to get them all, back to about 2001 when they started keeping the play-by-play record. Something about the technology they're using is upsetting ie and blocking msxml2.
Any suggestions to get me going in the right direction?
This isn't the sexiest solution but here you go:
If you put this in a loop, just make sure you delete out the query tables, otherwise excel will stop running when there are too many connections.