I am trying to get data from the web. I tried two avenues in Excel VBA. Both work but yield different results and there is also a noticeable speed difference.
The first method is using Excel Web Query tool.
VBA code I obtained by recording a macro.
ActiveWorkbook.Queries.Add name:="Table 0", Formula:= _
"let" & Chr(13) & "" & Chr(10) & " Source = Web.Page(Web.Contents(""" & webSite & """))," & Chr(13) & "" & Chr(10) & " Data0 = Source{0}[Data]" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & " Data0"
With ActiveSheet.ListObjects.Add(SourceType:=0, Source:="OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 0"";Extended Properties=""""" _
, Destination:=Range("$A$1")).QueryTable
.CommandType = xlCmdSql
.CommandText = Array("SELECT * FROM [Table 0]")
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False
.RefreshOnFileOpen = False
.BackgroundQuery = False
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = False
.AdjustColumnWidth = True
.RefreshPeriod = 0
.PreserveColumnInfo = True
.ListObject.DisplayName = "Table_0"
.Refresh BackgroundQuery:=False
End With
The second method is using MSXML2.XMLHTTP60. This method is extremely fast but the table hasn't any headers, which I need.
I noticed the code of the Web Query, to retrieve the headers
Data0 = Source{0}[Data]
is not working using XMLHTTP60.
Is there a way to retrieve the column headers by using MSXML2.XMLHTTP60 by adding or modifying some of code below?
Sub test()
Dim XMLRequest As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim HTMLDiv As MSHTML.IHTMLElement
Dim HTMLTable As MSHTML.IHTMLElement
Dim HTMLDiv2 As MSHTML.IHTMLElement
Dim HTMLTable2 As MSHTML.IHTMLElement
XMLRequest.Open "GET", "webSite", False
XMLRequest.send
If XMLRequest.Status <> 200 Then
MsgBox XMLRequest.Status & "-" & XMLRequest.statusText
Exit Sub
End If
HTMLDoc.body.innerHTML = XMLRequest.responseText
Set HTMLDiv = HTMLDoc.getElementById("__next")
Set HTMLTable = HTMLDiv.getElementsByTagName("table")(0)
Set HTMLDiv = HTMLDoc.getElementsByClassName("tables-container")
Set HTMLTable2 = HTMLDiv.getElementsByClassName("tables-container")(0).getElementsByClassName("time")
Debug.Print HTMLDiv.className
Call subWriteTableToWs(HTMLTable)
End Sub
I've had the same problem. The only solution I could find was to save the web page as a single file or save the HTML page and then have a cursor reading the page.
This might work if you do it for one page, but done in a loop for several pages then I am not sure it would a feasible solution.
I would really be happy to read about a solution based on XMLHTTP60
You can also find a similar issue in this post where the headers could have been scraped using XMLHTTP60 but not the table
Excel VBA IE verses XMLHTTP discrepencies