I am trying to program a Webcrawler, using Visual Basic. I have a list with links, stored in an Excel (column 1). The Macro should then open each link and add certain information from the website to the excel file. Here's the first link (stored in field A2).
The Macro should identify and insert the name of the hotel into column 2 (B2), the rating in column 3 (C2) and the address in column 4 (D2). This process could then be repeated with a loop for all other links (all websites have the same structure).
My code so far (I did not add the loop yet):
Sub Hoteldetails()
Dim IEexp As Object
Set IEexp = CreateObject("InternetExplorer.Application")
IEexp.Visible = True
Range("A2").Select
Selection.Hyperlinks(1).Follow NewWindow:=False, AddHistory:=True
End Sub
How can I "select" the specific data I want and insert it into the excel file? I tried to record the macro via "Add Data", but was not able to import the data from the website. I also tried to do it by using various example codes, but it did not work out for my specific website.
Thanks a lot for any assistance!
You have several options:
Option 1: IEObject
Either you need to use the
getElementBy
methods in IEObject and use string manipulation to extract the data you need. 2 options for string extractions:Name
or byId
then use string manipulation functions such asMid
,InStr
,Left
andRight
Option 2: Scrape HTML Add-In
Sometime ago I developed an AddIn for Excel that allows you to easily scrape HTML data within an Excel formula. The process is similar as above as you still need to create a relevant Regex. See an example below for TripAdvisor:
The formula in
B2
looks like this (A2 is the link, and the second argument is the Regex):You can download the AddIn here: http://www.analystcave.com/excel-tools/excel-scrape-html-add/