Query yellowpages.com to return street addresses

732 Views Asked by At

I am trying to take a list of names and zip codes from Excel, input them sequentially one name and zip code at a time into the search fields at www.yellowpages.com and return street address results to Excel in the same sequence as the original names and zip codes. There is no error message returned, it just stops without finishing. I'm not sure where it stops but it does open internet explorer, enter search terms and click search because I can see that when .visible = True. My best guess is between the "".

Here is my code (adapted from DontFretBrett and Dinesh Kumar Takyar):

Sub Address_Scrape()
    Dim eRow As Long
    Dim ele As Object
    Dim wb As Workbook
    Dim srch As Worksheet
    Dim trgt As Worksheet
    Set wb = ThisWorkbook
    Set srch = wb.Sheets("Master with addresses")
    Set trgt = wb.Sheets("Sheet1")
    Dim url As String
    Dim zc As String
    Dim Name As String

Name = srch.Range("B2")
zc = srch.Range("F2")
url = "URL;http://www.yellowpages.com/"
url = url & "/" & zc & "/" & Name
RowCount = 1
trgt.Range("A" & RowCount) = "Name"
trgt.Range("B" & RowCount) = "Address"
trgt.Range("C" & RowCount) = "City"
trgt.Range("D" & RowCount) = "State"
trgt.Range("E" & RowCount) = "Zip"
eRow = Sheet1.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
Set objIE = CreateObject("InternetExplorer.Application")
    With objIE
    .navigate "http://www.yellowpages.com/"
    .Visible = True
    Do While .Busy Or _
    .readyState <> 4
    DoEvents
    Loop
Set who = .document.getElementsByName("search_terms")
who.Item(0).Value = Name
Set where = .document.getElementsByName("geo_location_terms")
where.Item(0).Value = zc
.document.forms(0).submit
    Do While .Busy Or _
    .readyState <> 4
    DoEvents
    Loop
"Results = .document.getElementsByTagName("p")(0).innerText"
    For Each ele In .document.all
        Select Case ele.tagName
        Case Results
        RowCount = RowCount + 1
        Case "Name"
        trgt.Range("A" & RowCount) = ele.getElementByclass("business-name").innerText
        Case "Address"
        trgt.Range("B" & RowCount) = ele.getElementByclass("street-address").innerText
        Case "City"
        trgt.Range("C" & RowCount) = Trim(ele.getElementByclass("locality").innerText)
        Case "State"
        trgt.Range("D" & RowCount) = ele.getElementByitemprop("addressRegion").innerText
        Case "Zip"
        trgt.Range("E" & RowCount) = ele.getElementByitemprop("postalCode").innerText
        End Select
    Next ele
Set objIE = Nothing
End With
End Sub
1

There are 1 best solutions below

4
On

You want to basically scrape the data from a yellow pages search.

I made a useful Excel add-in some time ago to do such finds without resorting to VBA: http://blog.tkacprow.pl/excel-scrape-html-add/

Let us start from the beginning the GET URL structure is:

http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]

Where [SEARCH_TERM] and [LOCATION] are your GET parameters.

Now, say using the functions in the add-in you would want to get the text of an element with a class name "business-name" use this function:

=GetElementByRegex("http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]"; "class=""business-name""[^<>]*?>((?:.|\n)*?)<[^<>]*?/")

No VBA just regular expressions. Simply replace the GET parameters with your own. Of course in case of different element the regex may vary - but it is still more simple than writing VBA from scratch.

Hopes this helps.