Get table data from html using hpple

Question

710 Views Asked by Harg At 24 October 2014 at 10:35

I am trying to parse the following website so that I can display the data like this on iOS:

Saturday 6th September

Causeway
Bond's Glen Raceway
11:00am
RO
Two Day Meeting
Two Separate Days

An example of the website:

    <div id="main-column">
    <h1>September</h1>
    <table align="center"><col width="200"><col width="150"><col width="100"><col width="120"><col width="330"><col width="300">
        <h2>Saturday 06 September</h2>
        <tr id="table1">
            <td><b>Club</b></td>
            <td><b>Venue</b></td>
            <td><b>Start Time</b></td>
            <td><b>Meeting Type</b></td>
            <td><b>Number of Days for Meeting</b></td>
            <td><b>Notes</b></td>
        </tr>
        <tr id="table2">
            <td>Causeway</td>
            <td>Bond's Glen Raceway</td>
            <td>11:00am</td>
            <td>RO</td>
            <td>Two Day Meeting,<br> Two Separate Days</td>
            <td></td>
        </tr>
        <tr id="table3">
            <td>West Waterford</td>
            <td>Ballysaggart</td>
            <td>11:00am</td>
            <td>RO</td>
            <td>Two Day Meeting,<br> One Meeting Over Two Days</td>
            <td></td>
        </tr>

So far I have managed to get all of the dates with the following code:

    - (void)loadData {
        NSURL *url = [NSURL URLWithString:@"http://www.national-autograss.co.uk/september.htm"];
        NSData *htmlData = [NSData dataWithContentsOfURL:url];

        TFHpple *htmlParser = [TFHpple hppleWithHTMLData:htmlData];

        NSString *xpathQueryString = @"//h2";
        NSArray *eventNodes = [htmlParser searchWithXPathQuery:xpathQueryString];

        NSMutableArray *eventDates = [[NSMutableArray alloc] initWithCapacity:0];
        for (TFHppleElement *element in eventNodes) {
            NSString *date = [[element firstChild] content];
            [eventDates addObject:date];
        }

        _objects = eventDates;
        [self.tableView reloadData];
    }

Is the XPath query I need for the data in the table something like //table/tr/td? I tried this and immediately got an error about adding a nil object to an array.

Or would I be better off getting all of the tables as separate elements and then parsing each one individually for the data inside?

Any help, guides or ideas would be very much appreciated.

**cate** · Accepted Answer · 2014-10-28T12:06:15.287000

I recently gave this answer to an old hpple question.

Changing the URL to the autograss site and the query string to...

    NSString *queryString = @"//table";

...in order to get the closest ancestor of the required text-nodes gives this log output:

2014-10-28 11:52:02.416 SiteSearcher[28314:613] theText:

          Saturday 06 September

              Club
                Venue
                Start Time
                Meeting Type
                Number of Days for Meeting
                Notes


                Causeway
                Bond's Glen Raceway
                11:00am
                RO
                Two Day Meeting, Two Separate Days



                West Waterford
                Ballysaggart
                11:00am
                RO
                Two Day Meeting, One Meeting Over Two Days


            Sunday 07 September...

but also gives

        ...2014 Fixtures:
            January
            February
            March
            April
            May
            June


            2014 Fixtures Cont'd:
            July
            August
            September
            October
            November
            December


            Official Details:
            Regisitered Address:
                46 Brookside, Alconbury,
                Huntingdonshire, PE28 2EP.

...as it retrieves all the tables.

(Please excuse colouring - the log output tabs seem to mess up blockQuotes!).

I don't know whether having the text with all the clutter is that useful, but maybe it's a start. If, however, you wish to assign segments of the text to, say, array elements for a table view, then the recursion would need adapting.
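The recursive method from the earlier answer is not reproduced here, so the following is only a rough sketch of what such an adaptation might look like. `CollectText` is a hypothetical helper name, not part of hpple; it relies on the `isTextNode`, `content` and `children` members of `TFHppleElement`, and assumes that whitespace-only text nodes are of no interest.

```objective-c
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper (not part of hpple): walks an element depth-first and
// appends every non-empty, whitespace-trimmed text node to `texts`.
static void CollectText(TFHppleElement *element, NSMutableArray *texts) {
    if ([element isTextNode]) {
        NSString *trimmed = [[element content] stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        // Skip empty or nil text so that nothing nil is ever added to the array.
        if (trimmed.length > 0) {
            [texts addObject:trimmed];
        }
        return;
    }
    // Not a text node: recurse into whatever children the element has.
    for (TFHppleElement *child in [element children]) {
        CollectText(child, texts);
    }
}
```

Calling `CollectText(tableElement, texts)` for each element returned by the `//table` query should leave `texts` holding one string per piece of text, in the order it appears, ready to back a table view. It still includes text from the unrelated tables.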

Update

After looking at answers to this question, I realise that some tidying can be done by using a conditional query:

    NSString *xPathQueryString = @"//tr[not(@id='table1')]|//h2";

or

    NSString *xPathQueryString = @"//h2/text()|//tr[not(@id='table1')]//td/text()";

The first query pulls the element nodes, whereas the second pulls the text nodes themselves. The second, therefore, needs no recursive method to delve within tags but (as far as I can see) brings no further info, such as the parent tag.
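As an illustration of how the first (element-node) query might be used, here is a minimal sketch that groups each row under the date heading that precedes it. `GroupRowsByDate` and the `date` / `rows` dictionary keys are made up for this example. It assumes the nodes come back in document order and that every header row carries the id `table1`. Rows from the unrelated fixture and address tables that follow the last heading would end up under that last date, so they would still need filtering.

```objective-c
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper, not part of hpple: returns one dictionary per <h2> date,
// each holding the rows (arrays of cell strings) found after that heading.
static NSArray *GroupRowsByDate(NSData *htmlData) {
    NSCharacterSet *blank = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    TFHpple *parser = [TFHpple hppleWithHTMLData:htmlData];
    NSArray *nodes = [parser searchWithXPathQuery:@"//tr[not(@id='table1')]|//h2"];

    NSMutableArray *sections = [NSMutableArray array];
    NSMutableArray *currentRows = nil;

    for (TFHppleElement *node in nodes) {
        if ([[node tagName] isEqualToString:@"h2"]) {
            // A date heading starts a new section.
            NSString *date = [[node text] stringByTrimmingCharactersInSet:blank] ?: @"";
            currentRows = [NSMutableArray array];
            [sections addObject:@{ @"date" : date, @"rows" : currentRows }];
            continue;
        }
        if (currentRows == nil) {
            continue; // Row seen before any heading; skip it.
        }
        NSMutableArray *cells = [NSMutableArray array];
        for (TFHppleElement *cell in [node childrenWithTagName:@"td"]) {
            // A cell may hold several text nodes (split by <br>) or none at all.
            NSMutableArray *parts = [NSMutableArray array];
            for (TFHppleElement *child in [cell children]) {
                NSString *text = [[child content] stringByTrimmingCharactersInSet:blank];
                if ([child isTextNode] && text.length > 0) {
                    [parts addObject:text];
                }
            }
            // An empty cell becomes @"" rather than nil, so addObject: is safe.
            [cells addObject:[parts componentsJoinedByString:@" "]];
        }
        [currentRows addObject:cells];
    }
    return sections;
}
```

Each section could then back one table view section, with the date as its header and each row array supplying the club, venue, start time and so on.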
