I am trying to parse the following website so that I can display the data like this on iOS:
Saturday 6th September
Causeway
Bond's Glen Raceway
11:00am
RO
Two Day Meeting
Two Separate Days
An example of the website:
<div id="main-column">
<h1>September</h1>
<table align="center"><col width="200"><col width="150"><col width="100"><col width="120"><col width="330"><col width="300">
<h2>Saturday 06 September</h2>
<tr id="table1">
<td><b>Club</b></td>
<td><b>Venue</b></td>
<td><b>Start Time</b></td>
<td><b>Meeting Type</b></td>
<td><b>Number of Days for Meeting</b></td>
<td><b>Notes</b></td>
</tr>
<tr id="table2">
<td>Causeway</td>
<td>Bond's Glen Raceway</td>
<td>11:00am</td>
<td>RO</td>
<td>Two Day Meeting,<br> Two Separate Days</td>
<td></td>
</tr>
<tr id="table3">
<td>West Waterford</td>
<td>Ballysaggart</td>
<td>11:00am</td>
<td>RO</td>
<td>Two Day Meeting,<br> One Meeting Over Two Days</td>
<td></td>
</tr>
So far I have managed to get all of the dates with the following code:
- (void)loadData {
    NSURL *url = [NSURL URLWithString:@"http://www.national-autograss.co.uk/september.htm"];
    NSData *htmlData = [NSData dataWithContentsOfURL:url];
    TFHpple *htmlParser = [TFHpple hppleWithHTMLData:htmlData];

    // Each <h2> on the page holds one event date.
    NSString *xpathQueryString = @"//h2";
    NSArray *eventNodes = [htmlParser searchWithXPathQuery:xpathQueryString];

    NSMutableArray *eventDates = [[NSMutableArray alloc] initWithCapacity:0];
    for (TFHppleElement *element in eventNodes) {
        // content is nil when there is no text child; adding nil to an array throws.
        NSString *date = [[element firstChild] content];
        if (date != nil) {
            [eventDates addObject:date];
        }
    }

    _objects = eventDates;
    [self.tableView reloadData];
}
Is the XPath query I need for the data in the table something like //table/tr/td? I tried this and immediately got an error about adding a nil object to an array.
Or would I be better off getting all of the tables as separate elements and then parsing each one individually for the data inside?
Any help, guides or ideas would be very much appreciated.
I recently gave this answer to an old hpple question.
Changing the URL to the autograss site and the query string to...
...in order to get the closest ancestor of the required text-nodes gives this log output:
but also gives
...as it retrieves all the tables.
(Please excuse colouring - the log output tabs seem to mess up blockQuotes!).
I don't know whether having the text with all the clutter is that useful, but maybe it's a start. If, however, you wish to assign segments of the text to, say, array elements for some TableView, then the recursion would need adapting.
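As a rough illustration of that kind of adaptation (a sketch only, not the code from my earlier answer), the recursion could collect the text beneath each td into one string and group the strings by row. The helper names CollectText and RowsFromParser are made up for this example, the //tr query is my own choice, and I am assuming hpple reports text nodes with the tag name "text":

```objective-c
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper: walks an element's subtree and gathers non-empty text.
// Assumes hpple exposes text nodes with tagName "text".
static void CollectText(TFHppleElement *element, NSMutableArray *pieces) {
    NSString *content = [element content];
    if ([[element tagName] isEqualToString:@"text"] && content != nil) {
        NSString *trimmed = [content stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        if (trimmed.length > 0) {
            [pieces addObject:trimmed];
        }
    }
    for (TFHppleElement *child in [element children]) {
        CollectText(child, pieces);
    }
}

// Hypothetical helper: returns one array of cell strings per <tr>.
// An empty <td></td> becomes @"" rather than nil, so nothing nil reaches an array.
static NSArray *RowsFromParser(TFHpple *parser) {
    NSMutableArray *rows = [NSMutableArray array];
    for (TFHppleElement *row in [parser searchWithXPathQuery:@"//tr"]) {
        NSMutableArray *cells = [NSMutableArray array];
        for (TFHppleElement *cell in [row children]) {
            if (![[cell tagName] isEqualToString:@"td"]) {
                continue; // skip whitespace text nodes between cells
            }
            NSMutableArray *pieces = [NSMutableArray array];
            CollectText(cell, pieces);
            [cells addObject:[pieces componentsJoinedByString:@" "]];
        }
        [rows addObject:cells];
    }
    return rows;
}
```

Each inner array can then back one table-view row. Text either side of a br inside a cell ends up joined with a space, and the header row comes back as just another row, so it would need skipping separately.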
Update
After looking at answers to this question, I realise that some tidying can be done by using a conditional query:
or
The first query pulls the element-nodes whereas the second pulls the text nodes themselves. The second, therefore, needs no recursive method to delve within tags but (as far as I can see) brings no further info such as parent tag.
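To illustrate the text-node style of query, here is a minimal sketch. The query //td//text() is an illustrative stand-in of my own rather than either of the conditional queries referred to above, and CellTexts is a hypothetical helper name:

```objective-c
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper: returns the trimmed text of every text node under a <td>.
// No recursion is needed because the query returns the text nodes directly.
static NSArray *CellTexts(NSData *htmlData) {
    TFHpple *parser = [TFHpple hppleWithHTMLData:htmlData];
    NSMutableArray *texts = [NSMutableArray array];
    for (TFHppleElement *node in [parser searchWithXPathQuery:@"//td//text()"]) {
        NSString *content = [node content];
        if (content == nil) {
            continue; // guard: never add nil to the array
        }
        NSString *trimmed = [content stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        if (trimmed.length > 0) {
            [texts addObject:trimmed];
        }
    }
    return texts;
}
```

The trade-off is the one described above: the result is a flat list of strings, so an empty td contributes nothing and there is no indication of which row or column a string came from.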