Is it possible to get all the text content of an element's children recursively in hpple? Is there a method for this in the TFHppleElement class?
Something like this in JavaScript:
document.getElementById("testdiv").textContent
I wanted something like this. Here is some quick boilerplate code; it is not an elegant solution, because it relies on a static contents variable. Please let me know how this can be improved :)
#pragma mark - Hpple XML parser

/* The document contains many nested div, table, span, style, etc. elements. */
- (NSString *) extractDefinition
{
    NSString *html = [self.webView stringByEvaluatingJavaScriptFromString: @"document.getElementById('innerframe').innerHTML"];
    if ([Resources stringIsEmpty:html]) {
        return nil;
    }
    return [self extractSubDiv:html];
}

- (NSString *)extractSubDiv:(NSString *)html
{
    TFHpple *hppleParser = [TFHpple hppleWithHTMLData:[html dataUsingEncoding:NSUTF8StringEncoding]];
    NSString * xpathQuery;
    /* Try the two-column layout first, then fall back to the single-column one. */
    xpathQuery = @"//div[@id='columnboth']";
    NSArray * defNodes = [hppleParser searchWithXPathQuery:xpathQuery];
    NSString * text = nil;
    if ([defNodes count] > 0) {
        TFHppleElement * element = [defNodes objectAtIndex:0];
        text = [self parseContents:element];
    } else {
        xpathQuery = @"//div[@id='columnsingle']";
        defNodes = [hppleParser searchWithXPathQuery:xpathQuery];
        if ([defNodes count] > 0) {
            TFHppleElement * element = [defNodes objectAtIndex:0];
            text = [self parseContents:element];
        }
    }
    return text;
}

- (NSString *) parseContents:(TFHppleElement *)element {
    NSArray * innhold = [element searchWithXPathQuery:@"//div[contains(@class,'articlecontents')]"];
    return [self getTextFromArray:innhold];
}
/* Shared buffer that getText: appends to while it recurses. */
static NSMutableString * contents;

- (NSString *) getTextFromArray:(NSArray *)hppleElements {
    NSMutableString * text = [[NSMutableString new] autorelease];
    contents = [[NSMutableString new] autorelease];
    for (TFHppleElement * e in hppleElements) {
        /* Clear the buffer so text from earlier elements is not appended again. */
        [contents setString:@""];
        [text appendFormat:@"%@ ", [self getText:e]];
    }
    return text;
}

/* There are more nested divs below this level, then spans holding the text. */
- (NSString *) getText:(TFHppleElement *)element
{
    if ([element isTextNode]) {
        [contents appendFormat:@" %@", element.content];
    }
    for (TFHppleElement * e in element.children) {
        [self getText:e];
    }
    return contents;
}
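For comparison, here is a minimal sketch of the same recursive walk that passes the accumulator as a parameter instead of keeping it in a static variable. The protocol name HTMLTextNode and the two function names are made up for illustration; the only things taken from the code above are the isTextNode, content and children members it already calls on TFHppleElement. Treat it as one possible approach rather than anything hpple provides.

```objc
#import <Foundation/Foundation.h>

// Hypothetical protocol: names the three members the code above already uses
// on TFHppleElement, so the walk below compiles with Foundation alone.
@protocol HTMLTextNode <NSObject>
- (BOOL)isTextNode;
- (NSString *)content;
- (NSArray *)children;
@end

// Depth-first walk: append the content of every text node under `node`,
// in document order, to `buffer`.
static void AppendTextContent(id<HTMLTextNode> node, NSMutableString *buffer) {
    if ([node isTextNode]) {
        NSString *text = [node content];
        if (text != nil) {
            [buffer appendString:text];
        }
    }
    for (id<HTMLTextNode> child in [node children]) {
        AppendTextContent(child, buffer);
    }
}

// Rough analogue of JavaScript's textContent for a single node.
static NSString *TextContentOfNode(id<HTMLTextNode> node) {
    NSMutableString *buffer = [NSMutableString string];
    AppendTextContent(node, buffer);
    return buffer;
}
```

To try this with hpple, it should be enough to declare that TFHppleElement conforms to the protocol through an empty category and then call TextContentOfNode(element), assuming your hpple version exposes those three members with these signatures. Unlike the static version, each call gets its own buffer, so nothing is shared between calls.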
I'm using this code to get all content of the news title
you can use