First off, I have no control over the text I am getting. Just wanted to put that out there so you know that I can't change the links.
The text I am trying to find links in using NSDataDetector
contains the following:
<h1>My main item</h1>
<img src="http://www.blah.com/My First Image Here.jpg">
<h2>Some extra data</h2>
The detection code I am using is this, but it will not find this link:
NSDataDetector *linkDetector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:nil];
NSArray *matches = [linkDetector matchesInString:myHTML options:0 range:NSMakeRange(0, [myHTML length])];
for (NSTextCheckingResult *match in matches)
{
if ([match resultType] == NSTextCheckingTypeLink)
{
NSURL *url = [match URL];
// does some stuff
}
}
Is this a bug with Apple's link detection here, where it can't detect links with spaces, or am I doing something wrong?
Does anyone have a more reliable way to detect links regardless of whether they have spaces or special characters or whatever in them?
You should not use NSDataDetector with HTML. It is intended for parsing normal text (entered by an user), not computer-generated data (in fact, it has many heuristics to actually make sure it does not detect computer-generated things which are probably not relevant to the user).
If your string is HTML, then you should use an HTML parsing library. There are a number of open-source kits to help you do that. Then just grab the href attributes of your anchors, or run NSDataDetector on the text nodes to find things not marked up without polluting the string with tags.