What is the efficient way to search trough NSArray

192 Views Asked by At

I have NSStrings in my array:

i[0] = axxx
i[1] = axyz
i[2] = axxy
i[3] = abcd

I want to pass a search string to find all needed strings. For example if I pass "ax" then it will return 3 strings, if I pass "axx" then it will return 2 string.

Performance is critical here as well. The method should look like this:

- (NSArray *)searchString:(NSString *)search; 

Noramlly I use NSPredicate, but this time I need to use maybe Prefix Tree or Binary Tree, I am not sure but it should be faster. Any suggestion or links to implementation.

3

There are 3 best solutions below

6
On BEST ANSWER

Hope this solution will satisfy you.

- (NSArray *)searchString:(NSString *)search{

    NSIndexSet *indexes = [dataArray indexesOfObjectsPassingTest:
                           ^BOOL (id obj, NSUInteger i, BOOL *stop) {
                               NSString *myObj = obj;
                               return [myObj containsString:search];
                           }];
    NSArray *results = [dataArray objectsAtIndexes:indexes];

    return results;

}
1
On

There is essential information missing. If you look for "axx", do you expect "haxx" to be in your results? "HaXX"? "Axxyyyz"? "äxx"? How many strings do you have? 10? 100? 1000? 100,000? How often do you do this search? How often does the array change?

First step is to figure out which NSString method will match the strings you want to match. Second step is to implement using brute force and measuring (predicates are usually several times slower than looping through the array). Third step is to figure out whether sorting the data can help.

3
On

This is a fairly simple problem.

As Avi suggests in his comment, there's 2 parts to it: The method you use for matching, and the method you use for searching for those matches.

If your array is sorted and you're looking for a single, perfect match, you can use a binary search. I believe that will give you O(log(n)) performance. (Time goes up with the log of the number of elements.)

However, you're not looking for a single, perfect match. You're looking for partial matches. If they always must match the beginning of the string then you still might be able to use a binary search to find the first match, and then search linearly up and down in the array until the first non-match. That will give you slightly worse than O(log(n)) performance, but not as bad as O(n).

If you are matching your substring anywhere inside the entries I think you're going to have to test every element in the array. You will simply need to test every element, giving you O(n) performance.

Note that O(n) performance is generally considered good. It scales well for large data-sets. (You want to avoid O(n^2) performance. That's what kills you.)

The second part of the problem is matching speed. You can probably get slightly better performance than a predicate by writing your own string matching routine that is hard-coded for your exact match criteria, but the performance gains are likely to be modest. You would have to give more details about what constitutes a match in order for us to help with this part.