Arrange complex datatype for fast partial match comparison


I am writing a Java program that compares two datasets whose items are all of the same type. The type is a class containing Strings, ints and a String[]. Let's call this class Foo and the datasets a and b. For each item in a, I need to find the item in b that matches it most closely.

My problem is speed. I have outlined below, in pseudo-code, what I do right now. Every item in a is compared with every item in b, so the running time grows with the product of the two dataset sizes, and my datasets are growing quickly. If anyone could point me in the direction of a better solution, I would greatly appreciate it. I am aware that sorting the arrays would speed things up vastly for plain String or int comparisons, but since my datatype is more complex, I don't see how that could work here.

Foo[] a = new Foo[...];
Foo[] b = new Foo[...];
for (Foo item_a : a) {
   Foo bestItem = null;   // closest item in b found so far
   double bestMatch = 0;  // its score
   for (Foo item_b : b) {
      double match = compareFoo(item_a, item_b);
      if (match > bestMatch) {
          bestMatch = match;
          bestItem = item_b;
      }
   }
   // Do stuff with bestItem and bestMatch: display, save etc.
}

private double compareFoo(Foo item_a, Foo item_b) {
   // Compare every field of item_a and item_b,
   // return a value between 0 (no match) and 1 (identical)
}
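
To make the pseudo-code concrete, here is a minimal runnable sketch of the same brute-force approach. The field names (name, count, tags), the per-field similarity measures and the equal weighting are all hypothetical, chosen only for illustration; the point is just to show a compareFoo that returns a value between 0 and 1 and a loop that keeps the best-scoring item.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FooMatcher {

    // Hypothetical stand-in for Foo: one String, one int and a String[].
    static final class Foo {
        final String name;
        final int count;
        final String[] tags;

        Foo(String name, int count, String[] tags) {
            this.name = name;
            this.count = count;
            this.tags = tags;
        }
    }

    // Returns a score between 0 (no match) and 1 (identical).
    // Assumption: each field contributes equally to the score.
    static double compareFoo(Foo x, Foo y) {
        // String field: exact equality only (an assumed, simplistic measure).
        double nameScore = x.name.equals(y.name) ? 1.0 : 0.0;

        // int field: relative closeness, 1.0 when equal.
        double sum = Math.abs((double) x.count) + Math.abs((double) y.count);
        double countScore = sum == 0.0
                ? 1.0
                : 1.0 - Math.abs((double) x.count - y.count) / sum;

        // String[] field: Jaccard overlap of the two arrays treated as sets.
        Set<String> union = new HashSet<>(Arrays.asList(x.tags));
        Set<String> common = new HashSet<>(union);
        Set<String> other = new HashSet<>(Arrays.asList(y.tags));
        union.addAll(other);
        common.retainAll(other);
        double tagScore = union.isEmpty() ? 1.0 : (double) common.size() / union.size();

        return (nameScore + countScore + tagScore) / 3.0;
    }

    // Brute force: scans every candidate, so matching all of a against b
    // costs a.length * b.length calls to compareFoo.
    static Foo bestMatch(Foo item, Foo[] candidates) {
        Foo best = null;
        double bestScore = -1.0;
        for (Foo candidate : candidates) {
            double score = compareFoo(item, candidate);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Foo[] a = { new Foo("alpha", 10, new String[] {"x", "y"}) };
        Foo[] b = {
            new Foo("alpha", 12, new String[] {"x"}),
            new Foo("beta", 10, new String[] {"x", "y"})
        };
        for (Foo itemA : a) {
            Foo best = bestMatch(itemA, b);
            System.out.println(itemA.name + " -> " + best.name + " (" + best.count + ")");
        }
    }
}
```

Because every field feeds into one combined score, there is no single sort key for b, which is exactly why I don't see how to apply sorting here.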