TreeSet Comparator


I have a TreeSet and a custom comparator. I get the values from the server according to the changes in the stock.

For example: if time=0, the server sends all the entries in the stock (unsorted); if time=200, the server sends the entries added or deleted after time 200 (unsorted).

On the client side I am sorting the entries. My question is: which is more efficient?

1. fetch all entries first and then call the addAll method, or
2. add them one by one?

There can be millions of entries; a rough sketch of the two options follows.
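A minimal, self-contained sketch of the two options (plain Integers stand in for the real entries here, and the server fetch is simulated with a shuffled list, so this is only an illustration):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class LoadOptionsSketch {
  public static void main(String[] args) {
    // Pretend this is the unsorted batch received from the server.
    List<Integer> received = new ArrayList<Integer>();
    for (int i = 0; i < 1000; i++) received.add(i);
    Collections.shuffle(received);

    // Option 1: buffer everything first, then one bulk insert.
    TreeSet<Integer> bulk = new TreeSet<Integer>();
    bulk.addAll(received);

    // Option 2: insert each entry as it arrives.
    TreeSet<Integer> oneByOne = new TreeSet<Integer>();
    for (Integer entry : received) {
      oneByOne.add(entry);
    }
  }
}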

/////////updated///////////////////////////////////

import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

private static Map<Integer, KeywordInfo> hashMap = new HashMap<Integer, KeywordInfo>();

// The comparator must be declared before sortedSet; otherwise the TreeSet
// constructor would contain an illegal forward reference to it.
private static final Comparator<Integer> comparator = new Comparator<Integer>() {
  public int compare(Integer o1, Integer o2) {
    int integerCompareValue = o1.compareTo(o2);
    if (integerCompareValue == 0) return integerCompareValue;
    KeywordInfo k1 = hashMap.get(o1);
    KeywordInfo k2 = hashMap.get(o2);
    if (null == k1.getKeyword()) {
      if (null == k2.getKeyword())
        return integerCompareValue;
      else
        return -1;
    } else {
      if (null == k2.getKeyword())
        return 1;
      else {
        int compareString = AlphaNumericCmp.COMPARATOR.compare(
            k1.getKeyword().toLowerCase(), k2.getKeyword().toLowerCase());
        //int compareString = k1.getKeyword().compareTo(k2.getKeyword());
        if (compareString == 0)
          return integerCompareValue;
        return compareString;
      }
    }
  }
};

private static Set<Integer> sortedSet = new TreeSet<Integer>(comparator);

Now there is an event handler that gives me an ArrayList of updated entries. After adding them to my hashMap I am calling:

final Map<Integer, KeywordInfo> mapToReturn = new SubMap<Integer, KeywordInfo>(sortedSet, hashMap);
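For completeness, a hypothetical sketch of that handler (onUpdate() and KeywordInfo.getId() are assumed names, not shown in the question). The keywords have to be in hashMap before their ids go into sortedSet, because the comparator looks them up there during insertion:

// Hypothetical sketch; onUpdate() and getId() are assumed names.
public static void onUpdate(ArrayList<KeywordInfo> updatedEntries) {
  // Put the keywords into hashMap first, so the comparator can resolve them.
  for (KeywordInfo info : updatedEntries) {
    hashMap.put(info.getId(), info);
  }
  // Only then insert the ids into the sorted set.
  for (KeywordInfo info : updatedEntries) {
    sortedSet.add(info.getId());
  }
  // then build the view as above:
  // final Map<Integer, KeywordInfo> mapToReturn = new SubMap<Integer, KeywordInfo>(sortedSet, hashMap);
}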

There are 3 best solutions below

Answer 1:

The actual implementation is a red-black tree (TreeSet is backed by a TreeMap), so adding the entries one by one will be faster if you do it right. And I don't think this behaviour will change in the near future.

For your problem a stateful comparator may help.

// snippet: a stateful comparator that can temporarily switch off the real ordering
import java.util.Comparator;

public class NaturalComparator<A> implements Comparator<A> {
    private boolean anarchy = false;
    private final Comparator<A> parentComparator;

    NaturalComparator(Comparator<A> parent) {
        this.parentComparator = parent;
    }

    public void setAnarchy(boolean anarchy) {
        this.anarchy = anarchy;
    }

    public int compare(A a, A b) {
        if (anarchy) return 1; // skip the expensive comparison during the bulk load
        else return parentComparator.compare(a, b);
    }
}
...
NaturalComparator<Integer> naturalComparator = new NaturalComparator<Integer>(comparator);
Set<Integer> sortedSet = new TreeSet<Integer>(naturalComparator);
naturalComparator.setAnarchy(true);
sortedSet.addAll(entries); // the fetched entries
naturalComparator.setAnarchy(false);
Answer 2:

I think your bottleneck is probably more network-related than CPU-related. A bulk operation fetching all the new entries at once would be more network-efficient.

With regard to CPU, the time required to populate a TreeSet does not change significantly between multiple add()s and a single addAll(). The reason is that TreeSet relies on AbstractCollection's addAll() (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/util/AbstractCollection.java#AbstractCollection.addAll%28java.util.Collection%29), which in turn creates an iterator and calls add() multiple times.
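For reference, AbstractCollection.addAll() is essentially just this loop (paraphrased from the linked source):

// Roughly what AbstractCollection.addAll() does: iterate and call add() per element.
public boolean addAll(Collection<? extends E> c) {
    boolean modified = false;
    for (E e : c)
        if (add(e))
            modified = true;
    return modified;
}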

So, my advice on the CPU side is: choose the way that keeps your code cleaner and more readable. That is probably addAll().

Answer 3:

In general there is less memory overhead when data is stored as it is being loaded, instead of buffering it all first. This should be time-efficient too, perhaps using small buffers; memory allocation costs time as well.

However, time both solutions in a separate prototype. You really have to test with huge numbers, since network traffic costs a lot too. That is a bit of test-driven development, and it adds to QA both quantitative statistics and confidence in the correctness of the implementation.
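A minimal single-JVM timing sketch along those lines (no network involved here, and a real test would use a proper benchmark harness and realistic entry counts):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class TreeSetLoadTiming {
  public static void main(String[] args) {
    List<Integer> entries = new ArrayList<Integer>();
    for (int i = 0; i < 1000000; i++) entries.add(i);
    Collections.shuffle(entries); // simulate unsorted server data

    long start = System.nanoTime();
    TreeSet<Integer> bulk = new TreeSet<Integer>();
    bulk.addAll(entries);
    System.out.println("addAll:     " + (System.nanoTime() - start) / 1000000 + " ms");

    start = System.nanoTime();
    TreeSet<Integer> oneByOne = new TreeSet<Integer>();
    for (Integer e : entries) {
      oneByOne.add(e);
    }
    System.out.println("one by one: " + (System.nanoTime() - start) / 1000000 + " ms");
  }
}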