Internal working of Java Collections Framework

2.6k Views Asked by At

I've had this question for a very very long time.

Question is a bit long. Please bear with me :)

In Summary, how does a collection datastructure such as TreeSet know when the underlying data it stores gets modified and how does it manage such cases??

Example

//Simple person class with name data member
public static class Person {
    String name;

    public Person(String name) {
        this.name = name;
    }
}


1. Create a TreeSet and add 3 Person instances p1, p2, p3. (Comparator sorts name).

TreeSet<Person> set = new TreeSet<>(new Comparator<Person>() {
    @Override
    public int compare(Person person1, Person person2) {
        return person1.name.compareTo(person2.name);
    }
});

// Creating 3 Person instances and adding to set.
Person p1 = new Person("Zach"),
        p2 = new Person("Henry"),
        p3 = new Person("Adam");

// Adding to set
set.add(p1);  set.add(p2);  set.add(p3);


2. Printing the First Element (Prints the lexicographical smallest string in balanced BST)

// This will name of P3 instance, i.e. Adam (obvious and expected) 
System.out.println(set.first().name);
// "Adam" is printed which is expected.


3. Modifying the Person Instance P3 to have "zebra" name. i.e. Adam -> Zebra

p3.name = "Zebra";

System.out.println(set.first().name);


QUESTION
In Section 3, I modified p3 instance to hold "Zebra" instead of "Adam".
Question is, How did TreeSet know that the P3 instance has been modified???

TreeSet is built using some Balanced BST (RB Trees usually). Hence, when I change some data, It has to reorder the internal nodes of the Tree to maintain adherence with the rules of the Comparator.
So, how did the TreeSet get notified that the underlying data has been modified and that it has to reorder tree nodes again???

I really want to know how does it work internally. Is it observer pattern? Requesting a bit elaborate and comprehensive answers :)

1

There are 1 best solutions below

2
On

How does a collection datastructure such as TreeMap know when the underlying data it stores gets modified and how does it manage such cases?

It doesn't. For values it doesn't matter, but if you modify keys it can corrupt the whole collection. This is why you should prefer immutable keys for maps, or at least make sure they're not modified after being used as keys.

Note also that TreeSet is backed by a TreeMap, and HashSet is backed by a HashMap, so the same goes with them (with the set values being the map keys).