Can someone please explain the solution to "Almost sorted intervals" to me?

3.3k Views Asked by At

Here is the problem, and here is a solution.

First part is simple enough. It's the second part that I don't get, no matter how hard I try. You basically have two sets of intervals and need to find all intersections where one interval is not completely inside another.

I have been staring at problem setter code until my eyes started to bleed. Still cannot understand this bit:

for(int L=n;L>=1;L--) {
   FOR(it, r[L]) add(*it, -1);
   add(L, 1);
   r[left[L]].push_back(L);
   ans += ask(right[L]-1);
}

How does it work? What's the algorithm?

It is noted in the editorial that you can solve this using interval tree or "Binary Indexed Tree". I understand more or less what an Interval tree is and how it can be useful. But the problem setter obviously does not use that, and "Binary Indexed Tree" does not show up in search, instead there is "Binary Indexed Tree", which deals with partial sums and I fail to see how it is relevant (it may very well be that it is relevant, but I don't understand how).

Any help? Pointers to literature I need to read?

2

There are 2 best solutions below

0
On BEST ANSWER

OK, got it. It's a binary indexed tree - http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binaryIndexedTrees. And yes, it is relevant.

0
On

One of the problems given on the site HackerRank.com is to count the number of almost sorted intervals in a permutation of N numbers whose values range from 1 to N.

An interval of an array is defined as any contiguous non empty subset of numbers. For example, if the array is defined as { 3, 4, 1, 5, 2 } then valid intervals would include { 5 }, { 1, 5 }, { 3, 4, 1}.

An almost sorted interval of an array is any interval as described above plus the requirement that the first number is less then or equal to all other numbers in the interval and the last number is greater than or equal to all other numbers.

Using the array from above, the set of almost sorted intervals would include { 3, 4 }, {1, 5}, { 5 }, but would not include { 5, 2 }.

So the entire set of almost sorted intervals would be-

{ { 3 }, { 3, 4 }, { 4 }, { 1 }, { 1, 5 }, { 5 }, { 2 } }

The number of almost sorted intervals is therefore 7.

To meet the challenge, your solution must solve the problem in O(n * log n) time. An O(n * n) solution is rather trivial. The O(n * log n) takes more effort.

I found the problem quite difficult with my original O(n * log n) being quite messy and felt there was a better way. Searching the web really didn't help much except for some folks giving some terrible hints that really didn't help much. When I finally peeked at the "editorial" section of HackerRank, the description of a more elegant solution was hard to read. After some effort, I finally came to understand how the solution works.

Define two arrays to help solve the problem of finding all almost sorted intervals in an array:

left[i] = j where j < i and j is closest to i and a[j] > a[i].

right[i] = j where j > i and j is closest to i and a[j] < a[i].

These arrays help define when two indices i and j make up an almost sorted interval. For some i and j, a[i..j] is an almost sorted interval if j < right[i] and i > left[j].

For the array a[] = { 3, 4, 1, 5, 2 }, left[] = { -1, -1, 1, -1, 3 } and right[] = { 2, 2, 5, 4, 5 }. Note that we use -1 for the left array to express the out of bounds position to the left and the value 5 (i.e. N) to express the out of bounds position to the right.

Let us consider the interval a[i..j], where i=2 and j=3. {1, 5} is the interval. We see the following,

- j < right[i], or 3 < right[2], or 3 < 5 - i > left[j], or 2 > left[3], or 2 > -1

Which means that this interval is an almost sorted interval. Another example is, a[2] = 1; a[1] = 4. So, left[2] = 1.

One interesting thing to note is that once we have defined left[] and right[] we have "encoded" the numbers and there relationship to each other allowing us now to solve the problem only dealing with the indices of the arrays and no longer the numbers that make up the array.

The left[] and right[] arrays can both be computed in O(n) time.

We can then use these arrays to efficiently count the total number of almost sorted intervals. We iterate from right to left across the indices. While we iterate over the array, we can keep a set B made up of all possible end indices for all intervals that start at or to the left of the current index.

This can be done by adding the index value i to the set B at index i and removing index value i at index left[i] (which will always be some index to the left of i). Maintaining the set B can be done in O(1) time.

For each index, we can then check how many indices in the B set would be a valid end index if the current index is the start index.

At index i, an index j would be in the set B only if i > left[j]. The interval { a[i] ... a[j] } is an almost sorted interval if j < right[i]. We can count how many indices in the set B are less then right[i] to know how many almost sorted intervals index position i contributes to the total number of almost sorted intervals (as a left index position of the interval). If we then accumulate these values across all indices we can find the total number of almost sorted intervals.

This only leaves how we can count the number of indices in B which are less then right[i] efficiently. This can be done using the Binary Index tree in O(log n) time.

Thus the total runtime will be O(n * log n).