I have to carry out a challenge that involves the elaboration of an algorithm to compute the Pareto (set) boundary. The statement is basically:
Given a set S of n points in the square [0,1] x [0,1], make an algorithm to determine the subset P contained in S, formed by the non-dominated points of S.
It is also said that it is easy to elaborate an algorithm of the order n*n point comparisons that accomplish this purpose. Well I came up with an algorithm by researching here and there. The challenge is still to implement an algorithm of the order n*log(n). How do I get the order of these algorithms?
Thanks in advance!
#data
set.seed(103)
x = runif(200)
y = runif(200)
#algorithm
pareto = 1:length(x)
for(i in 1:length(x)){
cond1 = y[i]!=min(y[which(x==x[i])])
cond2 = x[i]!=min(x[which(y==y[i])])
for(k in 1:length(x)){
if((x[i]>x[k] & y[i]>y[k]) | (x[i]==x[k] & cond1) | (y[i]==y[k] & cond2)){
pareto[i] = NA
break
}
}
}
xPareto = x[pareto[!is.na(pareto)]]
yPareto = y[pareto[!is.na(pareto)]]
#graphic:
plot(x, y)
points(xPareto, yPareto, col=2, pch=16)
dat = data.frame(x=xPareto, y=yPareto)
dat = dat[order(dat$x),]
for(i in 1:(nrow(dat)-1)){
segments(dat$x[i], dat$y[i], dat$x[i+1], dat$y[i+1], col=2, lty=2)
}

The intuition behind the efficient greedy solution to this problem lies in the fact that a point
iis dominated by pointjiffx[i] > x[j]andy[i] > y[j], which implies thatjmust come beforeiwhen the points are ordered by either coordinate. Hence, if we traverse the points in increasing order of their x-coordinates, then the pointj(if any) that dominates pointimust have been traversed before pointiis traversed. In other words, it is impossible for the dominating pointjto come after the dominated pointiin this ordering.Thus, with this traversal order the domination problem (i.e. checking if a point is dominated by some other point) boils down to checking if we have already seen a point with a lower y-coordinate as the traversal order already enforces the x-coordinate condition. This can easily be done by checking each point's y-coordinate to the lowest (minimum) y-coordinate we have seen so far -- if the minimum y-coordinate is less than the current point
i's y-coordinate then the pointjwith the minimum y-coordinate dominatesiasx[j] < x[i]becausejwas seen beforei.Sorting by the x-coordinate takes
O(n log n)time and checking each point (while maintaining the partial minimum y-coordinate) takesO(n)time, making the entire algorithm takeO(n log n)time.