R - function to extract value according to rank


Suppose I have a data frame that I have ordered according to rate such that it now looks something like this:

Name    Rate  
A        10     
D        11     
C        11     
E        12     
B        13     
F        14     

I am trying to write a function that takes a rank value as an argument (e.g. rank = 2) and outputs the corresponding names, such that if there are ties in ranks, it would output the name that comes first alphabetically.

In this case, the data should look something like this:

Name    Rate  Rank
A        10     1
C        11     2
D        11     3
E        12     4
B        13     5
F        14     6

so that rank = 2 would output "C" (not "D") and rank = 5 would output "B".

Suppose the function's rank input is called "num". This is what I've tried:

    rankName <- df[!is.na(df[,2]),]
    rankName <- sort(rankName[,2],) #sorting according to Rate
    rank<-seq(1,length(rankName),by=1) #creating a sequence for rank
    rankName <- cbind(rankHosp,rank) #combining rankName & rank seq.
    comp <- rankName[rankName[,3]==num,] #finding rate value where rank = num
    rankName <- rankName[rankName[,2]==comp,] #finding rows where rates are
                                              #equal at that rank
    rankName<-rankName$Name #extracting by Name

        if (length(rankName)>1){
                rankName <- sort(rankName)
                rankName <- rankName[1]
        }

I'm getting the following error:

Error in `[.data.frame`(rankName, , 3) : undefined columns selected 

I'm assuming that, regardless of my error, there's a significantly simpler way to accomplish this, but I haven't been able to figure it out.

Any advice is appreciated. Thank you!


There is 1 best solution below


One way of doing this is to use base::rank() and then break the ties with the grouping functionality provided by packages like dplyr:

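    df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Name    Rate
    A        10
    D        11
    C        11
    E        12
    B        13
    F        14")

    # "min" gives every row in a tie the lowest rank of that tie (D and C both get 2)
    df$rnk <- rank(df$Rate, na.last = TRUE, ties.method = "min")
    df

    library(dplyr)
    # Within each tie, rank(Name) is 1 for the alphabetically first name,
    # so rnk + rank(Name) - 1 breaks the tie alphabetically
    finaldf <- df %>%
      group_by(rnk) %>%
      mutate(Rank = rnk + rank(Name, ties.method = "first") - 1) %>%
      ungroup() %>%
      as.data.frame() %>%
      select(Name, Rate, Rank)

    finaldf

First, rnk is created with ties.method = "min", so tied rates share the lowest rank in the tie (2 for both D and C). Grouping by rnk and adding rank(Name) - 1 within each group then breaks the tie alphabetically, giving C rank 2 and D rank 3. Note that rank(Name), not order(Name), is what gives each name's position within its group, and that flooring an "average" rank only recovers the lowest rank when a tie has exactly two rows.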
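
For the lookup function the question asks for, a base R sketch may be enough: sorting by Rate and then by Name makes the row position equal to the tie-broken rank. The helper name `name_at_rank` and its arguments are hypothetical, chosen only for illustration, and returning NA for an out-of-range rank is an assumption rather than something the question specifies.

```r
# Hypothetical helper: name and arguments are illustrative, not from the question
name_at_rank <- function(df, num) {
  # Drop rows with a missing Rate, as the question's code does
  df <- df[!is.na(df$Rate), ]
  # Sort by Rate, breaking ties alphabetically by Name
  df <- df[order(df$Rate, as.character(df$Name)), ]
  # Assumption: return NA when the requested rank is out of range
  if (num < 1 || num > nrow(df)) return(NA_character_)
  as.character(df$Name[num])
}

df <- data.frame(
  Name = c("A", "D", "C", "E", "B", "F"),
  Rate = c(10, 11, 11, 12, 13, 14),
  stringsAsFactors = FALSE
)

name_at_rank(df, 2)  # "C"
name_at_rank(df, 5)  # "B"
```

Because order() accepts several sort keys, no explicit Rank column or grouping step is needed when only a single name has to be returned.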