(R) [] / subset() returns an empty data frame


I have a large dataset that looks something like this with a few hundred thousand more entries, saved as data:

        Group1  dtm_Flight_Date       Departure  Arrival  str_Fare_Category_Ident
    1   8P104   06/11/2010 9:05       YYJ        YVR      B
    2   8P104   06/11/2010 9:05       YYJ        YVR      K
    3   8P104   06/11/2010 9:05       YYJ        YVR      L
    4   8P104   06/11/2010 9:05       YYJ        YVR      N
    5   8P104   06/11/2010 9:05       YYJ        YVR      Q
    6   8P104   06/11/2010 9:05       YYJ        YVR      Y
    7   8P104   6/14/2010 9:05:00 AM  YYJ        YVR      B
    8   8P104   6/14/2010 9:05:00 AM  YYJ        YVR      K
    9   8P104   6/14/2010 9:05:00 AM  YYJ        YVR      L
    10  8P104   6/14/2010 9:05:00 AM  YYJ        YVR      N

What I want to do is subset the data on 'str_Fare_Category_Ident', keeping only the rows where it equals Y. This should be a simple task, and one I've done before, but I am having some trouble.

I have tried

     public_bc <- data[data[, 5]=="Y", ]

but this just returns an empty data frame. Also tried:

     public_bc <- data[data$str_Fare_Category_Ident=="Y", ]

Same problem.

I tried to use subset(), but also to no avail:

    public_bc <- subset(data, data[, 5]=="Y")

Also returns an empty data frame.

str_Fare_Category_Ident is currently a factor, but I've also tried converting it with as.character(), with no change.

Best answer

This can happen if the values have leading or trailing spaces: "Y  " is not equal to "Y", so the comparison is FALSE for every row. Remove the spaces and it should work.

    library(stringr)
    # str_trim() strips whitespace from both ends; the column becomes character
    data[, 5] <- str_trim(data[, 5])

Or

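    # Base R: delete whitespace at the start or end of each value
    data[, 5] <- gsub('^\\s+|\\s+$', '', data[, 5])
    # The exact comparison now finds the Y rows
    data[data[, 5] == 'Y', ]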
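
To check whether stray whitespace really is the cause before changing anything, it can help to print the factor levels with quotes. The following is only a minimal sketch on made-up data: the data frame `df` and its padded values are assumptions for illustration, not the asker's actual file. It also uses base R's `trimws()` in place of the two trimming calls above.

```r
# Toy data (hypothetical) with padded fare codes, mimicking the symptom
df <- data.frame(
  Group1 = rep("8P104", 4),
  str_Fare_Category_Ident = factor(c("B   ", "K  ", "Y  ", " Y"))
)

# Exact comparison finds nothing, because "Y  " and " Y" are not "Y"
n_before <- nrow(df[df$str_Fare_Category_Ident == "Y", ])

# Printing the levels with quotes makes the stray spaces visible
print(levels(df$str_Fare_Category_Ident), quote = TRUE)

# trimws() strips leading and trailing whitespace (base R, no packages needed)
df$str_Fare_Category_Ident <- trimws(as.character(df$str_Fare_Category_Ident))

# The same comparison now returns the Y rows
public_bc <- df[df$str_Fare_Category_Ident == "Y", ]
n_after <- nrow(public_bc)
```

If the data comes from a delimited text file, passing `strip.white = TRUE` to `read.csv()` or `read.table()` may avoid the problem at import time, since it strips whitespace around unquoted character fields.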

Another option, which leaves the spaces in place, is to use grep() to match Y as a whole word:

    data[grep('\\bY\\b', data[, 5]), ]
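
Note that the word-boundary pattern matches any value that contains Y as a separate word, not only values that are Y plus padding. As a hedged sketch on made-up values (the vector `codes` is hypothetical), an anchored pattern with `grepl()` is stricter and returns a logical vector that can be combined with other conditions:

```r
# Hypothetical values: padded Y, a multi-word value, and a doubled letter
codes <- c("B   ", "Y  ", " Y", "Y B", "YY")

# Word-boundary match: also picks up "Y B", but not "YY"
word_hits <- grep("\\bY\\b", codes)

# Anchored match: only values that are exactly Y surrounded by optional whitespace
strict <- grepl("^[[:space:]]*Y[[:space:]]*$", codes)

# A logical vector can be used directly for row selection, e.g. data[strict, ]
codes[strict]
```

Which of the two is appropriate depends on whether values such as "Y B" can occur in the column at all.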