How to filter rows by a set of names you already have?


I have a tibble similar to:

tibble(
  x = c("christmas", "christmas", "car", "dog"),
  y = c("one", "two", "three", "four")
)

and then I have another tibble like:

tibble(
  x = c("christmas", "dog")
)

Notice that "christmas" appears twice in the first tibble. I want to use the second tibble's column to keep only the matching rows of the first, so the output is:

tibble(
  x = c("christmas", "christmas", "dog"),
  y = c("one", "two", "four")
)

There are 2 best solutions below


Try this base R solution using %in% and indexing:

#Code
df1[df1$x %in% df2$x,]

Output:

# A tibble: 3 x 2
  x         y    
  <chr>     <chr>
1 christmas one  
2 christmas two  
3 dog       four 

Some data used:

#Data 1
df1 <- structure(list(x = c("christmas", "christmas", "car", "dog"), 
    y = c("one", "two", "three", "four")), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"))

#Data 2
df2 <- structure(list(x = c("christmas", "dog")), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame"))
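
For readers already using dplyr, the same %in% test can be sketched inside filter(). This is only an illustration, assuming df1 and df2 hold the question's data; the names kept and dropped are made up for this example.

```r
library(dplyr)  # dplyr also re-exports tibble()

# The question's data
df1 <- tibble(
  x = c("christmas", "christmas", "car", "dog"),
  y = c("one", "two", "three", "four")
)
df2 <- tibble(x = c("christmas", "dog"))

# Keep the rows of df1 whose x value appears in df2$x
kept <- df1 %>% filter(x %in% df2$x)

# The complement: rows whose x value does not appear in df2$x
dropped <- df1 %>% filter(!(x %in% df2$x))
```

Negating the condition gives the rows that were left out, which is handy for checking that nothing was dropped by mistake.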

If you are comfortable with SQL terminology, an inner join on the shared column x keeps the same rows:

> library(dplyr)
> df1 %>% inner_join(df2)
Joining, by = "x"
# A tibble: 3 x 2
  x         y    
  <chr>     <chr>
1 christmas one  
2 christmas two  
3 dog       four 
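
One thing to watch with an inner join: if the lookup table repeated a name, each matching row of df1 would come back once per repeat. dplyr's semi_join() is a filtering join that only keeps or drops rows of the first table, so it may be a closer fit for this question. A minimal sketch, where df2_dup is a hypothetical lookup table with a duplicated name (it is not part of the question):

```r
library(dplyr)  # dplyr also re-exports tibble()

df1 <- tibble(
  x = c("christmas", "christmas", "car", "dog"),
  y = c("one", "two", "three", "four")
)

# Hypothetical lookup table in which "christmas" is listed twice
df2_dup <- tibble(x = c("christmas", "christmas", "dog"))

# semi_join() keeps each row of df1 at most once, however many
# times its key appears in df2_dup, and adds no columns
filtered <- semi_join(df1, df2_dup, by = "x")

# anti_join() returns the rows of df1 that have no match
unmatched <- anti_join(df1, df2_dup, by = "x")
```

Passing by = "x" explicitly also avoids the "Joining, by" message shown above.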

Using base R, merge() performs the same inner join on the shared column and returns a plain data frame:

> merge(df1, df2)
          x    y
1 christmas  one
2 christmas  two
3       dog four