I have a dataframe that I split into a list of dataframes based on a categorical variable in the dataframe:
list <- split(mpg, mpg$manufacturer)
I want to filter the list to keep only the dataframes in which one of the categorical columns contains at least 5 unique values, and drop those with fewer than 5.
I have tried lapply and filter over the dataset, but that filters the rows of each dataframe rather than the list itself. I have also tried:
filteredlist <- lapply(list, function(x) length(unique(x$class) >= 5))
and am stumped.
Thanks, any help would be appreciated!
First let's take a look at how many unique classes there are:
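A minimal sketch of that count, assuming `mpg` is the dataset shipped with ggplot2 and using `mpg_list` as a stand-in name for the question's `list` (so that base R's `list()` is not masked):

```r
library(ggplot2)  # assumption: mpg is the ggplot2 example dataset

# Same split as in the question, under a name that does not mask base::list
mpg_list <- split(mpg, mpg$manufacturer)

# Number of distinct values of `class` within each manufacturer's dataframe
n_classes <- vapply(mpg_list, function(x) length(unique(x$class)), integer(1))
n_classes
```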
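So, with this data, `>= 5` isn't a great example because it will have 0 results. Let's do `>= 3` so we can expect a non-empty result.

Note that in your attempt `lapply(list, function(x) length(unique(x$class) >= 5))` you have a parentheses typo: you want `length(unique(...)) >= 5`, not `length(unique(...) >= 5)`. As written, the comparison happens inside `length()`, so each element comes back as the number of unique classes rather than `TRUE` or `FALSE`.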
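One way to filter the list itself is to build a logical vector with one entry per dataframe and use it to subset the list. This is a sketch under the same assumptions as the question (`mpg` from ggplot2), with the threshold lowered to 3; `mpg_list`, `keep`, and `filtered` are illustrative names, not anything from the original post:

```r
library(ggplot2)  # assumption: mpg is the ggplot2 example dataset

mpg_list <- split(mpg, mpg$manufacturer)

# TRUE for each dataframe whose `class` column has at least 3 distinct values
keep <- vapply(mpg_list, function(x) length(unique(x$class)) >= 3, logical(1))

# Subset the list (not the rows of each dataframe) with the logical vector
filtered <- mpg_list[keep]
names(filtered)
```

Base R's `Filter()` should give the same result in one step: `Filter(function(x) length(unique(x$class)) >= 3, mpg_list)`.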