This site has lots of questions on how to fix an "undefined column" error.
I have the exact opposite question: how to make an "undefined column" error.
I frequently change variable names in my files.
This leads to the following error:
r$> df <- data.frame(gender=c(1,1,NA,0))
r$> sum(is.na(df$male))
[1] 0
when the correct result is 1.
I want R to raise an error if the column I'm trying to access is undefined, rather than fail silently.
How can I do that?
Unfortunately, R is rather too lenient when it comes to such matters. The `$` operator for data.frames is defined to allow accessing non-existent columns and to return `NULL` in that case.

There are alternative data.frame implementations which are a bit stricter. Notably, the `tbl_df` data structure used by the Tidyverse packages ‘tibble’, ‘dplyr’, etc. will at least show you a warning when you access a column that does not exist.

Alternatively, you can make this a hard error for data.frames by overriding `$` for data.frames, that is, by defining your own `$.data.frame` method.

However, note that this will only apply to plain `data.frame`, not to tibbles, since the latter also override `$`. There does not seem to be an option to make this a hard error for tibbles (short of making all warnings into errors); this might be a nice feature request for the package. Alternatively, you can make the same override apply to tibbles by defining it for `'tbl_df'` instead of `'data.frame'`.
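
A minimal sketch of such an override for plain data.frames, assuming the method is defined in the global environment. The error message text and the use of `.subset2()` are illustrative choices, not necessarily what the answer originally showed:

```r
# Hypothetical override: make `$` on a plain data.frame fail for unknown columns.
# Note that this also turns off partial matching of column names.
`$.data.frame` <- function(x, name) {
  if (!name %in% names(x)) {
    stop("undefined column: ", name, call. = FALSE)
  }
  # .subset2() extracts the column without dispatching on the class again.
  .subset2(x, name)
}

df <- data.frame(gender = c(1, 1, NA, 0))
sum(is.na(df$gender))   # the existing column still works

# The misspelled column now raises an error instead of returning NULL.
msg <- tryCatch(df$male, error = function(e) conditionMessage(e))
print(msg)
```

Be aware that an override like this is global to the session, so it may also affect other code that relies on `$` returning `NULL` for a missing column.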