How to eliminate data (years) from csv using R

I am working on a CSV document using RStudio. I have data from 2008 to 2018, but I want to eliminate 2008, 2009 and 2018 from the data frame to see how the results change. I tried the following code but it's not working:

base.final <- merge(x=base.final, y=base.dividendos, by = c('Ativo',"Data"), 
                    all.x = TRUE, all.y = FALSE)

table(is.na(base.final$SETORES_NEFIN))/12

#eliminating years from the data base
base.final <- base.final[base.final$Data!="2008","2009", "2018",]

Error that comes up: [.data.frame(base.final, base.final$Data != "2008", "2009", : unused argument (alist())

This is a database of stocks, and I need to remove some tickers from it as well. Which code should I use? I'm working with dplyr.

I tried changing the last comma but it didn't work. I am still learning the language.

There are 1 best solutions below

As per TheN's comment, you must wrap the years you are testing against in a vector with c() for this to work. Furthermore, != is a comparison operator that compares two vectors element by element (recycling the shorter one), which is not what you want here. What you want to check is whether each value in Data appears anywhere in the vector of years, and that is what %in% does.

The difference between ==/!= and %in% can be a bit challenging to understand, but https://stackoverflow.com/a/42637469/4574789 explains it excellently.
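
To make the difference concrete, here is a minimal sketch in base R. The vectors are made up for illustration (they are not the asker's data), and it assumes the years are stored as character strings:

```r
# Hypothetical year column, stored as character strings
Data <- c("2008", "2009", "2010", "2018")
drop_years <- c("2008", "2009", "2018")

# != compares element by element, recycling the shorter vector when needed.
# R warns here because length 4 is not a multiple of length 3.
elementwise <- suppressWarnings(Data != drop_years)
Data[elementwise]   # keeps "2010" AND "2018", which is not what we want

# %in% asks, for each element of Data, whether it occurs anywhere in drop_years
keep <- !(Data %in% drop_years)
Data[keep]          # keeps only "2010"
```

The element-wise version only happens to drop 2008 and 2009 because they sit in the same positions in both vectors; 2018 slips through because it is compared against the recycled "2008".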

It helps to include a reproducible example when asking for help, so that those who answer can write code that has a chance of working in your use case.

I have NO idea what your data contains or the structure of it, but suppose:

base.final <- data.frame(SETORES_NEFIN = c("Guaraná", 
                                           "Rubberduck", 
                                           "Mexico cartel", 
                                           "Sugartown"),
                     Data = c("2008", "2009", "2010", "2018"))

# A subset of the data excluding some years
subset(base.final, !Data %in% c("2010", "2008"))
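
Since the question mentions dplyr and also asks about removing tickers, the same idea could be sketched with filter(). This is only an illustration under assumptions: the data frame below is made up, Ativo is assumed to hold the ticker (it is one of the merge keys in the question), Data is assumed to hold the year as a string, and the ticker names and the helper vectors years_to_drop and tickers_to_drop are placeholders.

```r
library(dplyr)

# Hypothetical data: Ativo holds the ticker, Data holds the year as a string
base.final <- data.frame(
  Ativo = c("AAA3", "BBB4", "AAA3", "CCC3", "BBB4"),
  Data  = c("2008", "2009", "2010", "2011", "2018"),
  stringsAsFactors = FALSE
)

years_to_drop   <- c("2008", "2009", "2018")
tickers_to_drop <- c("CCC3")  # placeholder ticker name

# Conditions separated by commas are combined with AND:
# keep rows whose year is not excluded and whose ticker is not excluded
base.filtered <- base.final %>%
  filter(!Data %in% years_to_drop,
         !Ativo %in% tickers_to_drop)
```

If Data is actually a Date column rather than a year string, the year can be extracted first, for example by testing format(Data, "%Y") against years_to_drop instead of Data itself.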