R Problems with POSIXlt and mutate within a function

I'm working on a project with n data frames. Before I can work with them, some processing is necessary. One of these operations is to add columns for "year", "month" and "day" to each data frame. They also already have, and should keep, a "date" column in the format "yyyy-mm-dd".

dateiNamen<-dir(path="input_folder",pattern = "merged", all.files = T)

Auswertung<-function(Dateiname){
  setwd("input_folder")
  data<-read.delim(Dateiname, stringsAsFactors = FALSE)
  data$date<-as.Date(data$date)
  data<-subset(data, data$date>"1994-12-31")
  data<-data[,colMeans(is.na(data))<0.15]
  
  Ausgabedatei <- paste0("output_folder",Dateiname, "_bearbeitet",".csv")
  write.table(data, file = Ausgabedatei, row.names = F, col.names = T, sep=",")
  setwd("Project_folder")
}

fun<-lapply(dateiNamen, Auswertung)

That's the first step, and it works as it should. Afterwards, the "date" column (XX$date) of each of these data frames has the following structure:

head(XX$date)
[1] "1995-01-01" "1995-01-02" "1995-01-03" "1995-01-04" "1995-01-05" "1995-01-06"

str(XX$date)
 Date[1:9131], format: "1995-01-01" "1995-01-02" "1995-01-03" "1995-01-04" "1995-01-05" "1995-01-06" "1995-01-07" "1995-01-08" "1995-01-09" "1995-01-10" ...

class(XX$date)
[1] "Date"

dput(head(XX$date))
structure(c(9131, 9132, 9133, 9134, 9135, 9136), class = "Date")

Now I apply my second function to the data frames to create the columns for year, month and day in each data frame:

dateiNamen2<-dir(path="output_folder",pattern = "merged", all.files = T)

Spalten<-function(Dateiname){
  setwd("output_folder")
  data<-read.csv(Dateiname,header = TRUE, ";")
  
  
  data <- data %>%
    dplyr::mutate(year = lubridate::year(date),
                  month = lubridate::month(date),
                  day = lubridate::day(date))
  
  Ausgabedatei <- paste0("output_folder_2",Dateiname, "_bearbeitet",".csv")
  write.table(data, file = Ausgabedatei, row.names = F, col.names = T, sep = ",")
  setwd("Project_folder")
}

fun2<-lapply(dateiNamen2, Spalten)

When I apply this function, I get the following error:

**Error in as.POSIXlt.default(x, tz = tz(x)) : 
do not know how to convert 'x' to class “POSIXlt”**

I can't figure out what is causing this problem, because if I run the same commands on a single file, it works perfectly. Maybe one of you sees something I missed.

And yes, the functions could be "prettier", but they are the first functions I have ever written. I'll keep trying! :)

There is 1 best solution below

Can you try this function:

Spalten<-function(Dateiname){

  data <- read.csv2(Dateiname) #.....(1)
  data <- data %>%
            dplyr::mutate(date = as.Date(date),            #.....(2)
                          year = lubridate::year(date), 
                          month = lubridate::month(date),
                          day = lubridate::day(date))
  
  # basename() drops the "output_folder/" prefix that full.names = TRUE adds
  Ausgabedatei <- paste0("output_folder_2/", basename(Dateiname), "_bearbeitet", ".csv") #.....(3)
  write.table(data, file = Ausgabedatei, row.names = FALSE, col.names = TRUE, sep = ",")
}

dateiNamen2 <- list.files(path="output_folder",pattern = "merged", 
                          all.files = TRUE, full.names = TRUE) #.....(4)
fun2 <- lapply(dateiNamen2, Spalten)

  1. read.csv2 is used because its default is sep = ";"; header = TRUE is the default for the read.csv family, so it does not need to be passed.
  2. I don't think the date column is read in as class "Date" automatically unless the class is specified explicitly in read.csv2, so it is safer to convert it with as.Date first.
  3. If "output_folder_2" is the name of a folder, it needs a trailing "/" to separate the folder name from the file name.
  4. list.files is used with full.names = TRUE instead of dir, so that it returns the full path of each file and setwd is not needed.
  5. Wouldn't it be better to combine what Auswertung and Spalten do into a single function, so that each file is read and written only once instead of separately in each function?
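
A minimal sketch of what point 5 could look like, under a few assumptions: the helper name process_file, the temporary folders and the demo data are made up for illustration, and base R format() stands in for lubridate::year(), month() and day() so that the example runs without extra packages. It also checks that a "date" column really exists after reading. Auswertung writes its files with sep = "," while Spalten reads them with ";". If the intermediate files are in fact comma-separated, that read would give a single column, and inside mutate() the name date would then probably refer to base R's date() function rather than a column, which would fit the error above.

```r
# Sketch only: process_file, the folder layout and the demo data are
# hypothetical; format() is used here instead of lubridate.
process_file <- function(path, out_dir) {
  # Read the raw tab-separated input once.
  data <- read.delim(path, stringsAsFactors = FALSE)

  # Fail early if the column is missing, e.g. because of a wrong separator.
  stopifnot("date" %in% names(data))

  # Same filtering as in Auswertung: keep rows after 1994 and
  # columns with less than 15 % missing values.
  data$date <- as.Date(data$date)
  data <- data[!is.na(data$date) & data$date > as.Date("1994-12-31"), , drop = FALSE]
  data <- data[, colMeans(is.na(data)) < 0.15, drop = FALSE]

  # Same columns as in Spalten, derived from the Date column.
  data$year  <- as.integer(format(data$date, "%Y"))
  data$month <- as.integer(format(data$date, "%m"))
  data$day   <- as.integer(format(data$date, "%d"))

  # Build the output path without setwd(); write the result once.
  out_file <- file.path(out_dir, paste0(basename(path), "_bearbeitet.csv"))
  write.table(data, file = out_file, row.names = FALSE, col.names = TRUE, sep = ",")
  invisible(out_file)
}

# Usage with a small temporary tab-separated file (made-up data).
in_dir  <- file.path(tempdir(), "input_folder")
out_dir <- file.path(tempdir(), "output_folder")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)

demo <- data.frame(date  = c("1994-12-30", "1995-01-01", "1995-02-15"),
                   value = c(1.5, 2.5, 3.5))
write.table(demo, file.path(in_dir, "merged_demo.txt"), sep = "\t",
            row.names = FALSE, quote = FALSE)

files   <- list.files(in_dir, pattern = "merged", full.names = TRUE)
written <- vapply(files, process_file, character(1), out_dir = out_dir)

# Read the result back with the same separator it was written with.
result <- read.csv(written[[1]])
str(result)
```

Because the file is read and written only once, the separator used for writing never has to match a later read inside the pipeline, and the stopifnot() check turns a missing column into an immediate, readable failure.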