Including for loop in subsets

81 Views Asked by At

I want to calculate the weekly averages of the the load for two ships with containers. One ship sails at Sunday and the other on Wednesday. I have a big excel file with bookings.I will load up a small part of this file in the following link: https://docs.google.com/spreadsheets/d/1BxHTClTkrQzIzZzG5vXXnvKtV0_az83PGJ2ghBaAQr0/edit?usp=sharing

The first ship gets the containers that should be delivered on Monday (Mo), Tuesday(Di) and Wednesday (Mi). The second ship should deliver the containers demanded in the other port for Thursday (Do), Friday (Fr), Saturday(Sa) and Sunday(So). The data contains information about the containers from 2017-01-01 till 2018-07-31. These are 82 full weeks. I would like to make a vector with length 82, with each number the amount of containers of the days combined for that week. For example, the first number of the vector should be the demand of containers for Monday, Tuesday and Wednesday in the first week. So, I want to create a vector, one per ship, that contains information about the amount of containers that should be loaded on this ship. A vector of 82 weeks, to see which weeks we had low demand and the mean etc.

Can anyone please help me?

Here is the beginning of my code:

 containers <- "https://docs.google.com/spreadsheets/d/1BxHTClTkrQzIzZzG5vXXnvKtV0_az83PGJ2ghBaAQr0/edit?usp=sharing"

#Containers between Rotterdam and Duisburg
containersRTMDUI <- subset(containers, containers$Laadhaven == "Rotterdam" & containers$Loshaven == "Duisburg")

#I used to do this in subsets, because I could not make a loop
Week1 <- subset(containersRTMDUI, containersRTMDUI$Datum1 >= "2017-01-02" & 
containersRTMDUI$Datum1 < "2017-01-09" & containersRTMDUI$Dag1 = "Mo" & 
containersRTMDUI$Dag1 = "Di" &containersRTMDUI$Dag1 = "Mi")
Week2 <- subset(etc..)

Of course, the hard point comes by the fact that for some days there is no demand.

1

There are 1 best solutions below

4
On

I think I got it. One approach with data.table:

# read in data as a data.table
    library(data.table)
    dt <- data.table(read.csv("path/to/file", stringsAsFactors = F))

# rename variables to english (
# there are shorter ways to do this, but I like to keep track)
    setnames(dt, old = "ISO",          new = "containter_type")
    setnames(dt, old = "F.E",          new = "full_empty")
    setnames(dt, old = "Gewicht",      new = "weight")
    setnames(dt, old = "Laadhaven",    new = "pickup_port")
    setnames(dt, old = "Laadterminal", new = "pickup_terminal")
    setnames(dt, old = "Loshaven",     new = "dropoff_port")
    setnames(dt, old = "Losterminal",  new = "dropoff_terminal")
    setnames(dt, old = "Datum1",       new = "pickup_date")
    setnames(dt, old = "Dag1",         new = "pickup_dow")
    setnames(dt, old = "Datum2",       new = "dropoff_date")
    setnames(dt, old = "Dag2",         new = "dropoff_dow")

# convert date variable to date-type (instead of factor/string)
    dt[ , pickup_date  := as.Date(pickup_date,  "%d.%m.%Y")]
    dt[ , dropoff_date := as.Date(dropoff_date, "%d.%m.%Y")]

# create a week variable
    dt[ , week := lubridate::week(pickup_date)]

# create a variable (MTW) by day-of-week
# MTW=1 for mon, tues, wed; MTW=0 for thurs, fri, sat, sun
    dt[ , MTW := pickup_dow %in% c("Mo", "Di", "Mi")]

# count the number of rows by week and MTW
    result <- dt[ , .(nrows = .N), by=.(week, MTW)]

# print result
    result

# fill in 0 weeks
    dt2 <- data.table(week = rep(1:7, each=2), MTW = rep(c(T,F), each=7))
    result <- merge(result, dt2, by=c("week", "MTW"), all=T)
    result[is.na(nrows), nrows := 0]

# print updated result
    result