Spotfire: Calculate timestamp difference between two dates during working days and hours (8-17)

667 Views Asked by At

I have tried to use datediff however it only gives the hour difference between two dates. I would like to receive only the hours that are during the working days and working hours. Example provided below:

Start Date: 19/08/2022 09:42:13

End Date: 22/08/2022 09:54:22

Outcome I receive from Datediff("HH", [Start Date], [End Date]) = 72.20

The actual outcome I would like would be approx. = 8.13

2

There are 2 best solutions below

3
Gaia Paolini On

OK so you mean something like NetWorkHours rather than days. (I don't understand your comment about medians.)

Assuming you have a data table with a column for the start date and a column for the end date (and these are of DateTime data type) you could create a TERR expression that takes the two dates as input and produces a number of worked hours as output.

I have used this answer as a start: How to calculate networkdays minus holidays between 2 dates

but that does not cover hours. So here is my suggested solution. The idea is to initially remove the start and end day (which are incomplete days) and calculate the number of whole days minus weekends as in the previous solution. Then simply multiply it by the working hours in a day. Then take the first and last day and calculate the hours worked. Then add the two together.

Create a TERR expression function (from menu Data > Data Function Properties > Expression Function)

#start and end of working hours
startShift=8 
endShift=17
#fill vector with holiday dates if required. Example:
holidayDates <- c(as.Date('18/04/2022',format='%d/%m/%Y'),as.Date('29/08/2022',format='%d/%m/%Y'))

#count complete days excluding holidays and weekends
allDays = seq.Date(from=as.Date(input1)+1, to=as.Date(input2)-1, by=1)
nonHolidays = as.Date(setdiff(allDays, holidayDates), origin="1970-01-01")
weekends =nonHolidays[weekdays(nonHolidays) %in% c("Saturday", "Sunday")]
nonHolidaysWeekends = as.Date(setdiff(nonHolidays, weekends), origin="1970-01-01")
hoursCompleteDays = length(nonHolidaysWeekends) *(endShift-startShift)

#count worked hours for first and last day
beginTime = as.POSIXlt(input1)
beginHour = beginTime$hour + beginTime$min/60
endTime = as.POSIXlt(input2)
endHour = endTime$hour + endTime$min/60
hoursFirstAndLastDay = (endShift-beginHour)+(endHour-startShift)

#add together
output = hoursCompleteDays + hoursFirstAndLastDay   

call the TERR expression function e.g. TERR_netWorkingHours. This will give you the total hours worked.

Use it by creating a calculated column as:

TERR_netWorkingHours([startDate],[endDate])

where [startDate] and [endDate] are your original columns.

7
Gaia Paolini On

There are two main reasons why my previous answer did not work. Firstly, I tried to adapt an existing answer that worked for full days, but there are too many boundary cases when expanding it to fractional days (e.g. what if my start or end dates are the same, what if they fall over a weekend etc...). Secondly, a TERR expression function expects to work with vectorized inputs, which does not really work with a scenario with so many exceptions on the values of the inputs.

I think it works now (at least for my examples) if I create a TERR data function instead, which outputs a whole new table. I used the R library data.table to make it a bit more efficient. I heavily modified the algorithm to vectorize the steps into a temporary data table (schedule_df). There may be a more clever way, but I did not find it.

You might be able to just output a column by modifying I/O.

Here it is, hope it helps:

suppressWarnings(suppressPackageStartupMessages(library(data.table)))
setDT(dt)

######## main function
netWorkingHours = function(input1, input2) {
  #Helper function
  extractHour = function(x) {
    x = as.POSIXlt(x)
    return (x$hour + x$min/60)
  }
  #prepare ---
  dotimes=FALSE
  #start and end of working hours
  startShift=8 
  endShift=17
  weekend = c('Saturday','Sunday')
  
  #process
  input1d  = as.Date(input1)
  input2d  = as.Date(input2)
  #list all days including start and end
  allDays = seq.Date(from=input1d, to=input2d, by=1)
  Ndays=length(allDays)
  #flag included days: if they are not weekends
  #can be expanded to holidays
  include=ifelse(weekdays(allDays) %in% c('Saturday','Sunday'),0,1)
  
  #start building schedule
  schedule_df=data.table(day=allDays,include=include)
  schedule_df$index=c(1:Ndays)
  #identify boundary days
  schedule_df[,boundary:=ifelse(index==1 | index==Ndays,index,0)]
  
  #initialize working hours
  schedule_df$start=startShift
  schedule_df$end=endShift
  #modify start and end hours for boundary days
  
  schedule_df[boundary==1 & max(boundary)>1, start :=extractHour(input1)]
  schedule_df[boundary==1 & max(boundary)>1, start :=extractHour(input1)]
  schedule_df[boundary==1 & max(boundary)>1, end :=endShift]
  schedule_df[boundary==1 & max(boundary)==1, start :=extractHour(input1)]
  schedule_df[boundary==1 & max(boundary)==1, end :=extractHour(input2)]
  schedule_df[boundary>1 , start :=startShift]
  schedule_df[boundary>1 , end :=extractHour(input2)]
  
  #shift start and end hours by shift limits
  schedule_df[,start1:=sapply(start,function(x) max(startShift,x))]
  schedule_df[,end1  :=sapply(end,function(x) min(endShift,x))]
  
  #calculate worked hours for each day
  schedule_df$worked_hours=0
  schedule_df[include==1,worked_hours:=ifelse(end1>start1,end1-start1,0)]
  Nincluded = nrow(schedule_df[include==1])
  output = ifelse(Nincluded>0,sum(schedule_df[include==1,'worked_hours']),0)
  return (output)
}

######################## main

dt[,workedHours:= mapply(netWorkingHours,dt[['date1']],dt[['date2']])]