Display mean / median / max instead of no. of occurrences


I am using R and the bupaR package to do process analysis. Suppose my data, stored in a CSV file, looks like this:

STATUS;timestamp;CASEID
created;16-02-2023 09:46:32;1
accepted;13-04-2023 23:59:59;1
created;16-02-2023 09:46:32;2
accepted;13-04-2023 23:59:59;2
created;14-12-2022 13:17:54;3
accepted;02-01-2023 23:59:59;3
created;28-02-2023 19:37:01;4
accepted;03-03-2023 23:59:59;4
created;02-01-2023 07:45:43;5
created;24-01-2022 16:05:58;6
accepted;03-02-2022 23:59:59;6
created;24-01-2022 15:52:53;7
accepted;03-02-2022 23:59:59;7
created;15-08-2022 12:54:23;8
rejected;18-08-2022 23:59:59;8
created;21-03-2022 15:32:05;9
accepted;26-04-2022 23:59:59;9
created;21-03-2022 15:42:39;10

Now when I run the following code I get the process map:

library(bupaR)
library(processmapR)
library(edeaR)

datafile <- read.csv(file = "pathtofile\\testfile.csv", header = TRUE, sep = ";")
datafile$timestampcolumn <- as.POSIXct(datafile$timestamp, format = "%d-%m-%Y %H:%M:%S")

print(datafile)
mytest <- simple_eventlog(datafile, case_id = "CASEID", activity_id = "STATUS", timestamp = "timestampcolumn")

process_map(mytest, type = frequency("absolute"))

[image: process map]

and also the matrix:

mytest %>%
  precedence_matrix(type = "absolute") %>%
  plot

[image: precedence matrix] (I don't know why 9 is displayed for start -> created; it should be 10.)

Now I would like to have, for example, the mean displayed on the edges instead of the number of occurrences. The following output shows the desired process map:

[image: process map with mean values]

and matrix:

[image: precedence matrix with mean values]

I tried the following code (according to this post):

mytest %>% 
  process_map(type_nodes = frequency(value = "absolute_case"), type_edges = performance(FUN = mean, units = "days")) %>%
  plot

or (according to this post)

mytest %>% 
  process_map(performance(mean, "days"), 
                         type_nodes = performance(median, "days"), 
                         sec_nodes = frequency("relative"),
                         type_edges = performance(median, "days"), 
                         sec_edges = frequency("relative")) %>%
  plot

But I get an error message:

Error in xy.coords(x, y, xlabel, ylabel, log)

So what is the correct code for this? I need mean, median and maximum.

Answer by Gert Janssenswillen:

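Your second piece of code is correct; you just have to drop the "%>% plot" part. process_map() already renders the map itself, so piping its result into plot() is what triggers the xy.coords error.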

mytest %>% 
  process_map(performance(mean, "days"), 
                         type_nodes = performance(median, "days"), 
                         sec_nodes = frequency("relative"),
                         type_edges = performance(median, "days"), 
                         sec_edges = frequency("relative"))

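This code shows the median performance (in days) as the primary label on both edges and nodes, and the relative frequency as the secondary label. The first argument, performance(mean, "days"), should have no effect here, because type_nodes and type_edges are both set explicitly.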
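To sanity-check the numbers on the created -> accepted edge, the same statistics can be computed by hand in base R. This is only a rough sketch: it inlines the sample data from the question, parses the timestamps as UTC (an assumption; the question uses the local time zone), and assumes the edge value is simply the time elapsed between the two events of a case. The names events, pairs, days and edge_stats are made up for this illustration.

```r
# Sample data from the question, inlined so the sketch runs on its own.
events <- read.csv(text = "STATUS;timestamp;CASEID
created;16-02-2023 09:46:32;1
accepted;13-04-2023 23:59:59;1
created;16-02-2023 09:46:32;2
accepted;13-04-2023 23:59:59;2
created;14-12-2022 13:17:54;3
accepted;02-01-2023 23:59:59;3
created;28-02-2023 19:37:01;4
accepted;03-03-2023 23:59:59;4
created;02-01-2023 07:45:43;5
created;24-01-2022 16:05:58;6
accepted;03-02-2022 23:59:59;6
created;24-01-2022 15:52:53;7
accepted;03-02-2022 23:59:59;7
created;15-08-2022 12:54:23;8
rejected;18-08-2022 23:59:59;8
created;21-03-2022 15:32:05;9
accepted;26-04-2022 23:59:59;9
created;21-03-2022 15:42:39;10", sep = ";", stringsAsFactors = FALSE)

# Assumption: UTC, to keep the arithmetic free of daylight-saving shifts.
events$ts <- as.POSIXct(events$timestamp, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")

# One row per case that has both a 'created' and an 'accepted' event.
created  <- events[events$STATUS == "created",  c("CASEID", "ts")]
accepted <- events[events$STATUS == "accepted", c("CASEID", "ts")]
pairs <- merge(created, accepted, by = "CASEID",
               suffixes = c("_created", "_accepted"))

# Elapsed time in days between 'created' and 'accepted', per case.
days <- as.numeric(difftime(pairs$ts_accepted, pairs$ts_created, units = "days"))

edge_stats <- c(mean = mean(days), median = median(days), max = max(days))
print(round(edge_stats, 2))
```

Cases without an accepted event (5, 8 and 10) drop out of the merge, so only the seven created -> accepted transitions contribute to the statistics.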

Note that, since nodes and edges use the same settings here, the process_map call can also be written more concisely:

mytest %>% 
  process_map(type = performance(median, "days"), 
              sec = frequency("relative"))
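Since the question also asks for the mean and the maximum, the same pattern should carry over by swapping the function passed to performance(). The sketch below rebuilds the event log from the question's sample data so that it runs on its own; the names map_mean, map_max and max_fun are made up for this illustration. Wrapping max in an ordinary function is a defensive assumption on my part (max is a primitive in R, and I have not checked how performance() handles primitives); passing max directly may work just as well.

```r
library(bupaR)
library(processmapR)

# Sample data from the question, inlined so the sketch runs on its own.
datafile <- read.csv(text = "STATUS;timestamp;CASEID
created;16-02-2023 09:46:32;1
accepted;13-04-2023 23:59:59;1
created;16-02-2023 09:46:32;2
accepted;13-04-2023 23:59:59;2
created;14-12-2022 13:17:54;3
accepted;02-01-2023 23:59:59;3
created;28-02-2023 19:37:01;4
accepted;03-03-2023 23:59:59;4
created;02-01-2023 07:45:43;5
created;24-01-2022 16:05:58;6
accepted;03-02-2022 23:59:59;6
created;24-01-2022 15:52:53;7
accepted;03-02-2022 23:59:59;7
created;15-08-2022 12:54:23;8
rejected;18-08-2022 23:59:59;8
created;21-03-2022 15:32:05;9
accepted;26-04-2022 23:59:59;9
created;21-03-2022 15:42:39;10", sep = ";")
datafile$timestampcolumn <- as.POSIXct(datafile$timestamp,
                                       format = "%d-%m-%Y %H:%M:%S")

# Same event log construction as in the question.
mytest <- simple_eventlog(datafile, case_id = "CASEID",
                          activity_id = "STATUS",
                          timestamp = "timestampcolumn")

# Mean duration in days as primary label, relative frequency as secondary.
map_mean <- process_map(mytest,
                        type = performance(mean, "days"),
                        sec = frequency("relative"))

# Maximum duration in days; max_fun is a hypothetical helper (see note above).
max_fun <- function(x, ...) max(x, ...)
map_max <- process_map(mytest,
                       type = performance(max_fun, "days"),
                       sec = frequency("relative"))

# Printing the objects displays the maps; no plot() call is needed.
map_mean
map_max
```

Each call produces a separate map, so the mean, median and maximum would be shown in three maps rather than combined in one.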