I have below-mentioned dataframe in R:
dput(df)
structure(list(CustID = c("C-1", "C-2",
"C-2", "C-2", "C-3", "C-3",
"C-3", "C-4", "C-5"), DATE = c("2021-01-02 14:13:10", "2021-01-02 13:17:07", "2021-01-02 14:15:10", "2021-01-02 16:14:08", "2021-01-02 17:11:03", "2021-01-02 12:14:24", "2021-01-02 12:33:34", "2021-01-02 10:43:55", "2021-01-03 20:23:35"), TYPE = c("Demo",
"Pro", "Pro", "Pro", "Pro", "Pro", "Pro", "Pro", "Pro"
), LogCat = c(NA, "SPR,DET,RTD", "SPR,DET,RTD", "SPR,DET,RTD",
"DET", "DET", "DET", NA, " SPR, RTD "), PriceCode = c(NA,"KR", "SR", "DE", "KL", "ZT", "KR", "KR", "KR")), class = "data.frame", row.names = c(NA,
-9L))
CusID Date Type LogCat PriceCode
C-1 2021-01-02 14:13:10 Demo
C-2 2021-01-02 13:17:07 Pro SPR, DET, RTD KR
C-2 2021-01-02 14:15:10 Pro SPR, DET, RTD SR
C-2 2021-01-02 16:14:08 Pro SPR, DET, RTD DE
C-3 2021-01-02 17:11:03 Pro DET KL
C-3 2021-01-02 12:14:24 Pro DET ZT
C-3 2021-01-02 12:33:34 Pro DET KR
C-4 2021-01-02 10:43:55 Pro KR
C-5 2021-01-03 20:23:35 Pro SPR, RTD KR
I'm using the following code which give me below-mentioned output.
library(tidyverse)
library(janitor)
df %>% mutate(DATE = as.Date(DATE)) %>% select(1:3) %>%
unique() %>%
tabyl(TYPE, DATE) %>%
adorn_totals("row") %>%
adorn_percentages("col") %>%
adorn_pct_formatting(2) %>%
adorn_ns("front")
TYPE 2021-01-02 2021-01-03
Demo 1 (25.00%) 0 (0.00%)
Pro 3 (75.00%) 1 (100.00%)
Total 4 (100.00%) 1 (100.00%)
In the above mentioned output, I need the distinct sub count based on column LogCat and PriceCode. For Example, in the give total distinct count for particular Type I need to get the number how many distinct CusID has LogCat and PriceCode. Also, how many distinct count doesn't have LogCat and PriceCode.
Where, For each type we need to check the following condition.
LogCat- Where an uniqueCusIDhas only value inLogCatcolum andPriceCodeis either NA or BlankPriceCode- Where an uniqueCusIDhas only value inPriceCodecolum andLogCatis either NA or BlankBoth True- Where an uniqueCusIDhas value in bothLogCatandPriceCodecolumn.Both False- Where an uniqueCusIDhas no value in bothLogCatandPriceCodecolumn.
The percentage for these values should be calculated by taking the distinct count of a particular Type for which we are calculating these values. For Example, below Demo if we are calculating these value for Demo type then the denominator should be 1 for first date and 0 for the second date.
Required Dataframe<-
TYPE 2021-01-02 2021-01-03
Demo 1 (25.00%) 0 (0.00%)
LogCat 0 (0.00%) 0 (0.00%)
PriceCode 0 (0.00%) 0 (0.00%)
Both True 0 (0.00%) 0 (0.00%)
Both False 1 (100.00%) 0 (0.00%)
Pro 3 (75.00%) 1 (100.00%)
LogCat 0 (0.00%) 0 (0.00%)
PriceCode 1 (33.33%) 0 (0.00%)
Both True 2 (66.66%) 1 (100.00%)
Both False 0 (0.00%) 0 (0.00%)
Total 4 (100.00%) 1 (100.00%)

Here is what I understanding from your OP - quite complicated logics ;)
Created on 2021-04-18 by the reprex package (v2.0.0)
And here is a way to generate some pivot style table in RStudio
This will output in the
Viewtab inRStudiothis table