I need to get MTD Unique users count each day for last 1 year. Ex. if my data is like

dt  customername
1   a
1   b
2   a
2   c
3   b
3   a
4   c
4   d
4   e

expected output- on 1st only 2 unique user. on 2nd (1&2nd) 3 users. on 3rd (1st to 3rd) 3 users. on 4th (1st to 4th) 5 users. I need this for each month for each date for last 1 year

dt uniquecustcount
1  2
2  3
3  3
4  5

my data is in below format, bu core logic is above one for MTD. Appreciate any help please

CREATE MULTISET TABLE GK_DAILY_USERS ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      EVENT_DATE DATE FORMAT 'yyyy-mm-dd',
      Subs_Id INTEGER,
      PAYMENT_METHOD_CD VARCHAR(8) CHARACTER SET UNICODE NOT CASESPECIFIC,
      Data_User_Flag VARCHAR(1) CHARACTER SET UNICODE NOT CASESPECIFIC,
      BUS_UNIT VARCHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC,
      Event_Count INTEGER)
PRIMARY INDEX ( EVENT_DATE ,Subs_Id );

Insert into GK_DAILY_USERS values (?,?,?,?,?,?);

I am trying below sql. I know this sql doesnt make sense and hence looking for solution.

SELECT EVENT_DATE, MAX(R2) FROM (
SELECT EVENT_DATE 
,COUNT(A.SUBS_ID)OVER(ORDER BY ROWNUM,A.SUBS_ID) AS R2
FROM (
SELECT A.EVENT_DATE,A.SUBS_ID
,Row_Number() Over (Order by EVENT_DATE, SUBS_ID) rownum
,COUNT(A.SUBS_ID)OVER(PARTITION BY A.SUBS_ID ORDER BY A.EVENT_DATE) AS RUN_TOTAL3
FROM  GK_DAILY_USERS A 
WHERE Data_User_Flag='Y' 
AND subs_id in (566875703,289813839, 151153086,279262050,18378517)
ORDER BY EVENT_DATE, SUBS_ID
) A 
WHERE RUN_TOTAL3 = 1
) A GROUP BY EVENT_DATE ORDER BY 1
;
2

There are 2 best solutions below

0
On
SELECT EVENT_DATE 
   -- new customers per date
  ,Sum(SUBS_ID) AS daily_new_cust
  -- running total of new customers per date
  ,Sum(Sum(SUBS_ID))
   Over(ORDER BY EVENT_DATE
        ROWS Unbounded Preceding) AS unique_cust
FROM
 (
   SELECT *
   FROM  GK_DAILY_USERS
   WHERE Data_User_Flag='Y' 
   AND subs_id IN (566875703,289813839, 151153086,279262050,18378517)
   QUALIFY
      -- return a single row with the earliest date for each subs_id
      Row_Number()
      Over (PARTITION BY A.SUBS_ID
            ORDER BY EVENT_DATE) = 1
 ) AS dt
GROUP BY EVENT_DATE 
ORDER BY 1
;
0
On

You can use it on any to-date metrics. YTD, QTD, MTD, etc. I developed a Period-to-Date (PDT) dashboard on Tableau with the table codes I developed on TSQL. One can select any period on the “Period” filter (Year, quarter, month, etc.). Also, it helps me to calculate Churn and Retention for several period comparisons (I want to tell you that the query is really fast on big data). To see it on my GitHub profile: https://github.com/SafaKacar with the name "EXTENTED_MTD-...-3YTD-Retention-with-CUBE-CTE-Window-Functions-LOOP-and-manipulation-techniques"

    SELECT 1 dt, 'a' customername into #ganeshproblem
    INSERT INTO #ganeshproblem VALUES(1,'b')
    INSERT INTO #ganeshproblem VALUES(2,'a')
    INSERT INTO #ganeshproblem VALUES(2,'c')
    INSERT INTO #ganeshproblem VALUES(3,'b')
    INSERT INTO #ganeshproblem VALUES(3,'a')
    INSERT INTO #ganeshproblem VALUES(4,'c')
    INSERT INTO #ganeshproblem VALUES(4,'d')
    INSERT INTO #ganeshproblem VALUES(4,'e')
    
    SELECT DISTINCT a.dt , m.uniquecustcount
    INTO #ToDateTrainerForGanesh
    FROM #ganeshproblem a
    LEFT JOIN 
          (
            SELECT
                 MinDt dt
                ,MAX([Rank]) uniquecustcount
            FROM
                (
                SELECT
                     MinDt
                    ,customername
                    ,ROW_NUMBER() OVER (ORDER BY MinDt) [Rank]
                FROM
                    (
                     SELECT
                            MIN(dt) MinDt
                           ,customername
                     FROM #ganeshproblem
                     GROUP BY customername
                    ) K
                ) L
            GROUP BY MinDt
            ) m  ON m.dt = a.dt
    WHILE (SELECT COUNT(dt) FROM #ToDateTrainerForGanesh WHERE uniquecustcount IS NULL) !=0
    BEGIN
    UPDATE A1
    SET A1.uniquecustcount = A2.uniquecustcount
    FROM #ToDateTrainerForGanesh A1
    JOIN #ToDateTrainerForGanesh A2 ON A1.dt = A2.dt+1
    WHERE a1.uniquecustcount is null and a2.uniquecustcount is not null
    END
select*from #ToDateTrainerForGanesh

I have a stored procedure query on my GitHub https://github.com/SafaKacar with the name "EXTENTED_MTD-...-3YTD-Retention-with-CUBE-CTE-Window-Functions-LOOP-and-manipulation-techniques"

It's a very basic and smart logic. First, you will get the minimum date. Then, You will give a number for each customer with their minimum date. After all, you will take the maximum, grouped by dates. It will provide you with any To-Date metrics. Also, this query is fast. However, It won't end with this flow. You will also need CROSS JOIN or similar (to make a dummy date table). Another need to solve this issue is to use update (to catch the NULL values if the first day is null). Finally, you should use WHILE LOOP to carry the previous log to the next.

To sum up, you will need to use ROW_NUMBER (Or Rank with two ORDER BY elements), CROSS JOIN, UPDATE & WHILE Loop. You can solve this problem by looking around at my GitHub share.

I hope it will be helpful.