How to query the longest streak of dates from today to the past on all tables in a db

168 Views Asked by At

The problem:

I'm trying to create a viz on redash using the summary of historical data I have on a mysql db. I found it heavily dependent on SQL and I found the python support not so easy to manage.

It happens that I have a few tables in a db that report some statistical data (all float numbers) every day.

I want to query on each table the longest streak of historical data always counting from today, it should demand only raw (my)SQL queries, no procedures =/

Consider the following sample data set...

today is 2020-10-21

DROP TABLE IF EXISTS table1;

CREATE TABLE table1
(date DATE NOT NULL PRIMARY KEY);

INSERT INTO table1 VALUES
('2020-10-17'),
('2020-10-19'),
('2020-10-20'),
('2020-10-21');


DROP TABLE IF EXISTS table2;

CREATE TABLE table2
(date DATE NOT NULL PRIMARY KEY);

INSERT INTO table2 VALUES
('2020-10-18'),
('2020-10-19'),
('2020-10-20'),
('2020-10-21');

I want to accomplish this:

mydata.tables longest_streak
table1        3
table2        4

EDIT: I should add that I was able to extract the tables list from the information schema where I should run this with probably a "UNION" command, but it's still hard for me to use this using only a raw query with probably a few subselects.

EDIT2: I was able to query like that

    SELECT 
        CONCAT('SELECT DISTINCT ''', table_name, ''', dateday  FROM ', 'mydata.', table_name) AS query_string
    FROM
        information_schema.tables
    WHERE
        TABLE_SCHEMA like 'mydata' and TABLE_NAME like 'table%'

which would give the general query as a string to what I roughly need to do, the problem is that Redash relies heavily on a raw SQL query. Maybe I could add a procedure to the db where Redash would call it, but I think for this db it's a no-go (due to policies)

1

There are 1 best solutions below

0
On

Here's a method for versions of MySQL prior to 8.0...

SELECT tab
     , MAX(i) i 
  FROM 
     ( SELECT tab
            , date
            , CASE WHEN @prev_tab = tab THEN CASE WHEN @prev_date >= date - INTERVAL 1 DAY THEN @i:=@i+1 ELSE @i:=1 END ELSE @i:=1 END i
            , @prev_tab := tab
            , @prev_date := date 
         FROM 
            ( SELECT 1 tab, date FROM table1
               UNION
              SELECT 2    , date FROM table2
             -- repeat as necessary or, better, fix your schema
               ORDER 
                  BY tab
                   , date
            ) x
         JOIN
            ( SELECT @prev_tab:=null, @prev_date:=null,@i:=0) vars
     ) a
 GROUP
    BY tab;

+-----+------+
| tab | i    |
+-----+------+
|   1 |    3 |
|   2 |    4 |
+-----+------+