SQL average complex query

119 Views Asked by At

This is the scenario I have a table with customer detail and the time when they installed my app,they will find the way back to the table even if they re-installed the app. In the next table I have the purchase time for the same customers.

users
uid Version install_time
1   1   2013-06-01 00:00:00
1   2   2014-06-01 00:00:00  
1   3   2014-10-01 00:00:00
2   3   2014-11-11 00:00:00
3   2   2013-11-11 00:00:00
4   4   2015-01-01 00:00:00

trans
uid transaction_time
1   2013-07-01 00:00:00
1   2014-07-01 00:00:00
1   2014-11-01 00:00:00
2   2014-12-11 00:00:00
999 2014-11-04 00:13:49

Q: On average, how many days it takes for a customer to make the 1st purchase?

This is what I have attempted so far :

select avg(`purchase after install`) as average
from 
(
select 
u.uid,
dayofyear(t.transaction_time)-dayofyear(u.install_time) AS `purchase after install`
from users u 
left join trans t -- joining the transaction time to user table 
on u.uid=t.uid
where t.transaction_time >= u.install_time -- because the cartesian product from the join is creating additional rows for uid 1
-- group by 1
) final

I am getting 65 days , but if you notice the table the average should come as 30 days as I have spaced out the purchase exactly by 30 days.

2

There are 2 best solutions below

0
On
 DROP TABLE IF EXISTS users;

 CREATE TABLE users
 (uid INT NOT NULL
 ,Version INT NOT NULL
 ,install_time DATETIME NOT NULL
 ,PRIMARY KEY(uid,Version,install_time)
 );

 INSERT INTO users VALUES
 (1   ,1   ,'2013-06-01 00:00:00'),
 (1   ,2   ,'2014-06-01 00:00:00'),
 (1   ,3   ,'2014-10-01 00:00:00'),
 (2   ,3   ,'2014-11-11 00:00:00'),
 (3   ,2   ,'2013-11-11 00:00:00'),
 (4   ,4   ,'2015-01-01 00:00:00');

 DROP TABLE IF EXISTS trans;

 CREATE TABLE trans
 (uid INT NOT NULL
 ,transaction_time DATETIME NOT NULL
 ,PRIMARY KEY(uid,transaction_time)
 );

 INSERT INTO trans VALUES
 (1   ,'2013-07-01 00:00:00'),
 (1   ,'2014-07-01 00:00:00'),
 (1   ,'2014-11-01 00:00:00'),
 (2   ,'2014-12-11 00:00:00'),
 (999 ,'2014-11-04 00:13:49');

 SELECT u.*
      , MIN(t.transaction_time) min_t
      , DATEDIFF(MIN(t.transaction_time),u.install_time) diff 
   FROM users u 
   JOIN trans t 
     ON t.uid = u.uid 
    AND t.transaction_time >= u.install_time 
  GROUP 
     BY u.uid
      , u.version
      , u.install_time;
 +-----+---------+---------------------+---------------------+------+
 | uid | Version | install_time        | min_t               | diff |
 +-----+---------+---------------------+---------------------+------+
 |   1 |       1 | 2013-06-01 00:00:00 | 2013-07-01 00:00:00 |   30 |
 |   1 |       2 | 2014-06-01 00:00:00 | 2014-07-01 00:00:00 |   30 |
 |   1 |       3 | 2014-10-01 00:00:00 | 2014-11-01 00:00:00 |   31 |
 |   2 |       3 | 2014-11-11 00:00:00 | 2014-12-11 00:00:00 |   30 |
 +-----+---------+---------------------+---------------------+------+

I'll leave the final piece of the puzzle as an exercise for the reader.

1
On

try like

select AVG(datediff(day,a.date1,b.date2)) from  table1 as a inner join table2 as b on a.id=b.id where a.date1>=b.date2