Aggregating join results that I also use in the where clause with MySQL

96 Views Asked by At

I'm referencing two tables in the simple machines forum web forum software: smf_members, and smf_members. each row in smf_members has a field: id_group, of which I am interested in values: 1,9,20,26,23,27 (using the IN() clause). The general goal is to determine which rows in the smf_members table that are in the above group id's haven't had a row entered in smf_messages in 90 days. poster_time in smf_messages is a unix timestamp. What I have so far is:

SELECT
  m.id_member,
  m.id_group,
  from_unixtime(max(ms.poster_time))
FROM
  smf_members m
  LEFT JOIN smf_messages ms
    USING(id_member)
WHERE
  max(ms.poster_time) < (NOW() < (86400 * 90)
GROUP BY
  m.id_member

It fails with and ERROR 1111: Invalid use of the group function, since I am using max() in the where clause. How can I aggregate my join results to only reference the latest entry based off the poster_time field?

2

There are 2 best solutions below

0
On BEST ANSWER

There are several ways to this.

  1. SELECT * FROM members WHERE id_member IN ( ... ) AND NOT EXISTS ( SELECT 1 FROM messages WHERE is_member = members.id_member AND poster_time > (UNIX_TIMESTAMP*() - 90 * 24 * 60 * 60)

  2. look for the cases where an outer join finds nothing SELECT DISTINCT id_member FROM MEMBERS AS m ... LEFT JOIN messages AS msg USING .... WHERE msg.id IS NULL

  3. SELECT * FROM members WHERE id_member IN ( .... ) AND id_member NOT IN (SELECT id_member FROM messages WHERE poster_time > (UNIX_TIMESTAMP*() - 90 * 24 * 60 * 60) )

0
On

You may want to try using a temporary table to figure out the max(ms.poster_time) before attempting the GROUP BY. Something like this ought to work.

SELECT * FROM 
(SELECT
  m.id_member,
  m.id_group,
  from_unixtime(max(ms.poster_time))
FROM
  smf_members m
  LEFT JOIN smf_messages ms
    USING(id_member)
WHERE
  max(ms.poster_time) < (NOW() < (86400 * 90) ) AS tmp

GROUP BY
  m.id_member

Doing it this way allows you to use the aggregate function with the GROUP BY. It won't perform very well if you have large datasets. If that's the case, you may want to consider regularly summarizing the data generated by the inner table, and then querying on that.