MySQL query to find complete intersections

65 Views Asked by At

Imagine we have two tables:

  • clientOffices that have a list of offices that each client has.
  • clusters are groups of cities
clientOffices
-----------------------------
clientId    |   office
-----------------------------
1               London
1               Manchester
1               Edinburgh
1               Bonn
2               London
2               Frankfurt
2               Bonn
3               Manchester
3               Frankfurt
-----------------------------       



clusters
-----------------------------
clusterName |   city
-----------------------------
X               London      
X               Manchester
Y               Manchester
Y               Frankfurt
Y               Bonn
Z               London
Z               Bonn
-----------------------------

Now we want a query that tells us this: For each client, which what clusters are they FULLY present in? i.e. they have an office in EVERY city in the cluster?

Desired result:
-------------------------------------------
clientOffices.clientId  |   clusters.clusterName
-------------------------------------------
1                               X
1                               Z
2                               Z
3                               Null
--------------------------------------------

I'm a MySQL Noob - tried different types of joins but couldn't get this result. It'd be great if you could help.

fiddle: https://www.db-fiddle.com/f/2GTBVXm9StNHcyuSqSZikW/0

2

There are 2 best solutions below

2
On BEST ANSWER

This will show all the client names that have complete cluster

SELECT DISTINCT a.clientName, a.clusterName
FROM
(
    SELECT clientOffices.clientName, clusters.clusterName, COUNT(*) AS counts
    FROM clientOffices
    LEFT JOIN clusters ON clusters.city = clientOffices.office
    GROUP BY clientOffices.clientName, clusters.clusterName
) AS a
JOIN
(
    SELECT clusterName, COUNT(*) AS counts
    FROM clusters
    GROUP BY clusterName
) AS b
ON a.clusterName = b.clusterName AND a.counts = b.counts

UNION

SELECT c.clientName, null AS clusterName FROM
(
  SELECT  a.clientName, COUNT(*) as invalid_office
FROM
(
    SELECT clientOffices.clientName, clusters.clusterName, COUNT(*) AS counts
    FROM clientOffices
    LEFT JOIN clusters ON clusters.city = clientOffices.office
    GROUP BY clientOffices.clientName, clusters.clusterName
) AS a
JOIN
(
    SELECT clusterName, COUNT(*) AS counts
    FROM clusters
    GROUP BY clusterName
) AS b
ON a.clusterName = b.clusterName AND a.counts != b.counts
  GROUP BY a.clientName
) AS c
JOIN
(
    SELECT clientName, COUNT(*) AS office_count
    FROM clientOffices
    GROUP BY clientName
) AS d
ON c.clientName = d.clientName AND c.invalid_office = d.office_count;

https://www.db-fiddle.com/f/6jKvKXPYvsLeXgm3Qv1nHu/10

2
On

This reads like a relational division problem. I would write this as an aggregation query, with filtering in the having clause:

select co.clientid, cl.clusterName
from clientoffices co
inner join clusters cl on cl.city = co.office
group by co.clientid, cl.clusterName
having count(*) = (select count(*) from clusters cl1 where cl1.clusterName = cl.clusterName)

For your sample data this produces:

| clientid | clusterName |
| -------- | ----------- |
| 1        | X           |
| 1        | Z           |
| 2        | Z           |