I have a table containing three columns: city ID, language code and city:
city_id | language_code | city
----------------------------------
1 | en | London
1 | es | Londres
1 | pt | Londres
2 | de | Köln
2 | en | Cologne
3 | it | Cologne
In some foreign languages cities can be spelled the same way, e.g. the Spanish and Portuguese name for London is Londres.
But there are cases where the same name can refer to completely different locations, e.g. Cologne is an English name for the German city of Köln but there's also a town of the same name in Italy.
I would like to be able to detect cities that have more than one entry in the table but only those that are linked to different city_id values. In my case this would be Cologne but not Londres as both es and pt language versions point to the same city_id.
I thought this would be a fairly easy thing to do but I haven't been able to get the results in a single query. Instead, I am deduping the results first and then aggregating them:
WITH deduped_cities AS (
SELECT DISTINCT city, city_id
FROM cities
ORDER BY city
)
SELECT city, COUNT(city_id) AS total
FROM deduped_cities
GROUP BY city
HAVING COUNT(city_id) > 1;
This gives me the expected result:
city | total
----------------
Cologne | 2
I was just wondering if it is possible to achieve the same effect with a single SELECT statement.
you have basically just one
SELECT, but you can useDISTINCT' with theCOUNTSee fiddle