I have a table that references a bunch of articles, the table contains tags for those articles. Like this:
tag text
article_id bigint
I want to select all article_ids with a set of tags, say tag1, tag2, tag3 but the article could also have tag4, tag5 as well attached to it.
I know this will work:
SELECT article_id
FROM tag WHERE tag='tag1'
INTERSECT
SELECT article_id
FROM tag
WHERE tag='tag2'
INTERSECT
SELECT article_id
FROM tag
WHERE tag='tag3'
And so will this:
SELECT article_id
FROM tag
WHERE tag IN ('tag1','tag2','tag3')
GROUP BY article_id
HAVING count(*) = 3
But I am not sure that that is the most efficient way to do this. I have also been playing with the below, but cannot get it to work currently.
SELECT array_agg(tag) as arr,
article_id
FROM tag
GROUP BY article_id
HAVING arr = {tag1,tag2,tag3}
This just seemed like a common issue that others would encounter, I was wondering if the INTERSECT is the most efficient query in this case. It is for PostgreSQL.
Personally I like the second option. But you should use PostgreSQL's tools, along with some test queries, to see which is the most efficient.
The reason the third doesn't work as written is because you need specify a sort order for array_agg(): how to make array_agg() work like group_concat() from mySQL