Performance issue with a really big IN (ids) statement


I have this statement:

select qulified_name
from table
inner join references_table on references_table.id = table.ref_id
where references_table.type = 'x' and table.value in (... +110 000 ids)

which is extremely slow (the web app crashes and never gets a result). I build the statement with rom-rb, which uses Sequel under the hood, but this is the SQL I get when I inspect the generated statement.

If I rewrite the statement like this, the performance is much better than the first version:

select qulified_name
from table
inner join (select unnest(array['#{references_id.join("','")}']) id) as tmp on tmp.id = table.value
inner join references_table on references_table.id = table.ref_id
where references_table.type = 'x'

This way I get the result in ~3 sec.
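
Side note (not part of the original post): the same derived-table join can be written without interpolating all 110 000 ids into the SQL string, by passing them as a single array bind parameter. A minimal sketch, assuming PostgreSQL, a driver that supports array parameters ($1 below), and ids that are text:

-- $1 is one array bind parameter supplied by the driver
-- (column and table names taken from the question)
select qulified_name
from table
inner join unnest($1::text[]) as tmp(id) on tmp.id = table.value
inner join references_table on references_table.id = table.ref_id
where references_table.type = 'x';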

Can someone explain why the IN version is so much slower? I don't understand it.


1 answer

Accepted answer

When you use an IN clause with a long literal list of values, the database has little choice but to compare each row's value against the entries in that list one by one, which gets inefficient as the list grows.
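
You can see this in the plan (a sketch reusing the names from the question, with hypothetical ids; the exact output depends on your data and PostgreSQL version):

explain analyze
select qulified_name
from table
inner join references_table on references_table.id = table.ref_id
where references_table.type = 'x'
  and table.value in ('id1', 'id2' /* ... the full list of 110 000 ids */);

With a literal list this size, PostgreSQL typically shows the condition as a Filter of the form value = ANY ('{...}') that is re-evaluated for every candidate row.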

When you turn the list into a derived table with your unnest subquery, the filter instead becomes a join, which is a set-oriented operation.

Databases are very good at evaluating set-oriented operations, and can find the best join algorithm for your data.
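
The unnest trick is one way to get there; any construct that produces a relation works. A hedged sketch of the same join written against a VALUES list (hypothetical ids, same names as the question):

select qulified_name
from table
inner join (values ('id1'), ('id2') /* ... one row per id */) as tmp(id)
    on tmp.id = table.value
inner join references_table on references_table.id = table.ref_id
where references_table.type = 'x';

In both the unnest and the VALUES form the planner sees an ordinary relation and can choose, for example, a hash join against table.value instead of re-checking a 110 000-element list for every row.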