What is the proper way to implement JOIN rewrite to allow query to be fed with results of subquery? For example:
SELECT state FROM zips_view WHERE j IN (select j from people_view)
This query gets rewritten to SemiJoin which executes table scan inner query (select agains _people_view_ as expected) then again table scan for outer query (select against _zips_view_). The second scan can be replaced with filtered query e.g.
SELECT state FROM zips_iew WHERE j IN (1,2,3,4)
What's the proper way to implement a "two phase" JOIN which takes results of subquery and adds them to outer query as filter/condition?
I'm using JDBC connector here, and for your desired query:
it generates this relational algebra:
You should start by writing rules that transform your original relational algebra (with SemiJoin) and work your way down to get the relational algebra that looks like above.