I am new to Calcite. What I want to do is provide a simple tool to analyse data from different data sources, such as MySQL, REST services, MongoDB, and so on.
I think Calcite is just what I need.
But after some experimenting, I ran into some problems using Calcite.
(1) Full table scan in join queries. For example, I tried the SQL query 'select a.id from csv.tableA a left join mysql.tableB b on a.id = b.id', that is, tableA from a CSV file is joined with tableB from MySQL, which has millions of records. As you can see, the query causes a full table scan on tableB. How can I optimize this? Is it possible to query tableB filtered by a.id?
Subqueries may have the same problem. For example, the SQL query "select a.id from csv.tableA a where a.id<10 and exists (select 1 from mysql.tableB b where b.id=a.id)" also causes a full table scan on tableB.
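To make the question concrete, here is a minimal sketch in plain Java of the execution I am hoping for. It does not use any Calcite API; the class KeyPushdownSketch and its methods are hypothetical names made up for illustration. The idea is to read the ids from the small CSV side first, then ask MySQL only for the rows of tableB with those ids, in bounded batches, instead of scanning the whole table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from Calcite.
class KeyPushdownSketch {

    // Builds the query I would like to be sent to MySQL: only rows whose
    // key column matches one of the keys already read from the CSV side.
    // Keys are bound as JDBC-style '?' parameters, one per key.
    static String buildFilteredQuery(String table, String keyColumn, int keyCount) {
        if (keyCount <= 0) {
            // No keys on the small side, so no row can match.
            return "SELECT " + keyColumn + " FROM " + table + " WHERE 1 = 0";
        }
        String placeholders = String.join(", ", Collections.nCopies(keyCount, "?"));
        return "SELECT " + keyColumn + " FROM " + table
            + " WHERE " + keyColumn + " IN (" + placeholders + ")";
    }

    // Splits the keys into batches so that each IN list stays bounded.
    static <T> List<List<T>> batches(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            int end = Math.min(i + batchSize, keys.size());
            result.add(new ArrayList<>(keys.subList(i, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed: these ids were already read from csv.tableA.
        List<Integer> idsFromCsv = Arrays.asList(1, 2, 3, 4, 5);
        for (List<Integer> batch : batches(idsFromCsv, 2)) {
            String sql = buildFilteredQuery("tableB", "id", batch.size());
            System.out.println(sql + "  -- bound keys: " + batch);
        }
    }
}
```

With this kind of plan, MySQL could answer each batch with an index lookup on tableB.id rather than a full scan. Is there a way to get Calcite to produce something equivalent?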
(2) Multiple I/O requests to the same data source. For example, the SQL query "select a.id from mysql.tableA a where a.id<10 and exists (select 1 from mysql.tableB b where b.id=a.id)" leads to two I/O requests to the database, and the second request causes a full table scan on tableB. How can I change the execution?
It would be great if you have any ideas!
Thanks a lot!