Oracle Avoid Wasteful Join Back?

179 Views Asked by At

Suppose we have three tables (A, B, and C) as defined in the contrived example below where A and B are related to (have foreign keys in) C. Suppose, we want values from all three tables and predicate on A and B. Oracle can only join two rowsets together at any one time. We get a join order something like ((A -> C) -> B). This means we spend I/O getting rows from C, that we end up just discarded when we join back to B (and B's predicate).

How can we avoid this "wasteful" I/O on table C?

Star Transformation is great, but only kicks in if the optimizer determines the cost justifies the star transformation. That is, we are not guaranteed to get a star transformation. This might seem like what one wants, but the optimizer is obtaining poor estimated rows (see example below - off by a factor of 10). Thus, the optimize is electing to not use star transformation when it otherwise would prove beneficial.

We cannot manually author the query in star transformation like from as the SQL is generated by a BI reporting tool.

Perhaps my question is how to "force" the optimizer to use star transformation without manually writing the query in that form? Or, perhaps, my question is how to get the estimated rows to be better, so we can be more assured that the optimizer will invoke star transformation? Or, perhaps (quite likely), there is some other cool Oracle feature of which I am not yet aware that might provide a solution.

Oracle 12.1 Enterprise Edition (but moving to 19.1 in a few months) Thanks in advance.

drop table join_back_c;
drop table join_back_a;
drop table join_back_b;

create table join_back_a
  as
  with "D" as (select 1 from dual connect by level <= 1000)
    select rownum                  a_code
           , rpad('x',100)         a_name
      from "D"
;

create unique index IX_join_back_a_code on join_back_a(a_code); 
alter table join_back_a add constraint PK_dan_join_back_a primary key (a_code);

create table join_back_b
  as
  with "D" as (select /*+ materialize */ 1 from dual connect by level <= 320)
    select  rownum                b_id
          , mod(rownum, 10)       b_group
    from "D", "D"
   where rownum <= 100000 --100k
;

create unique index IX_join_back_b_id on join_back_b(b_id);   
create index IX_join_back_b_group on join_back_b(b_group); 
alter table join_back_b add constraint PK_dan_join_back_b primary key (b_id);

create table join_back_c
  as
  with "D" as (select /*+ materialize */ level from dual connect by level <= 3200)
    select  rownum                              c_id
          , trunc(dbms_random.value(1, 1000))   a_code     --table a FK
          , trunc(dbms_random.value(1, 100000)) b_id       --table b FK
    from "D", "D"
   where rownum <= 1000000 -- 1M
;

create index IR_join_back_c_a_code on join_back_c(a_code);
create index IR_join_back_c_b_id on join_back_c(b_id);  

exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_C');
exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_A');
exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_B');
select *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;
--------------------------------------------------------------------------------------------------------
| id  | Operation                      | name                  | rows  | Bytes | cost (%CPU)| time     |
--------------------------------------------------------------------------------------------------------
|   0 | select statement               |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|*  1 |  hash join                     |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|   2 |   nested LOOPS                 |                       |       |       |            |          |
|   3 |    nested LOOPS                |                       |  1001 |   116K|   839   (1)| 00:00:01 |
|   4 |     table access by index ROWID| JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|*  5 |      index range scan          | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     index range scan           | IR_JOIN_BACK_C_A_CODE |  1001 |       |     4   (0)| 00:00:01 |
|   7 |    table access by index ROWID | JOIN_BACK_C           |  1001 | 14014 |   837   (1)| 00:00:01 |
|*  8 |   table access full            | JOIN_BACK_B           | 10000 | 80000 |   143   (5)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."B_ID"="C"."B_ID")
   5 - access("A"."A_CODE"=1)
   6 - access("C"."A_CODE"=1)
   8 - filter("B"."B_GROUP"=1)
select count(*) 
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;  -- about 100 rows

Join Order: ((A -> C) -> B)

A -> C (step 3) has an accurate estimated rows of about 1k.

step 8's estimate is accurate too.

However, this join back to B (step 1) will only serve to further reduce the 1k rowset from step 3. In this case B's predicate reduces the (A -> C) rowset by a factor of 1/10.
This means we accessed 1000 rows from C, just to throw away 900 of those rows.

select /*+ star_transformation */
       *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;
--------------------------------------------------------------------------------------------------------
| id  | Operation                      | name                  | rows  | Bytes | cost (%CPU)| time     |
--------------------------------------------------------------------------------------------------------
|   0 | select statement               |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|*  1 |  hash join                     |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|   2 |   nested LOOPS                 |                       |       |       |            |          |
|   3 |    nested LOOPS                |                       |  1001 |   116K|   839   (1)| 00:00:01 |
|   4 |     table access by index ROWID| JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|*  5 |      index range scan          | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     index range scan           | IR_JOIN_BACK_C_A_CODE |  1001 |       |     4   (0)| 00:00:01 |
|   7 |    table access by index ROWID | JOIN_BACK_C           |  1001 | 14014 |   837   (1)| 00:00:01 |
|*  8 |   table access full            | JOIN_BACK_B           | 10000 | 80000 |   143   (5)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."B_ID"="C"."B_ID")
   5 - access("A"."A_CODE"=1)
   6 - access("C"."A_CODE"=1)
   8 - filter("B"."B_GROUP"=1)

I'm looking for an execution path that is something more like the following. Despite the 10M estimated rows below, the row count of this query remains around 100. However, we cannot control the SQL generated to this degree. This is what was referred to above as a manually writing the query in star transformation like from.

select *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where C.rowid in ( select C1.rowid 
                      from join_back_C "C1"
                           join join_back_a "A1"
                                on C1.a_code = A1.a_code
                     where A1.a_code = 1
                    intersect
                    select C2.rowid 
                      from join_back_C "C2"
                           join join_back_b "B1"
                                on C2.b_id = B1.b_id
                     where B1.b_group = 1                  
                  )
;
---------------------------------------------------------------------------------------------------------------
| id  | Operation                     | name                  | rows  | Bytes |TempSpc| cost (%CPU)| time     |
---------------------------------------------------------------------------------------------------------------
|   0 | select statement              |                       |  9928K|  1316M|       |  4649  (17)| 00:00:01 |
|*  1 |  hash join                    |                       |  9928K|  1316M|       |  4649  (17)| 00:00:01 |
|   2 |   table access full           | JOIN_BACK_A           |  1000 |   102K|       |    16   (0)| 00:00:01 |
|*  3 |   hash join                   |                       |  9928K|   321M|       |  4320  (11)| 00:00:01 |
|   4 |    table access full          | JOIN_BACK_B           |   100K|   781K|       |   142   (5)| 00:00:01 |
|   5 |    nested LOOPS               |                       |    10M|   248M|       |  3858   (3)| 00:00:01 |
|   6 |     view                      | VW_NSO_1              |  1001 | 12012 |       |  2855   (4)| 00:00:01 |
|   7 |      INTERSECTION             |                       |       |       |       |            |          |
|   8 |       SORT UNIQUE             |                       |  1001 | 18018 |       |            |          |
|   9 |        NESTED LOOPS           |                       |  1001 | 18018 |       |     5   (0)| 00:00:01 |
|* 10 |         INDEX RANGE SCAN      | IX_JOIN_BACK_A_CODE   |     1 |     4 |       |     1   (0)| 00:00:01 |
|* 11 |         INDEX RANGE SCAN      | IR_JOIN_BACK_C_A_CODE |  1001 | 14014 |       |     4   (0)| 00:00:01 |
|  12 |       SORT UNIQUE             |                       | 99191 |  2131K|  3120K|            |          |
|* 13 |        HASH JOIN              |                       | 99191 |  2131K|       |  1789   (5)| 00:00:01 |
|* 14 |         TABLE ACCESS FULL     | JOIN_BACK_B           | 10000 | 80000 |       |   143   (5)| 00:00:01 |
|  15 |         INDEX FAST FULL SCAN  | IR_JOIN_BACK_C_B_ID   |  1000K|    13M|       |  1614   (3)| 00:00:01 |
|  16 |     TABLE ACCESS BY USER ROWID| JOIN_BACK_C           | 10000 |   136K|       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."A_CODE"="C"."A_CODE")
   3 - access("B"."B_ID"="C"."B_ID")
  10 - access("A1"."A_CODE"=1)
  11 - access("C1"."A_CODE"=1)
  13 - access("C2"."B_ID"="B1"."B_ID")
  14 - filter("B1"."B_GROUP"=1)

Tried making the two foreign key indices on table C into bitmap indices - no luck. Also, tried a composite index on table C(a_code, b_id) - again, no luck. Also, the composite index is not preferable as our table C really has many foreign keys (some surrogates and some natural keys).

2

There are 2 best solutions below

0
On BEST ANSWER

In addition to Jon Heller's answer:

On the other hand, the predicate b.b_group = 1 will select 10% of the rows, which is in full table scan territory, and is not an operation you want to run repeatedly in a subquery.

mod(rownum, 10) b_group not only gives 10% selectivity, but also means that in your test case each table block contains a few dozens of such rows:

SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_b;

COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
---------------------------------------------------
                                                177
SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_b where b_group=1;

COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
---------------------------------------------------
                                                177

SQL> select min(cnt), max(cnt), avg(cnt)
  2  from (
  3    select dbms_rowid.rowid_block_number(rowid) block_n, count(*) cnt
  4    from join_back_b
  5    where b_group=1
  6    group by dbms_rowid.rowid_block_number(rowid)
  7  );

  MIN(CNT)   MAX(CNT)   AVG(CNT)
---------- ---------- ----------
        49         62 56.4971751

It gives us 10000 rows from B and b.b_id=c.b_id predicate gives us ~10% selectivity from JOIN_BACK_C and also means that each block of JOIN_BACK_C contains few dozens of needed rows:

SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_c;

COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
---------------------------------------------------
                                               2597

select min(cnt), max(cnt), avg(cnt), count(distinct block_n)
from (
  select dbms_rowid.rowid_block_number(c.rowid) block_n, count(*) cnt 
  from join_back_b b
       join join_back_c c
       on b.b_id=c.b_id
  where b_group=1 
  group by dbms_rowid.rowid_block_number(c.rowid)
);

  MIN(CNT)   MAX(CNT)   AVG(CNT) COUNT(DISTINCTBLOCK_N)
---------- ---------- ---------- ----------------------
         8         57 38.6334232                   2597

Moreover join_back_c.a_code=1 also gives bad selectivity ~ 1/1000 = 1000 rows in random blocks, while this table contains just ~2500 blocks. So you need to scan 1/2.5 =~ 40% of table blocks. Obviously, it's better to do it using multiblock reads.

But if we return to the main problem: yes, I understand your problem - sometimes it's better to split one row source to 2 different access paths and CBO often can't do this. There is a standard approach for such cases - to rewrite the query and duplicate row source twice, for example:

a bit modified test data for better selectivity/reduced IO:

create table join_back_b
  as
  with "D" as (select /*+ materialize */ 1 from dual connect by level <= 320)
    select  rownum                b_id
          , mod(rownum, 1000)     b_group
    from "D", "D"
   where rownum <= 100000 --100k
   order by b_group
;

and +padding (to make rows bigger):

create table join_back_c
  as
  with "D" as (select /*+ materialize */ level from dual connect by level <= 3200)
    select  rownum                              c_id
          , trunc(dbms_random.value(1, 1000))   a_code     --table a FK
          , trunc(dbms_random.value(1, 100000)) b_id       --table b FK
          , rpad('x',100,'x') padding
    from "D", "D"
   where rownum <= 1000000 -- 1M
;

Example:

with
 ac as (
  select c.rowid rid
        ,a.*
  from join_back_a A
       join join_back_c C
         on A.a_code = C.a_code
  where a.a_code = 1
 )
,bc as (
  select c.rowid rid
        ,b.*
  from join_back_b B
       join join_back_c C
         on b.b_id = c.b_id
  where b.b_group = 1
)
select--+ no_adaptive_plan NO_ELIMINATE_JOIN(c) no_merge(ac) no_merge(bc) 
   *
  from ac 
       join bc on ac.rid=bc.rid
       join join_back_c C
         on bc.rid = c.rowid;

Plan:

Plan hash value: 3065703407

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                       |     1 |   230 |   209   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                           |                       |     1 |   230 |   209   (0)| 00:00:01 |
|*  2 |   HASH JOIN                             |                       |     1 |   115 |   208   (0)| 00:00:01 |
|   3 |    VIEW                                 |                       |   992 | 37696 |   202   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                        |                       |   992 | 25792 |   202   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| JOIN_BACK_B           |   100 |   900 |     2   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN                  | IX_JOIN_BACK_B_GROUP  |   100 |       |     1   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                   | IR_JOIN_BACK_C_B_ID   |    10 |   170 |     2   (0)| 00:00:01 |
|   8 |    VIEW                                 |                       |  1001 | 77077 |     6   (0)| 00:00:01 |
|   9 |     NESTED LOOPS                        |                       |  1001 |   118K|     6   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID        | JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|* 11 |       INDEX UNIQUE SCAN                 | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|* 12 |      INDEX RANGE SCAN                   | IR_JOIN_BACK_C_A_CODE |  1001 | 16016 |     4   (0)| 00:00:01 |
|  13 |   TABLE ACCESS BY USER ROWID            | JOIN_BACK_C           |     1 |   115 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("AC"."RID"="BC"."RID")
   6 - access("B"."B_GROUP"=1)
   7 - access("B"."B_ID"="C"."B_ID")
  11 - access("A"."A_CODE"=1)
  12 - access("C"."A_CODE"=1)
0
On

Star transformations seem to have a Goldilocks zone for predicate selectivity, and your predicates are either too selective or not selective enough.

Per the Data Warehousing Guide, section 4.5.2.5 How Oracle Chooses to Use Star Transformation:

If the query requires accessing a large percentage of the rows in the fact table, it might be better to use a full table scan and not use the transformations. However, if the constraining predicates on the dimension tables are sufficiently selective that only a small portion of the fact table must be retrieved, the plan based on the transformation will probably be superior.

The predicate a.a_code = 1 is an equality condition on a primary key. Reading a unique index is almost always as fast as an operation can be, and Oracle will always choose that path if possible. On the other hand, the predicate b.b_group = 1 will select 10% of the rows, which is in full table scan territory, and is not an operation you want to run repeatedly in a subquery.

In your example, when I comment out the unique index and primary key:

--create unique index IX_join_back_a_code on join_back_a(a_code); 
--alter table join_back_a add constraint PK_dan_join_back_a primary key (a_code);

And change the 10% selectivity:

      , mod(rownum, 10)       b_group

To a 0.1% selectivity:

      , mod(rownum, 1000)       b_group

I can get a star transformation on my 19c database:

alter session set star_transformation_enabled=true;

explain plan for
select /*+ star_transformation */ *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1;

select * from table(dbms_xplan.display);
Plan hash value: 3923125903

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                            |     1 |   153 |   126   (1)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                            |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D66A0_377EE48 |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED   | JOIN_BACK_B                |   100 |   900 |   101   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                     | IX_JOIN_BACK_B_GROUP       |   100 |       |     1   (0)| 00:00:01 |
|*  5 |   HASH JOIN                              |                            |     1 |   153 |    25   (4)| 00:00:01 |
|*  6 |    VIEW                                  | VW_ST_D5F377AC             |     1 |    39 |    16   (7)| 00:00:01 |
|   7 |     NESTED LOOPS                         |                            |     1 |    28 |    14   (8)| 00:00:01 |
|   8 |      BITMAP CONVERSION TO ROWIDS         |                            |       |    13 |    14   (8)| 00:00:01 |
|   9 |       BITMAP AND                         |                            |       |       |            |          |
|  10 |        BITMAP CONVERSION FROM ROWIDS     |                            |       |       |            |          |
|* 11 |         INDEX RANGE SCAN                 | IR_JOIN_BACK_C_A_CODE      |       |       |     5   (0)| 00:00:01 |
|  12 |        BITMAP MERGE                      |                            |       |       |            |          |
|  13 |         BITMAP KEY ITERATION             |                            |       |       |            |          |
|  14 |          TABLE ACCESS FULL               | SYS_TEMP_0FD9D66A0_377EE48 |   100 |   500 |     2   (0)| 00:00:01 |
|  15 |          BITMAP CONVERSION FROM ROWIDS   |                            |       |       |            |          |
|* 16 |           INDEX RANGE SCAN               | IR_JOIN_BACK_C_B_ID        |       |       |     3   (0)| 00:00:01 |
|  17 |      TABLE ACCESS BY USER ROWID          | JOIN_BACK_C                |     1 |    14 |     2   (0)| 00:00:01 |
|  18 |    MERGE JOIN CARTESIAN                  |                            |   100 | 11400 |     9   (0)| 00:00:01 |
|* 19 |     TABLE ACCESS FULL                    | JOIN_BACK_A                |     1 |   105 |     7   (0)| 00:00:01 |
|  20 |     BUFFER SORT                          |                            |   100 |   900 |     2   (0)| 00:00:01 |
|  21 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9D66A0_377EE48 |   100 |   900 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("B"."B_GROUP"=1)
   5 - access("C0"="ITEM_2" AND "A"."A_CODE"="ITEM_1")
   6 - filter("ITEM_1"=1)
  11 - access("C"."A_CODE"=1)
  16 - access("C"."B_ID"="C0")
  19 - filter("A"."A_CODE"=1)
 
Note
-----
   - star transformation used for this statement

I'm not sure if this answer will help you improve your query, but hopefully it can at least help explain why your query isn't working the way you want.