If composite indexing created - indexing is called?

40 Views Asked by At

What is the relationship between "Composite indexing" and "Index Skip Scan"?

If you create a composite indexing on 3 columns eid, ename, esal?

  • If I mention only eid=10 after where clause will the indexing be called ?

    select * from emp where eid=10;
    
  • If I mention only eid=10 and ename='Raj' will the indexing be called ?

    select * from emp where eid=10 and ename='Raj';
    
  • If I mention in different order like esal=1000 and eid=10 will the indexing be called ?

    select * from emp where esal=1000 and eid=10;
    
  • If I mention in reverse order like esal = 1000 and ename = 'Raj' and eid = 10 will the indexing be called ?

    select * from emp where esal=1000 and enam='Raj' and eid=10;
    

Need a solution for this with detail table representation with data how it does?

1

There are 1 best solutions below

2
Paul W On

In all four cases you presented, Oracle would do a normal index seek (binary tree search) on the leading column eid because an equality predicate was provided for this column in each of your cases. But what else it does with the index depends on what other columns you're filtering on:

  1. First column (eid) only: after finding the first eid=10 entry in the leaf blocks using a binary search/seek, Oracle will scan that and any subsequent leaf blocks it needs to, moving through each block (single block reads) using a linked list, until it finds the first eid value that is not 10. As it does so it gathers ROWIDs containing the table segment physical row address which it issues single block reads by ROWID to obtain the rest of the row.

  2. First two columns (eid and ename): because ename is the second column in the index, Oracle will use both eid and ename together at the same time as it performs the binary search (seek) to find the first leaf block entry where eid=10 and ename='raj'. It will then procede as above scanning leaf blocks until it finds the first row for which either of these columns have different values.

  3. First and third column (eid and esal): because esal is the third column and you're skipping the second column, Oracle cannot use a single binary search/seek operation on esal. It has two choices:

    3a. It does a binary search/seek only on the leading column, eid, but once it finds the first eid=10 value in the leaf blocks it will do a normal scan of leaf blocks following that linked list - looking through all eid=10 rows, but grabbing ROWIDs only for those with esal=1000.

    3b. Or, it does a skip scan: for every distinct value of the missing intermediate column(s) (ename) within the scope defined by any leading columns (eid=10 rows), it will do a separate binary search/seek on the combined eid=10 / esal=1000 value. This is a seek not a scan, but it is potentially many seeks. If there are many ename values for eid=10 this results in a lot of unnecessary single block I/O and can perform poorly. But if there are only a few values, it works pretty well.

  4. All columns: Your last example would do a single binary search/seek on all three columns. Nothing special here.

You didn't offer this as an example, but to complete the study:

  1. Third column only: If you queried on esal=1000 only, Oracle could do one of the following:

    5a. Forget the index and scan the table itself (if 1000 is common)

    5b. Do a full scan (scattered read) of 100% of the leaf blocks of the index (if 1000 is uncommon but there are many eid/ename values). This is generally not a great option because in most systems parallel query usign direct path read to read the table itself is much more efficient than a serial scattered read of the index segment.

    5c. Do a skip-scan, which means a binary search/seek for esal=1000 for every single distinct combination of the preceding,unfiltered columns (eid and ename). That would be a lot of seeks, so is rather unlikely the optimizer would choose it unless it believes there aren't very many eid/ename values.

Whenever Oracle has a choice, it all depends on statistics and expected cardinalities from each operation which is largely driven by the min/max and # distinct values known for each column combined with overall table row counts. Of course you can force it with hints to if you think you know better than the statistics, but it is recommended to hold off on hinting until one has a solid grasp of how Oracle queries work internally, as you can easily tell it to do the wrong thing.

In conclusion, index column order matters a great deal. It doesn't have to be perfect, as you don't want dozens of indexes on a table, so you have to compromise a little, but carefully considering column order within composite indexes is an important modeling consideration based on the kinds of queries expected or present.