Oracle Version: 11.1.0.7.0
We are having higher IO Wait in one of our Oracle RAC instance
One SQL is having high elapsedtime by execution - 1452.57s per execution. This started happening suddenly one day. Previously, it was taking max 3-4 min to query 20k(:v4 parameter) records
subscribeinfo records: 59 million (non - parallel)
chargerate records : 2k - 3k
SQL
select o.msisdn, o.spid, o.serviceid , o.ChargeReferenceID,
o.channelID, o.nextchargetime , o.failtimestamp, o.lastmonfeeday,
o.networkId, o.retryEndDateTime, o.trialType, o.subFlag, o.faultCode
from subscribeinfo o, chargerate r where (o.monthbillid = :v1) and
(((o.state = :"SYS_B_00") and (o.nextchargetime < :v2) and
((o.IsAutoExtend <> :"SYS_B_01") or ((o.IsAutoExtend = :"SYS_B_02")
and (o.extendflag <> :"SYS_B_03")))) or (o.subFlag = :"SYS_B_04" and
o.state = :"SYS_B_05" and o.retryenddatetime > :v2)) and
(o.ChargeClassForSub = r.chargeclassidx) and ((r.chargemode =
:"SYS_B_06" and r.activetype = :"SYS_B_07" and o.nextchargetime !=
:"SYS_B_08" ) or ( r.chargemode = :"SYS_B_09" and r.activetype <>
:"SYS_B_10") or (r.chargemode >= :"SYS_B_11" and r.chargemode <=
:"SYS_B_12" and r.basecharge >= :"SYS_B_13") or (r.chargemode =
:"SYS_B_14") or (r.chargemode = :"SYS_B_15") or (r.chargemode =
:"SYS_B_16") ) and (o.failtimestamp <= :v3) and (rownum <= :v4)
According to AWR report Top 5 Timed Foreground Events
Direct path read [ Avg Wait Time: 22 s, %DB Time: 50.75% ]
DB file sequential read [ Avg Wait Time: 15 s, %DB Time: 38.00 ]
I will not be able to post full AWR report, because it is restricted. So please ask details I'll post
Explain plan
ID Exec Ord Operation Go To More Peek Bind Capt Bind Cost2 Estim
Card LAST Starts LAST Output Rows LAST Over/Under
Estimate1 PStart PStop Work Area 0 7 SELECT STATEMENT
23335 1 2577 1 6 COUNT STOPKEY [+] [+]
[+] 23335 1 2577 2 5 . HASH JOIN [+] [+]
[+] 23335 20001 1 2577 8x over [+] 3 1 .. TABLE ACCESS FULL
CHARGERATE [+] [+] 68 3035 1 3036 1x 4 4 .. PARTITION LIST
SINGLE [+] 23266 25223 1 2577 10x over KEY KEY 5 3 ... TABLE
ACCESS BY LOCAL INDEX ROWID SUBSCRIBEINFO [+] [+] [+]
[+] 23266 25223 1 2577 10x over KEY KEY 6 2 .... INDEX RANGE SCAN
IDX_FAILTIMESTAMP_NEW [+] [+] [+] [+] 2435 1 2100765 KEY KEY
IOSTAT
Linux 2.6.16.46-0.12-smp (mdspdb01) 11/16/12
avg-cpu: %user %nice %system %iowait %steal %idle
8.41 0.00 9.38 13.25 0.00 67.67
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 5.71 39.53 121.79 665679995 2051190222
sdb 85.75 178.15 171.12 3000316741 2881953582
sdc 111.05 161.69 43.96 2723201251 740429949
We created an index for the fields monthbillid, nextchargetime and failtimestamp... Eventhough it improved a lot in cardinality by 1/6th, it increased the cost by 4-5 times. But oracle takes the new index by default
create index IDX_MONTHBILLQUERY on subscribeinfo(monthbillid,
nextchargetime, failtimestamp) local tablespace IMUSE_INDEX;
dbms_stats.gather_index_stats('IMUSE01', 'IDX_MONTHBILLQUERY');
We have hard parses = 0 in AWR reports. And also we changed the cursor_sharing = FORCE
Now IO is under control. Still feels, this not the root cause. And also, we made the instance dedicate for this query which happens more than 10 times an hour also, it takes approximately 100 seconds to retrieve 20k records.
Can anyone suggest whether it is a good decision if I go for optimizer mode as first_rows or use an hint first_rows(20000)?
As of now, we have disabled stats job; can we enable the same thing only for some tables or some indexes. Is this possible?
The problem solved..... The cursor_sharing is made to force... This reduced the IO very much. Now IO is normal in all instance. Then we created two indexes for the same query which recommended by sqltuning advisor and accepted the profile