I have two tables:
- trips: id_trip, id_object, trip_date, delta (8980026 rows)
- ranges: id_range, id_object, date_since, date_until (18490 rows)
I need to optimize the following select statement
select r.id_range, sum(t.delta) sum_deltas
from trips t,
ranges r
where t.id_object = r.id_object
and t.trip_date between r.date_since and r.date_until
group by r.id_range
according to the condition there is always exactly one matching row for trip in 'ranges'
- the trips table is constantly growing, but there are no updates or deletions
- table ranges may change from time to time in any way (deletions, updates, inserts), so function based index is not the way :(
- there are indexes on id_object (in both tables) and date_since (in trips)
Does anyone have an idea how to speed things up, is it even possible?
It's always possible to speed things up; it just may not be worth the time / effort / money / disk-space / additional overheads etc.
Firstly please use the explicit join syntax. It's been the SQL standard for a few decades now and it helps avoid a lot of potential errors. Your query would become:
This query would imply that you need two indexes - unique if possible. On
rangesyou should have an index onid_object, date_since, date_until. The index ontripswould beid_object, trip_date. Iftripswere smaller I might consider addingdeltaon to the end of that index so you never enter the table at all but only do a index scan. As it stands you're going to have to do a table access by index rowid.Having written all that your problem may be slightly different. You're going to be full-scanning both tables with this query. Your problem might be the indexes. If the optimizer is using the indexes then it's possible you're doing an index unique/range scan for each
id_objectintripsorrangesand then, because of the use of columns not in the indexes you will be doing an table access by index rowid. This can be massively expensive.Try adding a hint to force a full-scan of both tables: