I have two tables:
- trips: id_trip, id_object, trip_date, delta (8980026 rows)
- ranges: id_range, id_object, date_since, date_until (18490 rows)
I need to optimize the following select statement
select r.id_range, sum(t.delta) sum_deltas
from trips t,
ranges r
where t.id_object = r.id_object
and t.trip_date between r.date_since and r.date_until
group by r.id_range
according to the condition there is always exactly one matching row for trip in 'ranges'
- the trips table is constantly growing, but there are no updates or deletions
- table ranges may change from time to time in any way (deletions, updates, inserts), so function based index is not the way :(
- there are indexes on id_object (in both tables) and date_since (in trips)
Does anyone have an idea how to speed things up, is it even possible?
It's always possible to speed things up; it just may not be worth the time / effort / money / disk-space / additional overheads etc.
Firstly please use the explicit join syntax. It's been the SQL standard for a few decades now and it helps avoid a lot of potential errors. Your query would become:
This query would imply that you need two indexes - unique if possible. On
ranges
you should have an index onid_object, date_since, date_until
. The index ontrips
would beid_object, trip_date
. Iftrips
were smaller I might consider addingdelta
on to the end of that index so you never enter the table at all but only do a index scan. As it stands you're going to have to do a table access by index rowid.Having written all that your problem may be slightly different. You're going to be full-scanning both tables with this query. Your problem might be the indexes. If the optimizer is using the indexes then it's possible you're doing an index unique/range scan for each
id_object
intrips
orranges
and then, because of the use of columns not in the indexes you will be doing an table access by index rowid. This can be massively expensive.Try adding a hint to force a full-scan of both tables: