Redshift: Changing JOIN and WHERE filter from BIGINT to VARCHAR(128) destroys performance by expanding total scanned rows

993 Views Asked by At

Background

We have three tables in Redshift that we've been joining on a BIGINT column. For a variety of reasons we're being forced to change this key from an integer to a string. Once we do so, an important query we run often goes from resolving in approximately 15 seconds to taking over an hour.

Simplified Data Model

The three tables in play here are order_hours, locations, and events. A simplified version of the setup looks like this:

enter image description here

order_hours is just an ordered list of timestamps on the hour (2016-01-01 00:00:00, 2016-01-01 01:00:00, 2016-01-01 02:00:00, etc.). events belong to a location, and have a category, a rank, and start_at and end_at timestamps that connote arbitrary time intervals (each event is different, and could range from a few hours to a year or more).

The Query

As we attempt to transition from the core_id BIGINT unique id on locations to the id string, the performance of the following (simplified) query is being drastically impacted (again, from ~15 seconds to over an hour):

Old Query

WITH hourly_events_features AS (
    SELECT order_hour AS event_hour
        , core_location_id
        , location_id 
        , SUM(CASE WHEN category = 'community' THEN rank ELSE 0 END) AS community_rank_sum 
        , SUM(CASE WHEN category = 'public-holidays' THEN 1 ELSE 0 END) AS is_public_holidays
    FROM order_hours AS oh 
    INNER JOIN events AS e ON oh.order_hour BETWEEN e.start_at AND e.end_at
    WHERE e.status = 'active'
    GROUP BY order_hour, core_location_id, location_id
)

SELECT oh.order_hour
    , l.id AS location_id
    , l.core_id AS core_location_id
    , hef.community_rank_sum
    , hef.is_public_holidays
FROM order_hours AS oh
LEFT JOIN locations AS l on 1=1
LEFT JOIN hourly_events_features AS hef ON oh.order_hour = hef.event_hour AND l.core_id = hef.core_location_id
WHERE l.core_id = 1
AND order_hour <= (current_date + INTERVAL '30 day')::timestamp
ORDER BY oh.order_hour DESC

New Query

WITH hourly_events_features AS (
    SELECT order_hour AS event_hour
        , core_location_id
        , location_id 
        , SUM(CASE WHEN category = 'community' THEN rank ELSE 0 END) AS community_rank_sum 
        , SUM(CASE WHEN category = 'public-holidays' THEN 1 ELSE 0 END) AS is_public_holidays
    FROM order_hours AS oh 
    INNER JOIN events AS e ON oh.order_hour BETWEEN e.start_at AND e.end_at
    WHERE e.status = 'active'
    GROUP BY order_hour, location_id, core_location_id
)

SELECT oh.order_hour
    , l.id AS location_id
    , l.core_id AS core_location_id
    , hef.community_rank_sum
    , hef.is_public_holidays
FROM order_hours AS oh
LEFT JOIN locations AS l on 1=1
LEFT JOIN hourly_events_features AS hef ON oh.order_hour = hef.event_hour AND l.id = hef.location_id
WHERE l.id = 'randomstring'
AND order_hour <= (current_date + INTERVAL '30 day')::timestamp
ORDER BY oh.order_hour DESC

The Query Plan

When I run EXPLAIN on these two queries, the query plan is identical, save for one crucial difference. In the old query with JOIN and WHERE using the integer column, The WHERE l.core_id = 1 is hoisted up by the planner and also applied to the hourly_events_features CTE. In the new query's plan, this doesn't happen.

Old Plan

XN Merge  (cost=1164669244983.22..1164669245376.98 rows=157505 width=51)
  Merge Key: oh.order_hour
  ->  XN Network  (cost=1164669244983.22..1164669245376.98 rows=157505 width=51)
        Send to leader
        ->  XN Sort  (cost=1164669244983.22..1164669245376.98 rows=157505 width=51)
              Sort Key: oh.order_hour
              ->  XN Hash Right Join DS_DIST_OUTER  (cost=7164628982.54..164669231386.57 rows=157505 width=51)
                    Outer Dist Key: hef.event_hour
                    Hash Cond: (("outer".event_hour = "inner".order_hour) AND ("outer".core_location_id = "inner".core_id))
                    ->  XN Subquery Scan hef  (cost=7163906154.11..7163937655.01 rows=1575045 width=32)
                          ->  XN HashAggregate  (cost=7163906154.11..7163921904.56 rows=1575045 width=58)
                                ->  XN Nested Loop DS_BCAST_INNER  (cost=706003.19..7158174163.01 rows=458559288 width=58)
                                      Join Filter: (("outer".start_at <= "inner".order_hour) AND ("outer".end_at >= "inner".order_hour))
                                      ->  XN Seq Scan on events e  (cost=0.00..212320.02 rows=93609 width=66)
                                            Filter: ((core_location_id = 1) AND ((status)::text = 'active'::text))
                                      ->  XN Materialize  (cost=706003.19..706444.07 rows=44088 width=8)
                                            ->  XN Seq Scan on order_hours oh  (cost=0.00..551.10 rows=44088 width=8)
                                                  Filter: (order_hour <= '2020-12-17 00:00:00'::timestamp without time zone)
                    ->  XN Hash  (cost=722387.56..722387.56 rows=88176 width=35)
                          ->  XN Nested Loop DS_BCAST_INNER  (cost=72.94..722387.56 rows=88176 width=35)
                                ->  XN Seq Scan on order_hours oh  (cost=0.00..551.10 rows=44088 width=8)
                                      Filter: (order_hour <= '2020-12-17 00:00:00'::timestamp without time zone)
                                ->  XN Materialize  (cost=72.94..72.96 rows=2 width=27)
                                      ->  XN Seq Scan on locations l  (cost=0.00..0.94 rows=2 width=27)
                                            Filter: (core_id = 1)

New Plan

XN Merge  (cost=93130432391103.72..93130432428854.92 rows=15100480 width=51)
  Merge Key: oh.order_hour
  ->  XN Network  (cost=93130432391103.72..93130432428854.92 rows=15100480 width=51)
        Send to leader
        ->  XN Sort  (cost=93130432391103.72..93130432428854.92 rows=15100480 width=51)
              Sort Key: oh.order_hour
              ->  XN Hash Right Join DS_DIST_OUTER  (cost=17497042702.35..92130430590515.61 rows=15100480 width=51)
                    Outer Dist Key: hef.event_hour
                    Hash Cond: (("outer".event_hour = "inner".order_hour) AND (("outer".location_id)::text = ("inner".id)::text))
                    ->  XN Subquery Scan hef  (cost=17496319873.91..17499339969.77 rows=151004793 width=234)
                          ->  XN HashAggregate  (cost=17496319873.91..17497829921.84 rows=151004793 width=58)
                                ->  XN Nested Loop DS_BCAST_INNER  (cost=706003.19..16946774771.94 rows=43963608158 width=58)
                                      Join Filter: (("outer".start_at <= "inner".order_hour) AND ("outer".end_at >= "inner".order_hour))
                                      ->  XN Seq Scan on events e  (cost=0.00..176933.35 rows=8974607 width=66)
                                            Filter: ((status)::text = 'active'::text)
                                      ->  XN Materialize  (cost=706003.19..706444.07 rows=44088 width=8)
                                            ->  XN Seq Scan on order_hours oh  (cost=0.00..551.10 rows=44088 width=8)
                                                  Filter: (order_hour <= '2020-12-17 00:00:00'::timestamp without time zone)
                    ->  XN Hash  (cost=722387.56..722387.56 rows=88176 width=35)
                          ->  XN Nested Loop DS_BCAST_INNER  (cost=72.94..722387.56 rows=88176 width=35)
                                ->  XN Seq Scan on order_hours oh  (cost=0.00..551.10 rows=44088 width=8)
                                      Filter: (order_hour <= '2020-12-17 00:00:00'::timestamp without time zone)
                                ->  XN Materialize  (cost=72.94..72.96 rows=2 width=27)
                                      ->  XN Seq Scan on locations l  (cost=0.00..0.94 rows=2 width=27)
                                            Filter: ((id)::text = 'randomstring'::text)

The key difference is the line Filter: ((core_location_id = 1) AND ((status)::text = 'active'::text)), which in the new query doesn't get the final filter applied Filter: ((status)::text = 'active'::text). This affects the total number of rows scanned.

Dist and Sort Keys

I've tried a number of permutations of dist and sort keys, none of which have worked to improve the performance at all. For the example query plans above, these were the dist and sort keys:

  • Locations - Dist: id, Sort: id
  • Order Hours - Dist: order_hour, Sort: order_hour
  • Events - Dist: location_id, Sort: [start_at, end_at]

These were the dist and sort keys for both queries, so it doesn't appear that the old query is even getting helped out by dist/sort keys at all.

Question

Is there some combination of dist and sort keys I haven't tried, or some other trick to get the new query to perform like the old query? Is there some way to re-write the query, or some other Redshift tuning I can do to make this work?

Thanks!

1

There are 1 best solutions below

0
Doug Mill On

Thanks to a comment from @johnrotenstein, the following works:

WITH hourly_events_features AS (
    SELECT order_hour AS event_hour
        , core_location_id
        , location_id 
        , SUM(CASE WHEN category = 'community' THEN rank ELSE 0 END) AS community_rank_sum 
        , SUM(CASE WHEN category = 'public-holidays' THEN 1 ELSE 0 END) AS is_public_holidays
    FROM doug.order_hours AS oh 
    INNER JOIN doug.events AS e ON oh.order_hour BETWEEN e.start_at AND e.end_at
    WHERE e.status = 'active'
    GROUP BY order_hour, location_id, core_location_id
)

SELECT oh.order_hour
    , hef.location_id AS location_id
    , l.core_id AS core_location_id
    , hef.community_rank_sum
    , hef.is_public_holidays
FROM doug.order_hours AS oh
LEFT JOIN doug.locations AS l on 1=1
LEFT JOIN hourly_events_features AS hef ON oh.order_hour = hef.event_hour AND l.id = hef.location_id
WHERE hef.location_id = 'random-string'
AND order_hour <= (current_date + INTERVAL '30 day')::timestamp
ORDER BY oh.order_hour DESC

Apparently applying the WHERE on the location_id column from the expensive CTE rather than from the locations table itself, the planner gets the hint that it can apply the filter to the result set from the CTE as well as the overall result set, like it did automatically with the integer column. Performance is just a few seconds off with this solution from what we were getting with the integer join and filter, which is to be expected given the added size and computation required for string comparison.

The final plan:

XN Merge  (cost=1014169098190.65..1014169099868.39 rows=671097 width=242)
  Merge Key: oh.order_hour
  ->  XN Network  (cost=1014169098190.65..1014169099868.39 rows=671097 width=242)
        Send to leader
        ->  XN Sort  (cost=1014169098190.65..1014169099868.39 rows=671097 width=242)
              Sort Key: oh.order_hour
              ->  XN Hash Join DS_BCAST_INNER  (cost=7110469834.16..14169033241.34 rows=671097 width=242)
                    Hash Cond: (("outer".location_id)::text = ("inner".universal_id)::text)
                    ->  XN Hash Join DS_BCAST_INNER  (cost=7110469833.22..14168298979.57 rows=335549 width=234)
                          Hash Cond: ("outer".event_hour = "inner".order_hour)
                          ->  XN Subquery Scan hef  (cost=7110469171.56..7110509658.24 rows=2024334 width=234)
                                ->  XN HashAggregate  (cost=7110469171.56..7110489414.90 rows=2024334 width=249)
                                      ->  XN Nested Loop DS_BCAST_INNER  (cost=706371.49..7107743356.81 rows=218065180 width=249)
                                            Join Filter: (("outer".start_at <= "inner".order_hour) AND ("outer".end_at >= "inner".order_hour))
                                            ->  XN Seq Scan on pipeline_events  (cost=0.00..212320.02 rows=44492 width=257)
                                                  Filter: (((location_universal_id)::text = 'random-string'::text) AND ((status)::text = 'active'::text))
                                            ->  XN Materialize  (cost=706371.49..706812.60 rows=44111 width=8)
                                                  ->  XN Seq Scan on order_hours oh  (cost=0.00..551.38 rows=44111 width=8)
                                                        Filter: (order_hour <= '2020-12-18 00:00:00'::timestamp without time zone)
                          ->  XN Hash  (cost=551.38..551.38 rows=44111 width=8)
                                ->  XN Seq Scan on order_hours oh  (cost=0.00..551.38 rows=44111 width=8)
                                      Filter: (order_hour <= '2020-12-18 00:00:00'::timestamp without time zone)
                    ->  XN Hash  (cost=0.94..0.94 rows=2 width=27)
                          ->  XN Seq Scan on pipeline_locations  (cost=0.00..0.94 rows=2 width=27)
                                Filter: ('random-string'::text = (universal_id)::text)