Can one define a pg_hint_plan with two join directions?

29 Views Asked by At

There are several large tables involved in a search query. I'm responsible for the operation of the database but sadly can't influence the incoming queries from the application. Postgres misjudges the selectivity of these large tables and produces suboptimal execution plans. I was able to resolve an issue with a different query using pg_hint_plan but have problems with this one.

There is a search for a title of a product, or the German translation of the title, or the English translation, but only if the German translation is not available.

The query looks like this:


select *
from product
left outer join translation as t on
(product.textnr = t.textnr and 
    (
        (t.language = 'DE' and t.text <> '')
        or 
        (t.language = 'EN' and t.text <> '' and not exists ( 
            select text
            from translation
            where translation.textnr = product.textnr
            and (translation.language = 'DE')
            and translation.text <> ''))
    )
)
where (product.title = 'query' or translation.text like 'query%')

This query often takes around 40 seconds.

My problem is that:

  • If only the product tile restriction is present (product.title = 'query'), I can get the query fast (<100ms) by filtering on product table an join the translation later on (/*+ Leading((product t)) NestLoop(product t) */)
  • If only the translation restriction is present (translation.text like 'query%'), I can get the query fast (~100ms) by filtering the other way round (/*+ Leading((t product )) NestLoop(t product) */)
  • But if both are present in the where clause, I don't have a starting point because of the full outer join on textnr.

Is there a way to tell postgres/pg_hint_plan to do like a Hash Join of two different joins of the same base tables? Something in the line of:

/*+ Leading((t product ) (product t)) HashJoin ( NestLoop(t product) NestLoop(product t) )*/

If I try to use the same table name twice in the Leading hint there is an error message: pg_hint_plan: hint syntax error at or near "(product t))"

edit: execution plan

QUERY PLAN                                                                                                                                                                                                                                                     |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Hash Join  (cost=1693149.03..35181808.38 rows=25422 width=187) (actual time=19191.927..45245.126 rows=28 loops=1)                                                                                                                                              |
  Output: product.textnr, katalog[...]
  Hash Cond: ((katalog.[some additional table ...]
  Buffers: shared hit=25454832 read=160622, temp read=26535 written=26535                                                                                                                                                                                      |
  I/O Timings: shared/local read=501.711, temp read=45.572 write=184.356                                                                                                                                                                                       |
  ->  Nested Loop  (cost=1689535.09..35177844.89 rows=25422 width=187) (actual time=13131.144..45210.910 rows=64 loops=1)                                                                                                                                      |
        Output: product.textnr, katalog[...]
        Buffers: shared hit=25454799 read=159756, temp read=26535 written=26535                                                                                                                                                                                |
        I/O Timings: shared/local read=497.810, temp read=45.572 write=184.356                                                                                                                                                                                 |
        ->  Hash Left Join  (cost=1689534.65..35132069.03 rows=18000 width=69) (actual time=13130.004..45153.182 rows=89 loops=1)                                                                                                                              |
              Output: product.textnr, product....
              Hash Cond: ((product.textnr)::text = (t.textnr)::text)                                                                                                                                                                                          |
              Join Filter: (((t.language)::text = 'DE'::text) OR (((t.language)::text = 'EN'::text) AND (NOT (SubPlan 1))))                                                                                                                                  |
              Rows Removed by Join Filter: 3998465                                                                                                                                                                                                             |
              Filter: (((product.title)::text = 'query'::text) OR (t.text)::text) ~~ 'query%'::text))                                                                                                                                             |
              Rows Removed by Filter: 8088550                                                                                                                                                                                                                  |
              Buffers: shared hit=25454564 read=159664, temp read=26535 written=26535                                                                                                                                                                          |
              I/O Timings: shared/local read=440.991, temp read=45.572 write=184.356                                                                                                                                                                           |
              ->  Seq Scan on public.product  (cost=0.00..375433.59 rows=8119159 width=69) (actual time=0.006..3219.008 rows=8088639 loops=1)                                                                                                                      |
                    Output: product.title, ...
                    Buffers: shared hit=179827 read=114415                                                                                                                                                                                                     |
                    I/O Timings: shared/local read=322.814                                                                                                                                                                                                     |
              ->  Hash  (cost=1595405.93..1595405.93 rows=4049898 width=63) (actual time=8272.352..8272.353 rows=1628548 loops=1)                                                                                                                              |
                    Output: t.textnr, t.language, t.text                                                                                                                                                                                              |
                    Buckets: 1048576  Batches: 8  Memory Usage: 24151kB                                                                                                                                                                                        |
                    Buffers: shared hit=1278468 read=44483, temp written=11703                                                                                                                                                                                 |
                    I/O Timings: shared/local read=102.369, temp write=71.090                                                                                                                                                                                  |
                    ->  Index Scan using translation2 on public.translation t  (cost=0.69..1595405.93 rows=4049898 width=63) (actual time=326.774..7775.822 rows=1628548 loops=1)                                                                                     |
                          Output: t.textnr, t.language, t.text                                                                                                                                                                                        |
                          Index Cond: [additional join condition]                                                                                                                                                                               |
                          Filter: ((t.text::text) <> ''::text) AND (((t.language)::text = 'DE'::text) OR ((t.language)::text = 'EN'::text)))                                                                                                    |
                          Rows Removed by Filter: 5377105                                                                                                                                                                                                      |
                          Buffers: shared hit=1278468 read=44483                                                                                                                                                                                               |
                          I/O Timings: shared/local read=102.369                                                                                                                                                                                               |
              SubPlan 1                                                                                                                                                                                                                                        |
                ->  Index Scan using translation_pkey on public.translation  (cost=0.69..3.42 rows=2 width=0) (actual time=0.007..0.007 rows=1 loops=3998465)                                                                                                          |
                      Index Cond: ((([additional join condition]) AND ((translation.textnr)::text = (product.textnr)::text) AND ((translation.language)::text = 'DE'::text))                                                                          |
                      Filter: (upper((translation.text)::text) <> ''::text)                                                                                                                                                                                    |
                      Buffers: shared hit=23993142 read=766                                                                                                                                                                                                    |
                      I/O Timings: shared/local read=15.808                                                                                                                                                                                                    |
        ->  Index Scan using katalog3 on public.katalog  (cost=0.43..2.14 rows=40 width=132) (actual time=0.565..0.646 rows=1 loops=89)                                                                                                                        |
              Output: katalog.[...]|
              Index Cond: [...]                                                                                                                                   |
              Buffers: shared hit=235 read=92                                                                                                                                                                                                                  |
              I/O Timings: shared/local read=56.818                                                                                                                                                                                                            |
  ->  Hash  (cost=2105.64..2105.64 rows=120664 width=17) (actual time=34.023..34.024 rows=120169 loops=1)                                                                                                                                                      |
        Output: [additional data table colums...]                                                                                                                                                                                                               |
        Buckets: 131072  Batches: 1  Memory Usage: 6820kB                                                                                                                                                                                                      |
        Buffers: shared hit=33 read=866                                                                                                                                                                                                                        |
        I/O Timings: shared/local read=3.901                                                                                                                                                                                                                   |
        ->  Seq Scan on public.[additional data table]  (cost=0.00..2105.64 rows=120664 width=17) (actual time=0.861..13.874 rows=120169 loops=1)                                                                                                                   |
              Output: [additional data table colums...]                                                                                                                                                                                                         |
              Buffers: shared hit=33 read=866                                                                                                                                                                                                                  |
              I/O Timings: shared/local read=3.901                                                                                                                                                                                                             |
Settings: effective_cache_size = '16094504kB', jit = 'off', random_page_cost = '1.1', temp_buffers = '64MB', work_mem = '32MB'                                                                                                                                 |
Query Identifier: 703631274929839679                                                                                                                                                                                                                           |
Planning:                                                                                                                                                                                                                                                      |
  Buffers: shared hit=63                                                                                                                                                                                                                                       |
Planning Time: 1.220 ms                                                                                                                                                                                                                                        |
Execution Time: 45245.280 ms                                                                                                                                                                                                                                   |
0

There are 0 best solutions below