How to avoid full table scan in mysql join query

90 Views Asked by At

Consider the following query:

     SELECT
      `banner`.`id`,
      `region`.*
      FROM
     `nms_section_region_banner` AS `section`
      JOIN `aw_rbslider_slide_region` AS `region`
      ON
      FIND_IN_SET(
          region.region_id,
          section.region_id
        ) <> 0
       JOIN `aw_rbslider_banner` AS `banner`
         ON
      `section`.`banner_id` = `banner`.`id`
      JOIN `aw_rbslider_slide_banner` AS `slide_banner`
         ON
      `slide_banner`.`banner_id` = `banner`.`id`
      JOIN `aw_rbslider_slide` AS `slide`
         ON
      `slide_banner`.`slide_id` = `slide`.`id` AND `slide`.`status` = 1
      JOIN `aw_rbslider_slide_store` AS `store`
         ON
      `slide`.`id` = `store`.`slide_id`
         WHERE
      `section`.`section_id` = '414' AND(
      `region`.`region_type` = '1' OR FIND_IN_SET('400020', region.region_code) <> 0 OR 
      FIND_IN_SET(
        'PANINDIABEAUTY',
        region.region_code
      ) <> 0 OR FIND_IN_SET(
        'PANINDIADIGITAL',
        region.region_code
      ) <> 0 OR FIND_IN_SET('6210', region.region_code) <> 0 OR FIND_IN_SET(
        'PANINDIAJEWEL',
        region.region_code
      ) <> 0 OR FIND_IN_SET('MH', region.region_code) <> 0 OR FIND_IN_SET('Mumbai', 
       region.region_code) <> 0
       ) AND(
       `slide`.`display_from` <= '2021-07-23 02:05:16' OR `slide`.`display_from` IS NULL OR 
    `slide`.`display_from` = '0000-00-00 00:00:00'
     ) AND(
     `slide`.`display_to` >= '2021-07-23 02:05:16' OR `slide`.`display_to` IS NULL OR 
    `slide`.`display_to` = '0000-00-00 00:00:00'
    ) AND(
    `store`.`store_id` = '0' OR `store`.`store_id` = '2'
   )
   GROUP BY
 `banner`.`id`
   ORDER BY
   FIELD(
    region.region_type,
    3,
    2,
    5,
    4,
    1
  )

Need to avoid the full table scan.

My query is being like,

Picture1 and picture 2 describes type, keys and possible keys information for the table

Can someone guide me to avoid full table scan on those 6 tables.

1

There are 1 best solutions below

0
DRapp On

First, a little cleanup so I can see and follow the hierarchy of your query and tables. Next, you are using a bunch of FIND_IN_SET() tests against the region code. From what this implies, your region code is a capacity of a long string of multiple values such that a region might be "MH, 400020, PANIDIAJEWEL, ETC", so you are looking for some "keyword" value within the region code. Is this accurate? -- OR -- does the region_code only have a single value. Please confirm.

With your join from section to region, they are both just "ID" keys, dont use Find_In_Set(), instead, direct equality. You can not optimize a join based on a function (hence my change) and MAY be a big issue on your query

For your group by, you originally had banner.id, but since that is already equal to section.banner_id via the join, and the section is the primary table, the index on section table can help optimize that grouping vs secondary table.

 SELECT
        section.banner_id id,
        region.*
    FROM
        nms_section_region_banner section
            JOIN aw_rbslider_slide_region region
                ON section.region_id = region.region_id

            JOIN aw_rbslider_banner banner
                ON section.banner_id = banner.id

            JOIN aw_rbslider_slide_banner slide_banner
                ON section.banner_id = slide_banner.banner_id

                JOIN aw_rbslider_slide slide
                    ON slide_banner.slide_id = slide.id 
                    AND slide.status = 1
            
                JOIN aw_rbslider_slide_store store
                    ON slide_banner.slide_id = store.slide_id
                    -- if IDs are integer, dont wrap in quotes
                    AND ( store.store_id in ( 0, 2 ) )
    WHERE
            -- dont use quotes if IDs are actually numbers
            section.section_id = 414 

        AND (   -- unsure if region_type is integer vs string...
                region.region_type = '1' 
            OR  FIND_IN_SET( '400020', region.region_code ) <> 0 
            OR  FIND_IN_SET( 'PANINDIABEAUTY', region.region_code ) <> 0 
            OR  FIND_IN_SET( 'PANINDIADIGITAL', region.region_code ) <> 0 
            OR  FIND_IN_SET( '6210', region.region_code) <> 0 
            OR  FIND_IN_SET( 'PANINDIAJEWEL', region.region_code ) <> 0 
            OR  FIND_IN_SET( 'MH', region.region_code) <> 0 
            OR  FIND_IN_SET( 'Mumbai', region.region_code) <> 0 ) 

        AND (  slide.display_from IS NULL 
            OR slide.display_from = '0000-00-00 00:00:00' 
            OR slide.display_from <= '2021-07-23 02:05:16' ) 

        AND (  slide.display_to IS NULL 
            OR slide.display_to = '0000-00-00 00:00:00' 
            OR slide.display_to >= '2021-07-23 02:05:16' )

    GROUP BY
        section.banner_id
    ORDER BY
        FIELD( region.region_type,
                3,
                2,
                5,
                4,
                1 )

To also help, I am sure indexes already exist on primary keys. But if you have compisite keys for the primary ID and the key to the next table, that can help. In addition, a covering index to include other fields used within where/group possibilities can help. I would try to have the following indexes.

Table                      Index
nms_section_region_banner  ( banner_id, region_id)  -- and in this specific order
aw_rbslider_slide_region   ( region_id, region_type, region_code )
aw_rbslider_slide_banner   ( banner_id, slide_id)
aw_rbslider_slide slide    ( id, status, display_from, display_to )
aw_rbslider_slide_store    ( slide_id, store_id )

Finally, your ORDER BY clause by doing the FIELD() function vs individually naming the field columns vs numbers. Having explicit field names from the region table is more explicit and readable