Postgresql planner won't select index for 'NOT' queries

152 Views Asked by At

[Title updated to reflect updates in description]

I am running Postgresql 9.6

I have a complex query that isn't using the indexes that I expect, when I break it down to this small example I am lost as to why the index isn't being used.

These examples run on a table with 1 million records, and currently all records have the value 'COMPLETED' for column state. State is a text column and I have a btree index on it.

The following query uses my index as I'd expect:

explain analyze
SELECT * FROM(
    SELECT
    q.state = 'COMPLETED'::text AS completed_successfully
    FROM request.request q
) a where NOT completed_successfully;

V

                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using request_state_index on request q  (cost=0.43..88162.19 rows=11200 width=1) (actual time=200.554..200.554 rows=0 loops=1)
   Filter: (state <> 'COMPLETED'::text)
   Rows Removed by Filter: 1050005
   Heap Fetches: 198150
 Planning time: 0.272 ms
 Execution time: 200.579 ms
(6 rows)

But if I add anything else to the select that references my table, then the planner chooses to do a sequential scan instead.

explain analyze
SELECT * FROM(
    SELECT
    q.state = 'COMPLETED'::text AS completed_successfully,
    q.type
    FROM request.request q
) a where NOT completed_successfully;

V

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Seq Scan on request q  (cost=0.00..234196.06 rows=11200 width=8) (actual time=407.713..407.713 rows=0 loops=1)
   Filter: (state <> 'COMPLETED'::text)
   Rows Removed by Filter: 1050005
 Planning time: 0.113 ms
 Execution time: 407.733 ms
(5 rows)

Even this simpler example has the same issue.

Uses Index:

SELECT
q.state
FROM request.request q
WHERE q.state = 'COMPLETED';

Doesn't use Index:

SELECT
q.state, 
q.type
FROM request.request q
WHERE q.state = 'COMPLETED';

[UPDATE] I now understand (for this case) that the index it's using there is INDEX ONLY, and it would stop using that in this case because type isn't also in the index. So the question perhaps is why won't it use it in the 'Not' case below:

When I use a different value that isn't in the table, i knows to use the index (which makes sense):

SELECT
q.state, 
q.type
FROM request.request q
WHERE q.state = 'CREATED';

But if I not it, it doesn't:

SELECT
q.state, 
q.type
FROM request.request q
WHERE q.state != 'COMPLETED';

Why is my index not being used?

What can I do to ensure it gets used?

Most of the time, I expect nearly all the records in this table to be in one of many end states (using IN operator);. So when running my more complex query, I expect these records should be excluded from my more expensive part of the query early and quickly.

[UPDATES]

It looks like 'NOT' is not a supported B-Tree operation. I'll need some kind of unique approach: https://www.postgresql.org/docs/current/indexes-types.html#INDEXES-TYPES-BTREE

I tried adding the following partial indexes but they didn't seem to work:

CREATE INDEX request_incomplete_state_index ON request.request (state) WHERE state NOT IN('COMPLETED', 'FAILED', 'CANCELLED');
    
CREATE INDEX request_complete_state_index ON request.request (state) WHERE state IN('COMPLETED', 'FAILED', 'CANCELLED');

This partial index does work, but is not an ideal solution.

CREATE INDEX request_incomplete_state_exact_index ON request.request (state) WHERE state != 'COMPLETED';

explain analyze SELECT q.state, q.type FROM request.request q WHERE q.state != 'COMPLETED';

I also tried this expression index, while also not ideal also didn't work:

CREATE OR REPLACE FUNCTION request.request_is_done(in_state text)
 RETURNS BOOLEAN
 LANGUAGE sql
 STABLE
AS $function$
    SELECT in_state IN ('COMPLETED', 'FAILED', 'CANCELLED');
$function$
;

CREATE INDEX request_is_done_index ON request.request (request.request_is_done(state));

explain analyze select * from request.request q where NOT request.request_is_done(state);

Using a list (In Clause) of states with equals works. So I may have to figure out my larger query to just not use the NOT.

0

There are 0 best solutions below