The problem I found that I cannot make Postgres use GIN index when I use jsonb_* functions in my queries. The problem exists for both jsonb_ops and jsonb_path_ops operator classes.
Let's first make preparations. Create the table
CREATE TABLE applications
(
id TEXT PRIMARY KEY,
application JSONB
);
CREATE INDEX ON applications USING gin (application jsonb_path_ops);
Let's fill the table with some bulk of data (to make Postgres use GIN index)
INSERT INTO applications(id, application)
VALUES ('1', '{
"type_code": 1,
"persons": [
{
"type_code": 4,
"firstname": "John",
"lastname": "Doe"
}
]
}');
INSERT INTO applications (SELECT i, a.application FROM applications a, generate_series(2, 100000) i);
Then I try to select the data like the following:
EXPLAIN ANALYZE
SELECT * FROM applications
WHERE applications.application @? '$.persons[*] ? (@.type_code == 3)';
Note that GIN index was used
-- Bitmap Heap Scan on applications (cost=64.00..68.01 rows=1 width=130) (actual time=0.410..0.419 rows=0 loops=1)
-- " Recheck Cond: (application @? '$.""persons""[*]?(@.""type_code"" == 3)'::jsonpath)"
-- -> Bitmap Index Scan on applications_application_idx (cost=0.00..64.00 rows=1 width=0) (actual time=0.095..0.096 rows=0 loops=1)
-- " Index Cond: (application @? '$.""persons""[*]?(@.""type_code"" == 3)'::jsonpath)"
-- Planning Time: 1.493 ms
-- Execution Time: 0.861 ms
Now I try to select the data like this:
EXPLAIN ANALYZE
SELECT * FROM applications
WHERE jsonb_path_exists(
applications.application,
'$.persons[*] ? (@.type_code == 3)'
);
You can see that GIN index was not used in this case:
-- Aggregate (cost=3374.33..3374.34 rows=1 width=8) (actual time=114.048..114.055 rows=1 loops=1)
-- -> Seq Scan on applications (cost=0.00..3291.00 rows=33333 width=0) (actual time=0.388..109.580 rows=100000 loops=1)
-- " Filter: jsonb_path_exists(application, '$.""persons""[*]?(@.""type_code"" == 3)'::jsonpath, '{}'::jsonb, false)"
-- Planning Time: 1.514 ms
-- Execution Time: 114.674 ms
Is it possible to make Postgres use GIN index in the second query?
Using jsonb_* functions is preferred for me because I can use positional parameters to build query:
SELECT * FROM applications
WHERE jsonb_path_exists(
applications.application,
'$.persons[*] ? (@.type_code == $person_type_code)',
jsonb_build_object('person_type_code', $1)
);
You can't*. PostgreSQL indexes are tied to operators in specific operator classes:
GIN will help you only if you use the operators in the opclass you used when you defined the index (
jsonb_opsby default):Even though there are equivalent
jsonb_path_X()functions that do the exact same thing those operators do, the index will only kick in if you use the operator and not the function.*Except you kind of can
There are cases like PostGIS where functions do in fact use the index but that's because they wrap an operator or add an operator-based condition that's using the index, then use the actual function to just re-check pre-filtered rows. You can mimmick that if you want: demo
You can see now it uses the index because the condition got rewritten as the operator it was wrapping. It finds 22219 matches because I increased the sample set to 200k and randomised the rows.