I'm running frequent queries for blocks of land within a given latitude/longitude rectangle. The coordinates are stored as individual double precision values. I've created a single index covering both columns, and the current query, which returns 15,240 tiles, takes 0.10 seconds on my local machine.
At the moment there are 23 million rows in the table, but there will be around 800 million once the table is complete, so I expect this query to get much slower.
Here's the query I'm running, with example values:
SELECT * FROM territories
WHERE nwlat < 47.606977 AND nwlat > 47.506977
  AND nwlng < -122.232991 AND nwlng > -122.338991;
Is there a more efficient way of doing this? I'm fairly new to large databases, so any help is appreciated. FYI, I'm using PostgreSQL.
It would be much more efficient with a GiST or an SP-GiST index and a "box-contains-points" query ...
GiST index
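A sketch of the index definition described just below; the index name is illustrative (box() and point() are built-in immutable functions, so they are allowed in an index expression):

CREATE INDEX territories_box_gist_idx ON territories  -- illustrative name
USING gist (box(point(nwlat, nwlng), point(nwlat, nwlng)));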
The index is on a box with zero area, built from the same point (point(nwlat, nwlng)) twice. There is a related code example in the manual for CREATE INDEX.
Query with the "overlaps" operator &&:
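A sketch using the example values from the question; the box literal lists two opposite corners of the search rectangle:

SELECT * FROM territories
WHERE box(point(nwlat, nwlng), point(nwlat, nwlng))  -- matches the index expression
   && '(47.606977, -122.232991), (47.506977, -122.338991)'::box;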
SP-GiST index
Smaller index on just points:
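For instance (again with an illustrative index name):

CREATE INDEX territories_point_spgist_idx ON territories  -- illustrative name
USING spgist (point(nwlat, nwlng));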
Query with the contains operator @>:
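A sketch with the same example box:

SELECT * FROM territories
WHERE '(47.606977, -122.232991), (47.506977, -122.338991)'::box
   @> point(nwlat, nwlng);  -- box contains point

(@> and <@ are commutators, so the planner can use the point index with the box on either side of the operator.)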
I get fastest results for the SP-GiST index in a simple test with 1M rows on Postgres 9.6.1.
For more sophisticated needs, consider the PostGIS extension.