How to select all groups of intersecting polygons from a single table using PostGIS

39 Views Asked by At

Given a table of groups of intersecting polygons as shown.enter image description here(https://i.stack.imgur.com/hlXs1.png) How can we select all the groups of polygons that intersect?

The following query, using recursion, gets some of the way there but it duplicates all the groups and groups with multiple overlaps duplicate polygons within the group.

WITH RECURSIVE cte AS (

     SELECT id AS root_id,
            id, ARRAY[id] AS path,
            geom
       FROM polygons

      UNION ALL
     SELECT cte.root_id,
            t.id, path || t.id,
            t.geom
       FROM polygons AS t,
            cte
      WHERE t.id <> ALL(cte.path)
        AND ST_Intersects(t.geom, cte.geom) 
)

SELECT root_id, ARRAY_AG(id)
  FROM cte
 GROUP BY root_id
 ORDER BY root_id

The resulting selection looks like this:

root_id array_agg
1 1,2
2 2,1
3 3,4,5,6
4 4,3,5,6
5 5,4,6,3
6 6,5,4,3
7 7,8,9,9,8
8 8,7,9,9,7
9 9,7,8,8,7
10 10

As you will notice, selections with root_id 1 and 2 contain the same polygons and root_id 7, 8 and 9 all duplicate polygons within the group.

1

There are 1 best solutions below

0
Zegarek On BEST ANSWER

There's a function for exactly that, named exactly like that: ST_ClusterIntersectingWin(). Added in PostGIS 3.4.0.
Demo at db<>fiddle:

enter image description here

explain analyze verbose 
create table my_shapes_clustered2 as
select 1+ST_ClusterIntersectingWin(geom)over() as cluster_number, *
from my_shapes;
QUERY PLAN
WindowAgg (cost=0.00..15091.00 rows=1200 width=391) (actual time=5.639..5.948 rows=1200 loops=1)
  Output: (1 + st_clusterintersectingwin(geom) OVER (?)), id, initial_seed, geom
  -> Seq Scan on public.my_shapes (cost=0.00..76.00 rows=1200 width=387) (actual time=0.006..0.215 rows=1200 loops=1)
        Output: geom, id, initial_seed
Planning Time: 0.063 ms
Execution Time: 8.455 ms

Inspecting some samples:

select cluster_number,st_area(st_union(geom)),count(*),array_agg(id)
from my_shapes_clustered2 
group by cluster_number 
order by cluster_number 
limit 8;
cluster_number st_area count array_agg
1 836.248837776196 2 {798,1}
2 651.4061064388445 1 {2}
3 53.56050595625381 1 {3}
4 269.4408305746047 1 {4}
5 1596.5821681225993 4 {507,5,676,72}
6 1262.1100035153077 3 {186,296,6}
7 2008.7632476757103 4 {474,891,7,1121}
8 66.289777695502 1 {8}

In PostGIS 2.3.0+ you can get the same result, although slightly slower, using the ST_ClusterDBSCAN(geom,0,1) function suggested by @JGH:

explain analyze verbose 
create table my_shapes_clustered3 as
select 1+ST_ClusterDBSCAN(geom,0,1)over() as cluster_number, *
from my_shapes;
QUERY PLAN
WindowAgg (cost=0.00..15091.00 rows=1200 width=391) (actual time=17.688..18.018 rows=1200 loops=1)
  Output: (1 + st_clusterdbscan(geom, '0'::double precision, 1) OVER (?)), id, initial_seed, geom
  -> Seq Scan on public.my_shapes (cost=0.00..76.00 rows=1200 width=387) (actual time=0.005..0.218 rows=1200 loops=1)
        Output: geom, id, initial_seed
Planning Time: 0.041 ms
Execution Time: 21.007 ms

In 2.2.0 there's a bit more clunky ST_ClusterIntersecting() that's spitting out collections.