How to find polygons that contain a given point in Elasticsearch


I need to query an index of around 50k terrain polygons (stored as geo_shape polygons in Elasticsearch): given a point, the query should return every polygon that contains it.

I managed to do it using percolate queries (example below) but I read somewhere that percolate queries don't scale well.

Is there a more efficient way to achieve this behavior?

Example using percolate:

Demo polygons

PUT geo_demo
{
  "mappings": {
    "properties": {
      "thepoly": {
        "type": "percolator"
      },
      "thepoint": {
        "type": "geo_point"
      }
    }
  }
}

#region 1 (red)
POST /geo_demo/_doc/1
{
  "thepoly": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_polygon": {
          "thepoint": {
            "points": [
              "-23.573978,-46.664806",
              "-23.583978,-46.664806",
              "-23.583978,-46.658806",
              "-23.573978,-46.658806",
              "-23.573978,-46.664806"
            ]
          }
        }
      }
    }
  }
}

#region 2 (green)
POST /geo_demo/_doc/2
{
  "thepoly": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_polygon": {
          "thepoint": {
            "points": [
              "-23.579978,-46.664806",
              "-23.583978,-46.664806",
              "-23.583978,-46.652806",
              "-23.579978,-46.652806",
              "-23.579978,-46.664806"
            ]
          }
        }
      }
    }
  }
}

#should match doc/1 only
GET /geo_demo/_search
{
  "query": {
    "percolate": {
      "field": "thepoly",
      "document": {
        "thepoint": "-23.577007,-46.661811"
      }
    }
  }
}

#should match both doc/1 and doc/2
GET /geo_demo/_search
{
  "query": {
    "percolate": {
      "field": "thepoly",
      "document": {
        "thepoint": "-23.582002,-46.661811"
      }
    }
  }
}

#should match doc/2 only
GET /geo_demo/_search
{
  "query": {
    "percolate": {
      "field": "thepoly",
      "document": {
        "thepoint": "-23.582041,-46.655717"
      }
    }
  }
}

#should match none
GET /geo_demo/_search
{
  "query": {
    "percolate": {
      "field": "thepoly",
      "document": {
        "thepoint": "-23.576771,-46.655674"
      }
    }
  }
}

There is 1 best solution below

fast tooth On

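You almost don't need Elasticsearch for this, unless you have a strong reason to use it.

With only 50K polygons, you can easily hold them all in heap, or decompose each polygon into a list of geohashes.

You can keep an in-heap map with the geohash as the key and the IDs of the polygons covering that cell as the value (polygons can overlap, so one cell may map to several IDs).

When a point comes in, you first compute its geohash, then use Map#get to check whether that geohash is in the map and, if it is, which polygons contain the point. Cells on a polygon's edge are only partly inside it, so for an exact answer run a point-in-polygon test on the candidates the map returns.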
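To make the idea concrete, here is a minimal sketch in Python (a dict plays the role of the Java Map). It is only an illustration under some assumptions, not a tuned implementation: the function names, the precision of 7, and the choice to cover each polygon's bounding box rather than its exact outline are all mine, and it ignores polygons with holes or ones that cross the antimeridian. The demo polygons are the two regions from the question, as (lat, lon) pairs.

```python
import math
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=7):
    """Standard geohash: interleave longitude/latitude bits, 5 bits per character."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = value * 2 + int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)

def point_in_polygon(lat, lon, ring):
    """Ray-casting test; ring is a list of (lat, lon) vertices."""
    inside = False
    for i in range(len(ring)):
        lat1, lon1 = ring[i]
        lat2, lon2 = ring[(i + 1) % len(ring)]
        if (lat1 > lat) != (lat2 > lat):
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside

def _samples(lo, hi, step):
    """Values from lo to hi spaced by step, always ending exactly at hi."""
    count = int(math.ceil((hi - lo) / step))
    return [min(lo + i * step, hi) for i in range(count + 1)]

def covering_cells(ring, precision):
    """Geohash cells touching the polygon's bounding box (an over-approximation)."""
    total_bits = 5 * precision
    cell_w = 360.0 / (1 << ((total_bits + 1) // 2))  # longitude gets the extra bit
    cell_h = 180.0 / (1 << (total_bits // 2))
    lats = [p[0] for p in ring]
    lons = [p[1] for p in ring]
    # Sample at half-cell steps so rounding can never skip a cell.
    return {
        geohash_encode(la, lo, precision)
        for la in _samples(min(lats), max(lats), cell_h / 2)
        for lo in _samples(min(lons), max(lons), cell_w / 2)
    }

def build_index(polygons, precision=7):
    """Map geohash cell -> list of polygon ids whose bounding box touches it."""
    index = defaultdict(list)
    for poly_id, ring in polygons.items():
        for cell in covering_cells(ring, precision):
            index[cell].append(poly_id)
    return index

def polygons_containing(lat, lon, polygons, index, precision=7):
    """Look up candidates by geohash, then confirm each with an exact test."""
    candidates = index.get(geohash_encode(lat, lon, precision), [])
    return sorted(p for p in candidates if point_in_polygon(lat, lon, polygons[p]))

# The two demo regions from the question.
POLYGONS = {
    1: [(-23.573978, -46.664806), (-23.583978, -46.664806),
        (-23.583978, -46.658806), (-23.573978, -46.658806)],
    2: [(-23.579978, -46.664806), (-23.583978, -46.664806),
        (-23.583978, -46.652806), (-23.579978, -46.652806)],
}
INDEX = build_index(POLYGONS)

for point in [(-23.577007, -46.661811), (-23.582002, -46.661811),
              (-23.582041, -46.655717), (-23.576771, -46.655674)]:
    print(point, polygons_containing(*point, POLYGONS, INDEX))
```

The map lookup only narrows things down to a handful of candidates; the ray-casting check is what makes the result exact. A higher precision means fewer false candidates per cell but more keys in the map, so the right value depends on how big your polygons are.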