Geometry type MultiPolygon does not match column type Polygon


Using Python, SQLAlchemy and the psycopg2 driver, I am trying to go from a shapefile to a GeoDataFrame, and from there to a PostgreSQL database with PostGIS installed, following this question.

I converted the geometry to a WKB hex string, and using df.to_sql() imported the standard data frame successfully.

When I run the ALTER TABLE query, I get this error:

sqlalchemy.exc.DataError: (psycopg2.errors.InvalidParameterValue) Geometry type (MultiPolygon) does not match column type (Polygon)

There are 3 best solutions below

0
On BEST ANSWER

This happens because a polygon shapefile does not distinguish between Polygon and MultiPolygon, so any given row of the resulting GeoDataFrame can hold either type.

The geometry type is written into the Well-Known Binary (WKB) hex string, so when the text is converted back to geometry, every MultiPolygon row clashes with a column typed as Polygon.
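
To see where the type information lives, here is a minimal sketch that reads the geometry type out of a WKB hex string using only the standard library. The helper names (`wkb_geometry_type`, `polygon_wkb`, `multipolygon_wkb`) are made up for this illustration and are not part of Shapely or PostGIS; the sketch assumes plain 2D WKB and only masks off the EWKB flag bits rather than handling every variant.

```python
import struct

# Geometry type codes from the WKB header (2D variants only).
WKB_TYPES = {
    1: "Point", 2: "LineString", 3: "Polygon",
    4: "MultiPoint", 5: "MultiLineString", 6: "MultiPolygon",
    7: "GeometryCollection",
}

def wkb_geometry_type(wkb_hex):
    """Return the geometry type name stored in a WKB hex string's header."""
    raw = bytes.fromhex(wkb_hex)
    byte_order = "<" if raw[0] == 1 else ">"   # 1 = little endian, 0 = big endian
    (code,) = struct.unpack(byte_order + "I", raw[1:5])
    code &= 0x1FFFFFFF                          # drop EWKB Z/M/SRID flag bits
    return WKB_TYPES.get(code % 1000, "Unknown")  # % 1000 folds ISO Z/M codes

def polygon_wkb(ring):
    """Hand-build little-endian WKB for a polygon with one ring (illustration only)."""
    points = b"".join(struct.pack("<dd", x, y) for x, y in ring)
    return struct.pack("<BII", 1, 3, 1) + struct.pack("<I", len(ring)) + points

def multipolygon_wkb(rings):
    """Hand-build little-endian WKB for a multipolygon of single-ring polygons."""
    return struct.pack("<BII", 1, 6, len(rings)) + b"".join(polygon_wkb(r) for r in rings)

square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
shifted = [(x + 5, y) for x, y in square]

print(wkb_geometry_type(polygon_wkb(square).hex()))
print(wkb_geometry_type(multipolygon_wkb([square, shifted]).hex()))
```

Running a check like this over the hex column before loading would show which rows are going to be rejected by a Polygon-typed column.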

The explode function from mhweber's gist will fix that by breaking multipolygons into their component parts.

import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

def explode(indata):
    """Read a file and return one row per Polygon, splitting every MultiPolygon."""
    count_mp = 0
    indf = gpd.read_file(indata)
    rows = []
    for idx, row in indf.iterrows():
        if isinstance(row.geometry, Polygon):
            rows.append(row)
        elif isinstance(row.geometry, MultiPolygon):
            count_mp += 1
            # Iterate over .geoms: len() and indexing on multi-part
            # geometries no longer work in Shapely 2.0.
            for part in row.geometry.geoms:
                new_row = row.copy()  # attributes are duplicated for each part
                new_row['geometry'] = part
                rows.append(new_row)
    print("There were", count_mp, "Multipolygons found and exploded")
    # DataFrame.append was removed in pandas 2.0, so build the frame in one step.
    outdf = gpd.GeoDataFrame(rows, columns=indf.columns, geometry='geometry', crs=indf.crs)
    return outdf.reset_index(drop=True)

I've added a side effect to print the number of multipolygons found.

You should probably inspect those rows to make sure the explode function isn't breaking relationships that you need.

0
On

For people still coming back to this post: the explode method has been part of the GeoPandas API for some time now:

gpd.GeoDataFrame.explode()
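
As a rough usage sketch of the built-in method: the tiny GeoDataFrame below is invented for illustration (column name, coordinates and CRS are all assumptions), and the call assumes a GeoPandas version that accepts the `index_parts` and `ignore_index` keywords.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Made-up sample data: one plain Polygon and one two-part MultiPolygon.
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
far_square = Polygon([(5, 0), (6, 0), (6, 1), (5, 1)])
gdf = gpd.GeoDataFrame(
    {"name": ["single", "multi"]},
    geometry=[square, MultiPolygon([square, far_square])],
    crs="EPSG:4326",
)

# Each part of a multi-part geometry becomes its own row; the other
# columns ("name" here) are repeated for every part.
exploded = gdf.explode(index_parts=False, ignore_index=True)

print(exploded.geom_type.tolist())
print(exploded["name"].tolist())
```

After this step every row holds a single Polygon, which is what a Polygon-typed PostGIS column expects.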
1
On

Rewrote the function of @Hugh_Kelley to be faster.

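import geopandas as gpd
import pandas as pd
from shapely.geometry import MultiPolygon

def _explode(indf):
    count_mp = 0
    # Polygon rows pass through untouched; only the remaining rows are looped over.
    polys = indf[indf.geometry.geom_type == 'Polygon']
    rest = indf[indf.geometry.geom_type != 'Polygon']
    rows = []
    for idx, row in rest.iterrows():
        if isinstance(row.geometry, MultiPolygon):
            count_mp += 1
            # Iterate over .geoms: len() and indexing on multi-part
            # geometries no longer work in Shapely 2.0.
            for part in row.geometry.geoms:
                new_row = row.copy()
                new_row['geometry'] = part
                rows.append(new_row)
        else:
            print(row)  # neither Polygon nor MultiPolygon: reported and dropped
    print("There were", count_mp, "Multipolygons found and exploded")
    if not rows:
        return polys.reset_index(drop=True)
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end.
    exploded = gpd.GeoDataFrame(rows, columns=indf.columns, geometry='geometry', crs=indf.crs)
    return pd.concat([polys, exploded], ignore_index=True)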