"lowered", "highway"=>"crossing", "cr" /> "lowered", "highway"=>"crossing", "cr" /> "lowered", "highway"=>"crossing", "cr"/>

Explode hstore into new columns within row

97 Views Asked by At

I have the following table in PostgreSQL. The hstore extension is activated.

osm_id hstore_column
172544 "kerb"=>"lowered", "highway"=>"crossing", "crossing"=>"marked", "crossing_ref"=>"zebra", "tactile_paving"=>"yes", "crossing_island"=>"yes", "crossing_markings"=>"zebra"
172548 "bus"=>"yes", "name"=>"Londoner Straße", "ref_BVG"=>"103010", "website"=>"https_//qr.bvg.de/h103010", "public_transport"=>"stop_position"
458330 "ref"=>"7", "name"=>"Kaiserdamm", "highway"=>"motorway_junction", "destination"=>"Kaiserdamm"

I want to explode the hstore_column into columns for each key.

Table header should turn into

osm_id kerb highway crossing crossing_ref tactile crossing_island crossing_markings bus name ref_BVG website public_transport ref destination
172544 lowered crossing marked zebra yes yes zebra

This related answer requires that keys are known.
The issue is that the set of keys is unknown beforehand.

In this related answer the given function has problems with key words appearing as tags in the hstore column.

Is there a way to dynamically explode hstore keys into single columns within the same table? Only solution I know is in Python, but in-database-processing is more suitable.

1

There are 1 best solutions below

8
Erwin Brandstetter On

The issue is that the set of keys is unknown beforehand.

So there is no solution with a single statement (single round trip to the server). We need a two-step flow. Like:

Step 1: determine the set of keys, and create a temporary type from it dynamically. (There are various other ways to channel the gathered information, this just seems most convenient):

DO
$do$
BEGIN
   DROP TYPE IF EXISTS pg_temp.osm_exploded;
   
   EXECUTE (
   SELECT format('CREATE TYPE pg_temp.osm_exploded AS (%s)'
               , string_agg(quote_ident(key) || ' text', ', '))
   FROM (
      SELECT DISTINCT key
      FROM   tbl, skeys(hstore_column) key
      ORDER  BY key
      ) sub
   );
END   
$do$;

Now the target structure is known, and the rest is easy.

The "trick" with a temporary type is similar to the one with a temporary function. (Bergi's improvement.) See:

Be aware that Postgres allows a maximum of 1600 columns per table row (or less if some other limit is exceeded). The same limit applies to the number of columns in composite types.

Step 2: Simple query with populate_record():

SELECT t.osm_id, o.*
FROM   tbl t, populate_record(null::pg_temp.osm_exploded, t.hstore_column) o;

fiddle

The implicit CROSS JOIN LATERAL is safe because populate_record() always returns a row. (A set-returning function returning "no row" would eliminate the result row, and you might want to use LEFT JOIN ... ON true.)

Related answer with alternative, working solutions, one spelling out all keys in the SELECT list, one using crosstab() from the additional module tablefunc: