Is there a way to COPY the CSV file data directly into a JSON or JSONb array?
Example:
CREATE TABLE mytable (
id serial PRIMARY KEY,
info jSONb -- or JSON
);
COPY mytable(info) FROM '/tmp/myfile.csv' HEADER csv;
NOTE: each CSV line is mapped to a JSON array. It is a normal CSV.
Normal CSV (no JSON-embeded)... /tmp/myfile.csv
=
a,b,c
100,Mum,Dad
200,Hello,Bye
The correct COPY command must be equivalent to the usual copy bellow.
Usual COPY (ugly but works fine)
CREATE TEMPORARY TABLE temp1 (
a int, b text, c text
);
COPY temp1(a,b,c) FROM '/tmp/myfile.csv' HEADER csv;
INSERT INTO mytable(info) SELECT json_build_array(a,b,c) FROM temp1;
It is ugly because:
need the a priory knowledge about fields, and a previous
CREATE TABLE
with it.for "big data" need a big temporary table, so lost CPU, disk and my time — the table
mytable
have CHECKs and UNIQUEs constraints for each line.... Needs more than 1 SQL command.
Perfect solution!
Not need to know all the CSV columns, only extract what you know.
Use at SQL
CREATE EXTENSION PLpythonU;
: if the command produce an error like "could not open extension control file ... No such file" you need to install pg-py extra-packages. In standard UBUNTU (16 LTS) is simple,apt install postgresql-contrib postgresql-plpython
.The split_csv() function was defined here. The
csv.reader
is very reliable (!).Not tested for big-big CSV... But expected Python do job.
PostgreSQL workaround
It is not a perfect solution, but it solves the main problem, that is the
This is the way we do it, a workaround with
file_fdw
!Adopt your conventions to avoid file-copy and file-permission confusions... The standard file path for a CSV. Example:
/tmp/pg_myPrj_file.csv
Initialise your database or SQL script with the magic extension,
For each CSV file,
myNewData.csv
,3.1. make a symbolic link (or
scp
remote copy) for your new fileln -sf $PWD/myNewData.csv /tmp/pg_socKer_file.csv
3.2. configure the file_fdw for your new table (suppose
mytable
).PS: after running SQL script with
psql
, when having some permission problem, change owner of the link bysudo chown -h postgres:postgres /tmp/pg_socKer_file.csv
.3.3. use the file_fdw table as source (suppose populating
mytable
).Thanks to @JosMac (and his tutorial)!
NOTE: if there is a STDIN way to do it (exists??), will be easy, avoiding permission problems and use of absolute paths. See this answer/discussion.