TigerGraph Data Load using HEADER and INPUT_LINE_FILTER

328 Views Asked by At

I am trying to create a TigerGraph loading job to use column names from the header line instead of positional parameters like $1, $2 etc. In order to use files external to TigerGraph server, I believe we need to specify the HEADER and INPUT_LINE_FILTER and use them in the loading job. However, I am unable to find any examples in TigerGraph documentation. My attempt below is now working.

Any help is appreciated.

result = conn.gsql('''
USE GRAPH MyGraphCopy 

BEGIN
CREATE LOADING JOB load_persons_byname FOR GRAPH MyGraphCopy {
      DEFINE FILENAME MyDataSource;
      DEFINE HEADER MyHeader = "id","name","email","username","created_at","v_id";
      DEFINE INPUT_LINE_FILTER reject_header = ( "created_at " = "created_at");
      LOAD MyDataSource TO VERTEX Person VALUES($"v_id", $"name", $"email", $"username", $"created_at") USING SEPARATOR=",", HEADER="true", USER_DEFINED_HEADER=MyHeader, REJECT_LINE_RULE=reject_header, EOL="\n", QUOTE="double";
    }
END
''')
2

There are 2 best solutions below

1
On

Try removing the DEFINE HEADER and DEFINE INPUT_LINE_FILTER lines. If you are reading from a CSV, then simply using header="true"; that should be enough to get the references for the column names without doing anything else.

Also, make sure you specify the name and location of the file you are trying to use as MyDataSource. Here is an example, where the filename is file1:

CREATE LOADING JOB load_social FOR GRAPH social {
   DEFINE FILENAME file1="/home/tigergraph/personAndFriendship.csv";

   LOAD file1 TO VERTEX person VALUES ($"name", $"name", $"age", $"gender", $"state") WHERE NOT ($"age" IS EMPTY) USING header="true", separator=",";
   LOAD file1 TO EDGE friendship VALUES ($0, $1, $5) USING header="true", separator=",";
}

Also, see the official TG loading job documentation.

0
On

This is how I was able to get it to work.

USE GRAPH MyGraph

BEGIN
CREATE LOADING JOB load_persons_byname FOR GRAPH MyGraphCopy {
      DEFINE FILENAME MyDataSource;
      DEFINE HEADER MyHeader = "id","name","email","username","created_at","v_id";
      DEFINE INPUT_LINE_FILTER reject_header = ( $4 == "created_at");
      LOAD MyDataSource TO VERTEX Person VALUES($"v_id", $"name", $"email", $"username", $"created_at") USING SEPARATOR=",", HEADER="true", USER_DEFINED_HEADER="MyHeader", REJECT_LINE_RULE=reject_header, EOL="\n", QUOTE="double";
    }
END