I'm trying to unload data from Snowflake to GCS and then load it from GCS into a BigQuery table. Here is the code for unloading data from Snowflake:

```sql
copy into @unload_stage/FF_TBL_UNLOAD20200906.csv.gz from
(
  select *
  from SF_DB.SF_TBLS.FF_TBL_UNLOAD
)
file_format = (
  type = csv compression = 'gzip'
  field_delimiter = '|'
  field_optionally_enclosed_by = '"'
  null_if = ()
  empty_field_as_null = true
)
single = false
max_file_size = 5300000000
header = false;
```

Then I'm copying the data from GCS to BigQuery using the following script:

```bash
#!/bin/bash
date=20200906

echo "Removing FF_TBL_UNLOAD list with same date...."

rm -f /home/varma/FF_TBL_UNLOADlist"$date".txt

echo "Listing filenames for FF_TBL_UNLOAD in GCS bucket...."

gsutil ls gs://syw_hs_inbound_outbound_data/FF_TBL_UNLOAD"$date"*.gz >> /home/varma/FF_TBL_UNLOADlist"$date".txt

echo "Starting upload of data into table from FF_TBL_UNLOAD$date list..."
if [ -s /home/varma/FF_TBL_UNLOADlist"$date".txt ]
then
        while IFS= read -r line
        do
                echo "Uploading data for file $line"
                bq load --noreplace --field_delimiter="|" hty-ttw-analysis:out.FF_TBL_UNLOAD "$line"
        done < "/home/varma/FF_TBL_UNLOADlist${date}.txt"
else
        echo "File is empty"
fi
```

It worked for all the tables except this one. The error I'm getting is: `Error while reading data, error message: CSV table references column position 174, but line starting at position:136868 contains only 94 columns.`

Could anyone please help me with this error? Should I change the file format, or should I make some changes in the script that uploads to BigQuery?


2 Answers


Based on the error message, I would say that the issue is in this command:

```bash
bq load --noreplace --field_delimiter="|" hty-ttw-analysis:out.FF_TBL_UNLOAD "$line"
```

According to this and this other question, the error message is displayed when there is an issue with the data being loaded that BigQuery can't parse properly or can't map to the table you want to populate. So the solution seems to be to add a validation/cleaning step for your CSV data before running this command, so it doesn't break your BigQuery load.
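As a rough sketch of such a validation step (this helper is hypothetical, not part of the original pipeline): decompress each unloaded file and flag rows whose pipe-delimited field count differs from the expected column count before handing the file to `bq load`. Note the caveat in the comments: `awk` splits on every `|`, so quoted fields that contain `|` or embedded newlines are miscounted, which is exactly the kind of row this check is meant to surface.

```shell
#!/bin/bash
# Hypothetical pre-load validation helper: report rows whose
# pipe-delimited field count differs from what the target table expects.
# Caveat: awk splits on every '|', so quoted fields containing '|' or
# embedded newlines will be miscounted -- those are the rows to inspect.
check_cols() {  # usage: check_cols <file.gz> <expected_field_count>
  local f=$1 want=$2
  gzip -dc "$f" | awk -F'|' -v want="$want" -v file="$f" \
    'NF != want { printf "%s: line %d has %d fields (expected %d)\n", file, NR, NF, want }'
}

# Demo on a tiny sample: the second row is short by one field.
printf 'a|b|c\nx|y\n' | gzip -c > /tmp/sample.csv.gz
check_cols /tmp/sample.csv.gz 3
# -> /tmp/sample.csv.gz: line 2 has 2 fields (expected 3)
```

In the script above, this could run on each `$line` (after a `gsutil cp` to local disk) before the `bq load` call, skipping or quarantining files that report mismatches.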

To pin down what the issue with the data is, sample data that reproduces the error message would help a lot, as other members of the community have pointed out.


As you said in the comment, using the command below solved the issue:

```bash
bq load --noreplace --field_delimiter="|" --null_marker=NULL --allow_quoted_newlines=TRUE --allow_jagged_rows=TRUE
```
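For context on why `--allow_quoted_newlines` likely fixes this: a quoted field that contains an embedded newline is one CSV record but two physical lines. A line-oriented reader sees two short "rows", which matches the reported error ("line starting at position ... contains only 94 columns"). A minimal demonstration (the sample file and path are purely illustrative):

```shell
# One CSV record whose quoted middle field contains an embedded newline.
printf 'a|"first\nsecond"|c\n' > /tmp/quoted.csv

wc -l < /tmp/quoted.csv                 # 2 physical lines
awk -F'|' '{print NF}' /tmp/quoted.csv  # prints 2 then 2: neither "line" has all 3 fields
```

With the flag set, BigQuery parses such records as single rows; the destination table and file arguments from the original command (`hty-ttw-analysis:out.FF_TBL_UNLOAD "$line"`) still follow the flags as before.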