IBM BigSheets Issue

I am running into a problem loading files into BigSheets, both directly from HDFS (files that are the output of Pig scripts) and from raw data on the local hard disk. Whenever I load a file and run a row count to check that all of the data made it into BigSheets, I see fewer rows than expected. I have checked that the files are consistent and use proper delimiters (tab-separated, \t, or comma-separated fields). The files are around 2 GB each, and I have tried both the *.csv and *.tsv formats.
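For anyone trying to reproduce this, one way to establish the expected row count and confirm delimiter consistency outside BigSheets is a small script like the one below. This is only a sketch: the function name `profile_delimited_file` is made up for illustration, it assumes the file is UTF-8 text readable from the local file system, and it counts parsed records rather than physical lines, so a quoted field containing a newline counts as one row.

```python
import csv
from collections import Counter


def profile_delimited_file(path, delimiter=","):
    """Return (row_count, fields_per_row) for a delimited text file.

    fields_per_row is a Counter mapping "number of fields in a row" to
    "how many rows have that many fields". A consistent file has one key.
    """
    fields_per_row = Counter()
    row_count = 0
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as handle:
        for record in csv.reader(handle, delimiter=delimiter):
            row_count += 1
            fields_per_row[len(record)] += 1
    return row_count, fields_per_row
```

Calling `profile_delimited_file("data.tsv", delimiter="\t")` (with a hypothetical file name) returns the total number of records plus a breakdown of field counts. If the breakdown has more than one key, some rows have a different number of fields than the rest, which is worth ruling out before comparing against the row count reported by BigSheets.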

Also, when I have tried to load a file directly from Windows, the load sometimes succeeds with a row count that matches the actual number of lines in the data, and sometimes produces a lower row count.

Sometimes a fresh file gives the correct result the first time it is used, but if I repeat the same operation later, some rows are missing.

Kindly share your experience with BigSheets and any solutions to problems like this, where the entire dataset is not loaded. Thanks in advance.

There are 1 best solutions below

The data that you originally load into BigSheets is only a subset of the file. You have to run the sheet to apply it to the full dataset, so a row count taken before running the sheet reflects only that subset.

http://www-01.ibm.com/support/knowledgecenter/SSPT3X_3.0.0/com.ibm.swg.im.infosphere.biginsights.analyze.doc/doc/t0057547.html?lang=en