Not able to read hdfs files through pig on pseudo node cluster


I have this very basic test, run immediately after installing both Hadoop 2.7 and Pig 0.14.

The file exists in HDFS:

hdfs://master:50070/user/raghav/family<r 2> 32
hdfs://master:50070/user/raghav/nsedata <dir>

However, when I run the following,

A = LOAD 'family';
dump A;

I get the following error message:

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
2.7.0   0.14.0  raghav  2015-05-19 21:38:35 2015-05-19 21:38:41 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_1432066972596_0002  A   MAP_ONLY    Message: Job failed!    hdfs://master:50070/tmp/temp-1977333348/tmp-1065056833,

Input(s):
Failed to read data from "hdfs://master:50070/user/raghav/family"

Output(s):
Failed to produce result in "hdfs://master:50070/tmp/temp-1977333348/tmp-1065056833"

Further investigation reveals a bit more. As indicated, I can see the file on HDFS, both from within Pig (through the ls command) and from the shell prompt using hadoop fs commands. However, neither Pig nor Hive is able to read the files on HDFS.

I also tried changing the namenode port (tried 8020, 9000, and 50070), but the behaviour remains the same. I looked through the namenode and datanode logs too, but couldn't find anything more.
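
Since the URIs in the error output all use port 50070, a quick check of which port a filesystem URI actually targets might help narrow this down. This is only a sketch under the assumption of stock Hadoop 2.x defaults, where 50070 is the NameNode web UI (HTTP) port and the RPC port is whatever fs.defaultFS specifies (commonly 8020 or 9000). If those defaults were changed on this cluster, the warning does not apply. `check_fs_uri` is a hypothetical helper written for illustration, not part of Hadoop.

```shell
# Hypothetical helper: extract the port from a filesystem URI and flag 50070,
# which is the default NameNode web UI port in Hadoop 2.x (assumption: defaults unchanged).
check_fs_uri() {
  uri="$1"
  hostport="${uri#*://}"        # drop the scheme, e.g. "hdfs://"
  hostport="${hostport%%/*}"    # drop any path after host:port
  port="${hostport##*:}"        # keep what follows the last colon
  if [ "$port" = "$hostport" ]; then
    port=""                     # no colon present, so no explicit port
  fi
  case "$port" in
    50070) echo "suspicious: port 50070" ;;
    "")    echo "no port" ;;
    *)     echo "ok: port $port" ;;
  esac
}

check_fs_uri "hdfs://master:50070/user/raghav/family"
check_fs_uri "hdfs://master:8020/user/raghav/family"

# On the cluster itself, the configured value could be checked with:
#   check_fs_uri "$(hdfs getconf -confKey fs.defaultFS)"
```

The helper only inspects the string; it does not contact the cluster, so a "suspicious" result is a hint to compare fs.defaultFS in core-site.xml against the settings Pig picks up, not a diagnosis.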

serious help required !!!

Answers to some questions

myhost raghav$ hdfs dfs -ls /user/raghav/family
15/05/20 08:03:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   2 raghav supergroup         32 2015-05-15 01:01 /user/raghav/family
myhost raghav$ hdfs dfs -ls /user/raghav/
15/05/20 08:04:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   2 raghav supergroup         32 2015-05-15 01:01 /user/raghav/family
drwxr-xr-x   - raghav supergroup          0 2015-05-15 00:25 /user/raghav/nsedata
myhost raghav$ hadoop fs -ls /
15/05/20 08:04:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
drwxr-xr-x   - raghav supergroup          0 2015-05-19 23:06 /tmp
drwxr-xr-x   - raghav supergroup          0 2015-05-20 07:30 /user
myhost raghav$ 

Further tests reveal that Hive is able to use HDFS, but Pig still can't. I could create an external table in Hive that successfully points to the directory containing the example file 'family':

hive> create external table xfamily(name STRING, age INT)
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    > STORED AS TEXTFILE
    > LOCATION '/user/raghav';
OK
Time taken: 0.023 seconds
hive> select * from xfamily;
xxxxxx - expected data shows up.