I am running a very basic test immediately after installing both Hadoop 2.7 and Pig 0.14.
The file exists in HDFS:
hdfs://master:50070/user/raghav/family<r 2> 32
hdfs://master:50070/user/raghav/nsedata <dir>
However, when I run the following,
A = LOAD 'family';
dump A;
I get the following error message:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.0 0.14.0 raghav 2015-05-19 21:38:35 2015-05-19 21:38:41 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1432066972596_0002 A MAP_ONLY Message: Job failed! hdfs://master:50070/tmp/temp-1977333348/tmp-1065056833,
Input(s):
Failed to read data from "hdfs://master:50070/user/raghav/family"
Output(s):
Failed to produce result in "hdfs://master:50070/tmp/temp-1977333348/tmp-1065056833"
Further investigation reveals a bit more. As indicated, I can see the file on HDFS, both from within Pig (through the ls command) and from the shell prompt using hadoop fs commands. However, neither Pig nor Hive is able to read the files on HDFS.
I also tried changing the namenode port (I tried 8020, 9000 and 50070), but the behaviour remains the same. I looked through the namenode and datanode logs too, but couldn't find anything more.
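To make the port experiments easier to reproduce, here is a minimal sketch, assuming a POSIX shell. `hdfs getconf -confKey fs.defaultFS` prints the filesystem URI that the Hadoop client configuration resolves to. `describe_port` is a hypothetical helper written for this post, not part of Hadoop or Pig. It only encodes the fact that 50070 is the default NameNode web UI (HTTP) port in Hadoop 2.x, while 8020 and 9000 are the ports commonly used for NameNode RPC. Whether this is what breaks the Pig job here is an assumption, not something the logs above confirm.

```shell
#!/bin/sh
# Print the filesystem URI from the client configuration.
# Guarded so the script is harmless on a machine without Hadoop installed.
if command -v hdfs >/dev/null 2>&1; then
    hdfs getconf -confKey fs.defaultFS
fi

# Hypothetical helper (not a Hadoop command): extract the port from an
# hdfs:// URI and describe its default role in Hadoop 2.x.
describe_port() {
    port=$(printf '%s\n' "$1" | sed -n 's|^hdfs://[^/:]*:\([0-9][0-9]*\).*$|\1|p')
    case "$port" in
        50070) echo "$port: NameNode web UI (HTTP) default, not the RPC port" ;;
        8020|9000) echo "$port: commonly used NameNode RPC port" ;;
        "") echo "no port in URI" ;;
        *) echo "$port: non-default port, compare with fs.defaultFS" ;;
    esac
}

# The URI Pig reports in the failed job above.
describe_port "hdfs://master:50070/user/raghav/family"
```

If the URI printed by `hdfs getconf` and the URI in the Pig error use different ports, that mismatch would be the first thing to look at.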
Any help would be much appreciated.
Answers to some questions:
myhost raghav$ hdfs dfs -ls /user/raghav/family
15/05/20 08:03:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r-- 2 raghav supergroup 32 2015-05-15 01:01 /user/raghav/family
myhost raghav$ hdfs dfs -ls /user/raghav/
15/05/20 08:04:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r-- 2 raghav supergroup 32 2015-05-15 01:01 /user/raghav/family
drwxr-xr-x - raghav supergroup 0 2015-05-15 00:25 /user/raghav/nsedata
myhost raghav$ hadoop fs -ls /
15/05/20 08:04:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
drwxr-xr-x - raghav supergroup 0 2015-05-19 23:06 /tmp
drwxr-xr-x - raghav supergroup 0 2015-05-20 07:30 /user
myhost raghav$
Further tests reveal that Hive is able to use HDFS, but Pig still can't. I could create an external table in Hive, successfully pointing to the example file 'family':
hive> create external table xfamily(name STRING, age INT)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> STORED AS TEXTFILE
> LOCATION '/user/raghav';
OK
Time taken: 0.023 seconds
hive> select * from xfamily;
xxxxxx - expected data shows up.