I'm executing this simple code in a pig script:
REGISTER /home/myuser/mongodb/mongo-2.10.1.jar
REGISTER /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p0.30/lib/mongo-hadoop-cdh4-1.2.0/mongo-hadoop-core_cdh4.3.0-1.2.0.jar
REGISTER /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p0.30/lib/mongo-hadoop-cdh4-1.2.0/mongo-hadoop-pig_cdh4.3.0-1.2.0.jar
set mapred.map.tasks.speculative.execution false;
set mapred.reduce.tasks.speculative.execution false;
col = LOAD 'mongodb://localhost:27017/mydb.mycollection' using com.mongodb.hadoop.pig.MongoLoader ('id:chararray, companyId:chararray, ts:chararray', 'id');
STORE col INTO 'mongodb://localhost:27017/mydb.mycollection2' USING com.mongodb.hadoop.pig.MongoInsertStorage ('', '');
it returns the following error:
Location Config: Configuration: For URI: file:/tmp/temp449583595/tmp-109467318 2014-04-04 14:30:40,913 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration. Details at logfile: /home/myuser/pig/pig_1396614639609.log
the end of file pig_1396614639609.log:
... at org.apache.hadoop.util.RunJar.main(RunJar.java:208) Caused by: java.lang.IllegalArgumentException: Invalid URI Format. URIs must begin with a mongodb:// protocol string. at com.mongodb.hadoop.pig.MongoInsertStorage.setStoreLocation(MongoInsertStorage.java:159) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:576)
... 17 more
I don't know where is the error so that mongodb protocol string "mongodb://" is well-written.
I have a similar issue when running LOAD and STORE using mongo-hadoop on the same Pig script.
It throws
I didn't investigate further, but either is a bug or some parameter related to locking. I don't know.
If I run the same code, but loading and storing in different scripts it runs without a problem.