How can I use job.addCacheArchive to load a cache file from the output of another MapReduce job?


I'm writing a MapReduce program in Java and trying to load a cache file from the output of another MapReduce job.

I tried

FileOutputFormat.setOutputPath(job1, new Path(out, "out1"));
...
Job job2 = Job.getInstance(conf, "Second");
job2.addCacheArchive(new URI(new Path(out, "out1").toString()));

and I set up the reducer with

protected void setup(Context context) throws IOException, InterruptedException {
    super.setup(context);
    logger.info("log start");
    if (context.getCacheFiles() != null
            && context.getCacheFiles().length > 0) {

        URI[] cacheArchives = context.getCacheFiles();
        .......
    }
}

But it doesn't work for me. I suspect something is wrong with the new URI(new Path(out, "out1").toString()) part, so I tried logging to get more information, but I can't find any log output from main, since it doesn't belong to either of the two jobs. I'm totally lost.
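For reference, here is a minimal sketch of one way the two jobs could be wired together. It is not a confirmed fix. It rests on three assumptions that the post does not verify: that the first job's output directory holds plain part-* files rather than an archive, that job1 has to finish before job2 is configured, and that the mismatch between addCacheArchive in the driver and getCacheFiles in setup is part of the problem. The class name ChainedCacheSketch, the "out2" directory, the command-line arguments, and the key/value types are all hypothetical placeholders.

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver; every class not set explicitly falls back to Hadoop's defaults.
public class ChainedCacheSketch {

    // Placeholder reducer for the second job (types match the default identity mapper).
    public static class SecondReducer
            extends Reducer<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            super.setup(context);
            // Entries added with addCacheFile are returned by getCacheFiles();
            // entries added with addCacheArchive are returned by getCacheArchives().
            URI[] cacheFiles = context.getCacheFiles();
            if (cacheFiles != null) {
                for (URI uri : cacheFiles) {
                    // Written to this task's stderr log, not to the driver console.
                    System.err.println("cache file: " + uri);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path in = new Path(args[0]);        // assumed: input directory
        Path out = new Path(args[1]);       // assumed: base output directory
        Path out1 = new Path(out, "out1");

        Job job1 = Job.getInstance(conf, "First");
        job1.setJarByClass(ChainedCacheSketch.class);
        FileInputFormat.addInputPath(job1, in);
        FileOutputFormat.setOutputPath(job1, out1);
        // Block until job1 finishes so that its output exists before job2 is set up.
        if (!job1.waitForCompletion(true)) {
            System.exit(1);
        }

        Job job2 = Job.getInstance(conf, "Second");
        job2.setJarByClass(ChainedCacheSketch.class);
        job2.setReducerClass(SecondReducer.class);

        // Register each part file of job1's output as a cache file.
        FileSystem fs = out1.getFileSystem(conf);
        for (FileStatus status : fs.listStatus(out1)) {
            Path p = status.getPath();
            if (status.isFile() && p.getName().startsWith("part-")) {
                // Driver-side output appears on the console that launched the job.
                System.out.println("caching " + p);
                job2.addCacheFile(p.toUri());
            }
        }

        FileInputFormat.addInputPath(job2, in);
        FileOutputFormat.setOutputPath(job2, new Path(out, "out2")); // hypothetical
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}
```

In this sketch addCacheFile is used instead of addCacheArchive because the cache archive mechanism is meant for packed files (zip, tar, jar and similar) that get unpacked on the worker nodes, while a job's output is normally a directory of part files. Output printed in main shows up in the terminal where the job was submitted, whereas output from setup ends up in the logs of the individual tasks.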
