Combiner function in Apache Hadoop with Gora


I have a simple Hadoop, Nutch 2.x, and HBase cluster. I have to write a MapReduce job that computes some statistics. It is a two-step job, so I think I also need a combiner function. In plain Hadoop jobs this is not a big problem, as plenty of guides exist, e.g., this one. But I could not find any option for using a combiner with Gora. My statistics will be added to pages in HBase, which is why I think I cannot avoid Gora. Following is the code snippet where I expect to add the combiner:

    GoraMapper.initMapperJob(job, query, pageStore, Text.class, WebPage.class,
        My_Mapper.class, null, true);

    job.setNumReduceTasks(1);

    // === Reduce ===
    DataStore<String, WebPage> hostStore = StorageUtils.createWebStore(
        job.getConfiguration(), String.class, WebPage.class);
    GoraReducer.initReducerJob(job, hostStore, My_Reducer.class);

There is 1 best solution below


I have never used a combiner with Gora, but does the following work? If not, what error does it show?

GoraReducer.initReducerJob(job, hostStore, My_Reducer.class);
job.setCombinerClass(My_Reducer.class);
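
One caveat worth checking before trying this: in Hadoop, a combiner's input and output key/value types must both match the map output types, and the framework may run it zero, one, or several times. In the question the map output is (Text, WebPage) while the reducer appears to write (String, WebPage) to the store, so reusing My_Reducer unchanged may be rejected or fail at runtime; a separate combiner class that emits (Text, WebPage) may be needed. The sketch below is plain Java with no Hadoop or Gora classes, and every name in it (CombinerSketch, sumByKey, run, the host keys) is hypothetical. It only illustrates the contract: when the operation is associative and commutative and the combiner keeps the map output types, combining each split first does not change the final result.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java simulation of map -> combine -> reduce; no Hadoop or Gora classes involved. */
public class CombinerSketch {

    /** Sums values per key. Serves as both the "combiner" and the "reducer" here. */
    static Map<String, Long> sumByKey(List<Map.Entry<String, Long>> pairs) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Long> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    /** Runs the simulated job over several map-side splits, optionally combining each split first. */
    static Map<String, Long> run(List<List<Map.Entry<String, Long>>> splits, boolean useCombiner) {
        List<Map.Entry<String, Long>> shuffled = new ArrayList<>();
        for (List<Map.Entry<String, Long>> split : splits) {
            if (useCombiner) {
                // The combiner emits the same (key, value) types it receives from the mapper.
                shuffled.addAll(sumByKey(split).entrySet());
            } else {
                shuffled.addAll(split);
            }
        }
        // The reduce step sees either raw or pre-aggregated pairs and must give the same totals.
        return sumByKey(shuffled);
    }

    public static void main(String[] args) {
        // Hypothetical map output: one (host, 1) pair per page, spread over two splits.
        List<List<Map.Entry<String, Long>>> splits = List.of(
            List.of(Map.entry("a.com", 1L), Map.entry("a.com", 1L), Map.entry("b.org", 1L)),
            List.of(Map.entry("a.com", 1L), Map.entry("b.org", 1L)));
        System.out.println(run(splits, false)); // prints {a.com=3, b.org=2}
        System.out.println(run(splits, true));  // prints {a.com=3, b.org=2}
    }
}
```

A statistic such as a sum or a count fits this pattern directly. An average does not, unless the combiner forwards partial (sum, count) pairs and the reducer does the final division.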

Edit: I created an issue in Apache's Jira about the combiner.