HBase bulk load (using the configureIncrementalLoad helper method) configures the job to create as many reducer tasks as there are regions in the HBase table. So if there are a few hundred regions, the job spawns a few hundred reducer tasks. This can get very slow on a small cluster.
Is there any workaround possible by using MultipleOutputFormat or something else?
Thanks
When you use HFileOutputFormat, it overrides whatever number of reducers you set. The number of reducers is made equal to the number of regions in the target HBase table, so the only way to control the number of reducers is to decrease the number of regions.
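To illustrate why the reducer count follows the region count, here is a minimal, dependency-free sketch of the idea: one partition per region start key, with each row key routed to the region whose key range contains it. This is a simplified model and not HBase's actual code. The class name `RegionPartitionSketch`, both method names, and the sample start keys are hypothetical, and it assumes plain ASCII row keys so that `String.compareTo` matches HBase's byte-wise ordering.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model (an assumption, not HBase source): one reducer per region.
class RegionPartitionSketch {

    // One reduce task per region, i.e. one per region start key.
    static int numReducers(List<String> sortedStartKeys) {
        return sortedStartKeys.size();
    }

    // Returns the index of the region (and therefore the reducer) owning rowKey:
    // the last region whose start key is <= rowKey. The first region's start key
    // is the empty string, so every row key has an owner.
    static int partitionFor(String rowKey, List<String> sortedStartKeys) {
        int lo = 0;
        int hi = sortedStartKeys.size() - 1;
        int owner = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedStartKeys.get(mid).compareTo(rowKey) <= 0) {
                owner = mid;      // candidate owner, look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return owner;
    }

    public static void main(String[] args) {
        // Hypothetical table with three regions: ["", "g"), ["g", "p"), ["p", end)
        List<String> startKeys = Arrays.asList("", "g", "p");
        System.out.println("reducers: " + numReducers(startKeys));
        System.out.println("apple -> " + partitionFor("apple", startKeys));
        System.out.println("hbase -> " + partitionFor("hbase", startKeys));
        System.out.println("zebra -> " + partitionFor("zebra", startKeys));
    }
}
```

In this model, merging two regions removes one start key and therefore one reducer, which is why fewer regions means fewer reduce tasks.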
You will find sample code here:
Hope this will be useful :)