Hadoop MultipleOutputFormat.generateFileNameForKeyValue with many keys

I am experimenting with MultipleOutputFormat.generateFileNameForKeyValue().

The idea is to create a directory for each of my keys.

This is the code:

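import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

static class MyMultipleTextOutputFormat extends MultipleTextOutputFormat<Text, Text> {
    @Override
    protected String generateFileNameForKeyValue(Text key, Text value, String name) {
        // Use the part of the key before the first "_" as the output directory.
        String[] arr = key.toString().split("_");
        // "name" is the default leaf file name, e.g. part-00000.
        return arr[0] + "/" + name;
    }
}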

This code works only when few records are emitted. If I run it against my real input, the job hangs in the reduce phase at around 70%.

What might be the problem here? Why does it work with a small number of keys but not with many?
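
One thing that may be worth checking is how many distinct files a single reducer ends up writing. As far as I understand, the old-API MultipleOutputFormat keeps one record writer open per generated file name until the task finishes, so many distinct key prefixes could mean many files open at once. I have not confirmed that this is the cause of the hang. The plain-Java sketch below has no Hadoop dependency; the class name OutputPathSketch and the sample keys are made up for illustration. It reproduces the naming logic so the number of distinct paths can be counted on a sample of keys.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class OutputPathSketch {
    // Same mapping as generateFileNameForKeyValue in the question:
    // the text before the first "_" becomes the directory.
    static String fileNameFor(String key, String name) {
        String[] arr = key.split("_");
        return arr[0] + "/" + name;
    }

    // Collects the distinct relative paths one task would write to
    // for the given keys (hypothetical helper, not a Hadoop API).
    static Set<String> distinctPaths(List<String> keys, String name) {
        Set<String> paths = new TreeSet<>();
        for (String key : keys) {
            paths.add(fileNameFor(key, name));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Made-up sample keys; replace with keys taken from the real input.
        List<String> sampleKeys = Arrays.asList("a_1", "a_2", "b_1", "c_7");
        Set<String> paths = distinctPaths(sampleKeys, "part-00000");
        System.out.println(paths.size() + " distinct output files: " + paths);
    }
}
```

If the count on the real keys runs into the thousands per reducer, that would at least be consistent with the job slowing down or stalling while it works with a small input.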
