Gather different keys to the same reducer function - Hadoop

I want to send all the values of keys that have at least one integer in common to the same reducer function. For example, all the values that correspond to the key "1,2" and all the values that correspond to the key "2,3" must always go to the same reducer function, because these two keys have the integer 2 in common.

In other words, I just want to replace the "key equality condition" with a different condition.

Is there a way to do this? Is it related to the Partitioner class, or do I have to do something completely different?

I am using Hadoop version 1.2.1, if that matters.

Thanks in advance!
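
One point worth noting about the condition described above: "shares at least one integer" is not transitive ("1,2" shares 2 with "2,3", and "2,3" shares 3 with "3,4", but "1,2" and "3,4" share nothing), so it cannot directly replace key equality. A possible workaround is to compute the connected groups of keys first and use a group id as the map output key. The following is only a minimal plain-Java sketch of that idea, with no Hadoop dependency; the class `SharedIntegerGrouper` and its methods are hypothetical names, and it assumes the full set of keys is known in advance and fits in memory (for example from a preliminary pass).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: groups keys like "1,2" and "2,3" that are linked by shared integers. */
class SharedIntegerGrouper {
    // Union-find (disjoint-set) forest over every integer seen in any key.
    private final Map<Integer, Integer> parent = new HashMap<Integer, Integer>();

    /** Returns the representative of x's group, registering x if it is new. */
    private int find(int x) {
        if (!parent.containsKey(x)) {
            parent.put(x, x);
        }
        int root = x;
        while (parent.get(root) != root) {
            root = parent.get(root);
        }
        // Path compression: point every node on the walked path at the root.
        while (parent.get(x) != root) {
            int next = parent.get(x);
            parent.put(x, root);
            x = next;
        }
        return root;
    }

    /** Merges two groups; the smaller root wins, so a group id is its smallest integer. */
    private void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra != rb) {
            parent.put(Math.max(ra, rb), Math.min(ra, rb));
        }
    }

    /** Registers a comma-separated key and links all of its integers together. */
    void addKey(String key) {
        String[] parts = key.split(",");
        int first = Integer.parseInt(parts[0].trim());
        find(first);
        for (int i = 1; i < parts.length; i++) {
            union(first, Integer.parseInt(parts[i].trim()));
        }
    }

    /** Group id of a key; keys with the same id would be emitted under the same map output key. */
    int groupOf(String key) {
        return find(Integer.parseInt(key.split(",")[0].trim()));
    }

    public static void main(String[] args) {
        SharedIntegerGrouper grouper = new SharedIntegerGrouper();
        for (String key : Arrays.asList("1,2", "2,3", "7,8")) {
            grouper.addKey(key);
        }
        for (String key : Arrays.asList("1,2", "2,3", "7,8")) {
            System.out.println(key + " -> group " + grouper.groupOf(key));
        }
    }
}
```

In a real job the mapper would emit the group id instead of the original key (keeping the original key inside the value if it is still needed), so that the ordinary key-equality grouping delivers all linked values to one reduce call.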

There are 1 best solutions below

I have only one Reducer function per job; I agree with that. However, when I run Hadoop as a simulation in NetBeans (not in distributed mode), it creates one reducer task for each unique key. For instance, if I have only 3 keys (k1, k2, k3), it will call the reduce function 3 times, once for each of these keys.

example:
Reducer: key=k1
values which correspond to k1
Reducer: key=k2
values which correspond to k2
Reducer: key=k3
values which correspond to k3

Therefore, the values that correspond to key k1 can be accessed only from that reducer's task, and the same holds for the values of k2 and k3. What I want to do is gather k1 and k2 into the same task (assuming that these two keys have something in common) so that I can access all the values that correspond to k1 and k2 from a single reducer task.
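
To make the behaviour described above concrete, here is a rough plain-Java simulation of the shuffle step, not Hadoop code. The class `ShuffleSketch` and its methods are hypothetical; only the `partition` formula mirrors the one used by Hadoop's default `HashPartitioner`. It is meant to show that the partitioner only picks which reduce task receives a key, while the reduce function is still invoked once per distinct key inside that task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical simulation of how map output is partitioned and grouped before reduce(). */
class ShuffleSketch {
    /** Same formula as Hadoop's default HashPartitioner. */
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    /**
     * Distributes {key, value} records over reduce tasks.
     * Each task holds a sorted map key -> values; reduce() would run once per map entry.
     */
    static List<Map<String, List<String>>> shuffle(String[][] records, int numReduceTasks) {
        List<Map<String, List<String>>> tasks = new ArrayList<Map<String, List<String>>>();
        for (int i = 0; i < numReduceTasks; i++) {
            tasks.add(new TreeMap<String, List<String>>());
        }
        for (String[] record : records) {
            Map<String, List<String>> task = tasks.get(partition(record[0], numReduceTasks));
            List<String> values = task.get(record[0]);
            if (values == null) {
                values = new ArrayList<String>();
                task.put(record[0], values);
            }
            values.add(record[1]);
        }
        return tasks;
    }

    /** Total number of reduce() invocations: one per distinct key, summed over all tasks. */
    static int reduceCalls(List<Map<String, List<String>>> tasks) {
        int calls = 0;
        for (Map<String, List<String>> task : tasks) {
            calls += task.size();
        }
        return calls;
    }

    public static void main(String[] args) {
        String[][] records = { {"k1", "a"}, {"k2", "b"}, {"k1", "c"}, {"k3", "d"} };
        List<Map<String, List<String>>> tasks = shuffle(records, 2);
        for (int i = 0; i < tasks.size(); i++) {
            for (Map.Entry<String, List<String>> e : tasks.get(i).entrySet()) {
                System.out.println("task " + i + " reduce(" + e.getKey() + ", " + e.getValue() + ")");
            }
        }
    }
}
```

Under this model, changing the number of reduce tasks or the partition function changes where each key goes, but never merges the values of two different keys into one reduce call; that would require the keys themselves to compare as equal.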

In addition, I read this example and thought I understood it, until I ran it and saw that it again creates 2 reducer tasks and not 3, which is the number of age groups in the partitioner.

output example:
Reducer: female
Monica<tab>56<tab>92
Kristine<tab>38<tab>53
Alice<tab>23<tab>45
Nancy<tab>7<tab>98
Mary<tab>6<tab>93
Clara<tab>87<tab>72
Reducer: male
James<tab>34<tab>79
Jacob<tab>7<tab>23
Alex<tab>52<tab>69
Bob<tab>34<tab>89
Chris<tab>67<tab>97
Adam<tab>9<tab>37
Connor<tab>25<tab>27
Daniel<tab>78<tab>95