I currently have a mapReduce program that send data to hdfs with different file name.So in my reducer I am using MultipleOutputs to write to different files in HDFS (Full Reducer code below).
I would like to test my code using mrunit and below is my test method.
@Test
public void reducerMRUnit() throws IOException{
String output="";
ArrayList<Text> list = new ArrayList<Text>(0);
list.add(new Text(""));
reduceDriver.withInput(new Text(""), list);
reduceDriver.withPathOutput(new Text(output),NullWritable.get(),"");
reduceDriver.runTest();
}
But, when I run this test it giving me NPE.
java.lang.NullPointerException
at org.apache.hadoop.fs.Path.<init>(Path.java:104)
at org.apache.hadoop.fs.Path.<init>(Path.java:93)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.getDefaultWorkFile(FileOutputFormat.java:286)
at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:129)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:476)
at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:456)
at org.clinical3PO.learn.fasta.ArffToFastAReducer.reduce(ArffToFastAReducer.java:127)
at org.clinical3PO.learn.fasta.ArffToFastAReducer.reduce(ArffToFastAReducer.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mrunit.mapreduce.ReduceDriver.run(ReduceDriver.java:265)
at org.apache.hadoop.mrunit.TestDriver.runTest(TestDriver.java:640)
at org.apache.hadoop.mrunit.TestDriver.runTest(TestDriver.java:627)
at org.clinical3PO.learn.fasta.MRUnitTest.ArffToFastAReducerMRUnitTest.reducerMRUnit(ArffToFastAReducerMRUnitTest.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:675)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
Reducer code:
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
public class AReducer extends Reducer<Text, Text, Text, NullWritable>{
private MultipleOutputs<Text, NullWritable> mos = null;
@Override
public void setup(Context context) throws IOException {
mos = new MultipleOutputs<Text, NullWritable>(context);
}
@Override
public void reduce(Text key, Iterable<Text> values, Context context)
throws IOException, InterruptedException {
mos = new MultipleOutputs<Text, NullWritable>(context);
mos.write(key, value, "filename");
}
@Override
public void cleanup(Context context) throws IOException, InterruptedException {
mos.close();
}
}
Any Suggestions?
MRUnit currently has a known issue, which is not well documented, that testing
MultipleOutputsrequires running the test withPowerMockRunnerand aPrepareForTestannotation applied to mock the reducer class. JIRA issues MRUNIT-13 and MRUNIT-213 contain detailed discussion of this. MRUNIT-213 is still unresolved/unfixed.Adding PowerMock to the project then triggers some further challenges in lining up the right compatible versions of Mockito and PowerMock. The documentation on Using PowerMock with Mockito covers which versions are compatible.
I tried making these changes to your sample. That got past the
NullPointerException, but then I ran into one final problem. The expected path output declared in the test did not match up with the"filename"path used by the reducer code. I changed the expected path output to get the test completely passing.Here is my final result: a fully working project with your sample test. Enjoy!
pom.xml
src/main/java/AReducer.java
src/test/java/TestAReducer.java