How to unit test a .NET for Apache Spark DataFrame without installing Spark


I have a simple .NET for Apache Spark app that I have tried to break down into units for testing. A sample unit:

public DataFrame filtermyname(DataFrame df, string name)
{
   // df["name"] yields a Column; comparing it with == builds a Spark filter expression.
   return df.Filter(df["name"] == name);
}

Since unit tests should not have external dependencies, my organisation does not allow installing Spark on the build servers. Is there a way to test this without installing Spark, by mocking the session?

Answer by Morten Bork

I am not 100% sure I fully understand you, or the complexities of your architecture.

But I would assume that the call you have here:

df.Filter(df["name"] == name);

could be replaced with an interface:

public interface IFilterSource {
   IFilterSource FilterByText(string filterText);
}

DataFrame belongs to the Microsoft.Spark library, so you cannot implement IFilterSource on it directly. Instead, make an implementation of IFilterSource that holds a DataFrame as a property and applies the filter to that property.

Your method then becomes:

public IFilterSource filtermyname(IFilterSource source, string name) 
{
   return source.FilterByText(name);
}
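
The wrapper option described above could be sketched roughly as follows. This is only an illustration, assuming the Microsoft.Spark NuGet package is referenced; the class name DataFrameFilterSource and its Frame property are hypothetical, and the column name "name" is taken from the question.

```csharp
using Microsoft.Spark.Sql;

// Interface from the answer, repeated so the snippet compiles on its own.
public interface IFilterSource
{
    IFilterSource FilterByText(string filterText);
}

// Hypothetical adapter (not part of Microsoft.Spark): the only class
// that touches a real DataFrame.
public sealed class DataFrameFilterSource : IFilterSource
{
    // The wrapped DataFrame, exposed so production code can keep working with it.
    public DataFrame Frame { get; }

    public DataFrameFilterSource(DataFrame frame) => Frame = frame;

    // Applies the Spark filter on the "name" column and wraps the result again.
    public IFilterSource FilterByText(string filterText) =>
        new DataFrameFilterSource(Frame.Filter(Frame["name"].EqualTo(filterText)));
}
```

With this split, the adapter is the only code that needs a running Spark session, so it can be covered by a separate integration test outside the build servers.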
 

Now you can mock IFilterSource in your unit tests and use the concrete DataFrame-backed implementation in production.
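
As a rough sketch of such a test, the snippet below uses a hand-written fake instead of a mocking library, so it needs neither Spark nor any extra package. The names NameFilters, RecordingFilterSource and Program are hypothetical, and the checks are plain exceptions rather than a specific test framework's assertions.

```csharp
using System;
using System.Collections.Generic;

// Interface from the answer, repeated so the snippet compiles on its own.
public interface IFilterSource
{
    IFilterSource FilterByText(string filterText);
}

// The unit under test, rewritten against the interface as in the answer.
public static class NameFilters
{
    public static IFilterSource FilterMyName(IFilterSource source, string name) =>
        source.FilterByText(name);
}

// Hypothetical fake: records every filter text it receives and returns itself.
public sealed class RecordingFilterSource : IFilterSource
{
    public List<string> ReceivedFilters { get; } = new List<string>();

    public IFilterSource FilterByText(string filterText)
    {
        ReceivedFilters.Add(filterText);
        return this;
    }
}

public static class Program
{
    public static void Main()
    {
        var fake = new RecordingFilterSource();

        var result = NameFilters.FilterMyName(fake, "alice");

        // The method should delegate exactly once, passing the name through unchanged.
        if (fake.ReceivedFilters.Count != 1 || fake.ReceivedFilters[0] != "alice")
            throw new Exception("FilterByText was not called once with the expected name.");

        // The method should return whatever the source returned.
        if (!ReferenceEquals(result, fake))
            throw new Exception("Unexpected return value.");

        Console.WriteLine("ok");
    }
}
```

A test like this only verifies that the method delegates correctly; it says nothing about how Spark itself evaluates the filter, which still has to be checked against a real DataFrame somewhere else.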