Method not implemented exception on Take method in Microsoft.Spark

I am trying to set up Spark with the new Microsoft.Spark library. DataFrame.PrintSchema() works fine, but DataFrame.Take() throws a System.NotImplementedException. A lot of other methods throw this exception as well.

I looked at the source: the Take method calls the Collect method, which fails on the call to collectToPython.

SparkSession spark = SparkSession
    .Builder()
    .AppName(".NET Spark")
    .GetOrCreate();

DataFrame dataFrame = spark.Read().Json("people.json");
IEnumerable<Row> rows = dataFrame.Take(1);

Is this just a Microsoft library that isn't finished yet? Or am I doing something wrong?

Answer by imback82

Did you try the latest released version? I used v0.2.0, and the following works as expected:

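using System;
using System.Collections.Generic;
using Microsoft.Spark.Sql;

var spark = SparkSession.Builder().GetOrCreate();
var df = spark.Read().Json("people.json");

// Take(1) returns the first row to the driver.
IEnumerable<Row> rows = df.Take(1);
foreach (var row in rows)
{
    // Look up the column value by name.
    Console.WriteLine(row.Get("name"));
}
spark.Stop();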