Deedle IndexRows type annotations


I was trying to implement a Deedle solution for the little challenge from @migueldeicaza: achieve in F# what was done at http://t.co/4YFXk8PQaU with Python and R. The CSV source data is available from that link.

The start is simple, but now that I'm trying to order the frame by a column of float values, I'm struggling to understand the syntax for the type annotation on Frame.indexRows.

#I "../packages/FSharp.Charting.0.90.5"
#I "../packages/Deedle.0.9.12"
#load "FSharp.Charting.fsx"
#load "Deedle.fsx"

open System
open Deedle
open FSharp.Charting

let bodyCountData = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/film_death_counts.csv")
bodyCountData?DeathsPerMinute <- bodyCountData?Body_Count / bodyCountData?Length_Minutes

// select the first 4 rows (the 0..3 slice is inclusive) using the default ordinal index
bodyCountData.Rows.[0..3]

// create a new frame indexed and ordered by descending number of screen deaths per minute
let bodyCountDataOrdered =
    bodyCountData
    |> Frame.indexRows <float>"DeathsPerMinute" // uh oh error here - I'm confused

Because I can't figure out that syntax, I get various error messages like:

Error   1   The type '('a -> Frame<'c,Frame<int,string>>)' does not support the 'comparison' constraint. For example, it does not support the 'System.IComparable' interface. See also c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx(18,4)-(19,22).    c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx  19  8   RPythonFSharpDFChallenge
Error   2   Type mismatch. Expecting a
    'a -> Frame<'c,Frame<int,string>>    
but given a
    'a -> float    
The type 'Frame<'a,Frame<int,string>>' does not match the type 'float'  c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx  19  25  RPythonFSharpDFChallenge
Error   3   This expression was expected to have type
    bool    
but here has type
    string  c:\wd\RPythonFSharpDFChallenge\RPythonFSharpDFChallenge\EvilMovieQuery.fsx  19  31  RPythonFSharpDFChallenge

Edit: Thinking about this some more, indexing on a measured float is a silly thing to do anyway, given the duplicates and missing values in real-world data. So I wonder what a more sensible approach would be. I still need to find the 25 max values... Maybe I can work this out for myself...

There is 1 best solution below

With Deedle 1.0, you can sort on an arbitrary column.

See: http://bluemountaincapital.github.io/Deedle/reference/deedle-framemodule.html#section7
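A minimal sketch of how that might look for the question's data, so the frame never has to be re-indexed on a float column. It assumes that the Deedle 1.0 Frame module provides Frame.sortRowsBy and Frame.take with the shapes used here (check them against the reference linked above), that the CSV file and column names are the ones from the question, and that the package path is a hypothetical 1.0 install location.

```fsharp
// Hypothetical package path: adjust to wherever Deedle 1.0 is installed.
#I "../packages/Deedle.1.0.0"
#load "Deedle.fsx"

open Deedle

// Same setup as in the question: load the CSV and add the derived column.
let bodyCountData = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/film_death_counts.csv")
bodyCountData?DeathsPerMinute <- bodyCountData?Body_Count / bodyCountData?Length_Minutes

// Assumption: Frame.sortRowsBy takes a column key and a projection and sorts
// the rows in ascending order of the projected value. Negating the value
// therefore yields a descending order. Frame.take keeps the first N rows.
let top25 =
    bodyCountData
    |> Frame.sortRowsBy "DeathsPerMinute" (fun (v: float) -> -v)
    |> Frame.take 25
```

Sorting leaves the original row keys in place, so duplicate or missing DeathsPerMinute values do not end up as index keys.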