I am running some code that calculates a sequence of records and calls Frame.ofRecords with that sequence as its argument. The records are calculated using PSeq.map from the library FSharp.Collections.ParallelSeq.
If I convert the sequence into a list then the output is OK. Here is the code and the output:
    let summaryReport path (writeOpenPolicy: WriteOpenPolicy) (outputs: Output seq) =
        let foo (output: Output) =
            let temp =
                { Name = output.Name
                  Strategy = string output.Strategy
                  SharpeRatio = (fst output.PandLStats).SharpeRatio
                  CalmarRatio = (fst output.PandLStats).CalmarRatio }
            printfn "************************************* %A" temp
            temp
        outputs
        |> Seq.map foo
        |> List.ofSeq // this is the line that makes a difference
        |> Frame.ofRecords
        |> frameToCsv path writeOpenPolicy ["Name"] "Summary_Statistics"

    Name Name Strategy SharpeRatio CalmarRatio
    0 Singleton_AAPL MyStrategy 0.317372564 0.103940018
    1 Singleton_MSFT MyStrategy 0.372516931 0.130150478
    2 Singleton_IBM MyStrategy Infinity
The printfn command let me verify by inspection that in each case the variable temp was calculated correctly.
The last code line is just a wrapper around FrameExtensions.SaveCsv.
If I remove the |> List.ofSeq line then what comes out is garbled:
    Name Name Strategy SharpeRatio CalmarRatio
    0 Singleton_IBM MyStrategy 0.317372564 0.130150478
    1 Singleton_MSFT MyStrategy 0.103940018
    2 Singleton_AAPL MyStrategy 0.372516931 Infinity
Notice that the empty item (corresponding to NaN) and the Infinity item are now on different rows, and other values are also mixed up.
Why is this happening?
The Frame.ofRecords function iterates over the sequence multiple times, so if your sequence returns different data each time it is enumerated, you will get inconsistent data in the frame. Here is a minimal example:
This returns:
As you can see, the first item is obtained during the first iteration of the sequence, while the second item is obtained during the second iteration.
This should probably be better documented, but it makes the performance better in typical scenarios - if you can send a PR to the docs, that would be useful.
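The effect described above can also be reproduced without Deedle. The following is only a sketch under stated assumptions: `unstable` is a made-up sequence that deliberately changes its order on every second enumeration (standing in for a parallel computation), and `readColumns` is a hypothetical consumer that reads one column per pass. It is not how Frame.ofRecords is actually implemented; it only shows why enumerating such a sequence more than once mixes values from different passes.

```fsharp
// Counts how many times the sequence has been enumerated.
let passes = ref 0

// Hypothetical unstable source: same records, reversed on every second pass.
let unstable =
    seq {
        passes.Value <- passes.Value + 1
        let data = [ ("AAPL", 1.0); ("MSFT", 2.0); ("IBM", 3.0) ]
        yield! (if passes.Value % 2 = 1 then data else List.rev data)
    }

// Hypothetical consumer that enumerates its input once per column.
let readColumns (rows: seq<string * float>) =
    let names = rows |> Seq.map fst |> List.ofSeq   // one enumeration
    let values = rows |> Seq.map snd |> List.ofSeq  // another enumeration
    List.zip names values

// Names come from pass 1, values from pass 2 (reversed), so rows are mixed up.
let garbled = readColumns unstable
printfn "%A" garbled  // [("AAPL", 3.0); ("MSFT", 2.0); ("IBM", 1.0)]

// Materializing first means both columns are read from the same in-memory list.
let stable = readColumns (List.ofSeq unstable)
printfn "%A" stable   // [("AAPL", 1.0); ("MSFT", 2.0); ("IBM", 3.0)]
```

This is why the `|> List.ofSeq` line in the question makes a difference: the list is computed once, and every later pass sees the same data. `Seq.cache` is another way to get a sequence that returns the same items on repeated enumeration.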