I'm using Perl and R to analyze a large dataset of samples. For each pair of samples, I calculate the t-test p-value. Currently, I'm using the Statistics::R module to export the values from Perl to R, and then calling R's t.test function. However, this process is extremely slow. Does anyone know of a Perl function that does the same thing more efficiently?
Thanks!
The volume of data, the number of dataset pairs, and perhaps even the code you have written would probably help us identify why your code is slow. For instance, sending many small datasets to R would be slow, but can probably be sped up simply by sending all the data at once.
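As a rough sketch of the "send everything at once" idea, assuming Statistics::R and a working R installation: the data are flattened into one vector plus a parallel vector of sample ids, sent to R in two calls, and all pairwise p-values come back in a single call. The sample data and the names values, groups, s, idx and p are made up for illustration.

```perl
use strict;
use warnings;
use Statistics::R;

# Hypothetical input: one array reference per sample.
my @samples = (
    [5.1, 4.9, 5.6, 5.8, 5.0],
    [6.2, 6.0, 5.7, 6.5, 6.1],
    [4.8, 5.2, 5.1, 4.7, 5.3],
);

# Flatten into one vector of values plus a parallel vector of sample ids.
my (@values, @groups);
for my $i (0 .. $#samples) {
    push @values, @{ $samples[$i] };
    push @groups, ($i + 1) x @{ $samples[$i] };
}

my $R = Statistics::R->new();

# Two transfers in total, instead of two per pair of samples.
$R->set('values', \@values);
$R->set('groups', \@groups);

# Let R loop over all pairs; columns of idx are (1,2), (1,3), (2,3), ...
$R->run(
    q`s <- split(values, groups)`,
    q`idx <- combn(length(s), 2)`,
    q`p <- apply(idx, 2, function(k) t.test(s[[k[1]]], s[[k[2]]])$p.value)`,
);

# One transfer back. get() returns a plain scalar when there is a single value.
my $p = $R->get('p');
$R->stop();
$p = [$p] unless ref $p;

print join("\n", @$p), "\n";
```

The p-values come back in the column order of combn, so the first one is sample 1 vs. sample 2, the second is sample 1 vs. sample 3, and so on.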
For a pure Perl solution, you first need to compute the test statistic (that is easy, and already done in Statistics::TTest, for instance), and then convert it to a p-value. For that you need something like R's pt function (the cumulative distribution function of the t distribution), and I am not sure it is readily available in Perl. Alternatively, you could send all the t-values to R in one block at the end and convert them to p-values there.
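To show how little is needed for the test statistic itself, here is a minimal sketch in core Perl (no CPAN modules). It assumes you want Welch's unequal-variance test, which is what R's t.test(x, y) does by default; the helper names mean_var and welch_t are my own, not from any module.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Sample mean, unbiased variance (n - 1 denominator) and sample size.
sub mean_var {
    my ($x) = @_;
    my $n    = @$x;
    my $mean = sum(@$x) / $n;
    my $var  = sum(map { ($_ - $mean) ** 2 } @$x) / ($n - 1);
    return ($mean, $var, $n);
}

# Welch's t statistic and Welch-Satterthwaite degrees of freedom.
sub welch_t {
    my ($x, $y) = @_;
    my ($m1, $v1, $n1) = mean_var($x);
    my ($m2, $v2, $n2) = mean_var($y);
    my ($s1, $s2) = ($v1 / $n1, $v2 / $n2);    # squared standard errors
    my $t  = ($m1 - $m2) / sqrt($s1 + $s2);
    my $df = ($s1 + $s2) ** 2
           / ($s1 ** 2 / ($n1 - 1) + $s2 ** 2 / ($n2 - 1));
    return ($t, $df);
}

my ($t, $df) = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]);
printf "t = %.4f, df = %.4f\n", $t, $df;    # t = -1.8974, df = 5.8824
```

You can collect the (t, df) pairs for all comparisons in Perl and convert them in R in one go; the two-sided p-value is 2 * pt(-abs(t), df).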