I have a perfectly good Perl subroutine written as part of a Perl module. Without going into too many details, it takes a string and a short list as arguments (often taken from the terminal) and returns a value (right now always a floating-point number, but this may not always be the case).
Right now, the list portion of my argument takes two values, say (val1,val2). I save the output of my subroutine for hundreds of different values for val1 and val2 using for loops. Each iteration takes almost a second to complete--so completing this entire process takes hours.
I recently read of a mystical (to me) computational tool called "threading" that apparently can replace for loops with blazing fast execution time. I have been having trouble understanding what these are and do, but I imagine they have something to do with parallel computing (and I would like to have my module as optimized as possible for parallel processors.)
If I save all the values I would like to pass to val1 as a list, say @val1 and the same for val2, how can I use these "threads" to execute my subroutine for every combination of the elements of val1 and val2? Also, it would be helpful to know how to generalize this procedure to a subroutine that also takes val3, val4, etc.
Update:
I do not use PDL, so I did not know that a thread in PDL does not correspond exactly to the notion of threading I have been talking about. See PDL threading and signatures.
However, I think the explanation below is still useful to you as one would need to know what threading in the regular sense is to understand how PDL threads are different.
Here is the Threads entry on Wikipedia for background.
Using threads cannot make your program magically faster. If you have multiple CPUs/cores and the computations you are carrying out can be divided into independent chunks, using threads can allow your program to carry out more than one computation at a time and cut down on the total execution time.
The easiest case is when the subtasks are embarrassingly parallel, requiring no communication or coordination between threads.
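As a rough illustration of that embarrassingly parallel case, here is a minimal sketch that runs a subroutine over every combination of several value lists using Perl's core `threads` module. It assumes a perl built with ithreads support. The `compute` subroutine, the `combinations` helper, the sample values, and the worker count of 2 are all hypothetical stand-ins, not anything from your module.

```perl
use strict;
use warnings;
use threads;    # assumes a perl built with ithreads support

# Hypothetical stand-in for the real subroutine: a string plus a short list.
sub compute {
    my ($label, @vals) = @_;
    my $total = 0;
    $total = $total * 10 + $_ for @vals;
    return $total;
}

# Hypothetical helper: every combination of the given lists (array refs).
# Works for any number of lists, so val3, val4, ... need no extra loops.
sub combinations {
    my @acc = ([]);
    for my $list (@_) {
        @acc = map { my $prefix = $_; map { [@$prefix, $_] } @$list } @acc;
    }
    return @acc;
}

my @val1 = (1, 2, 3);
my @val2 = (4, 5);
my @combos    = combinations(\@val1, \@val2);
my $n_workers = 2;    # assumption: roughly one worker per core

# Worker $id handles combinations $id, $id + $n_workers, ... independently.
# map evaluates its block in list context, so each thread returns a list.
my @workers = map {
    my $id = $_;
    threads->create(sub {
        my @out;
        for (my $i = $id; $i < @combos; $i += $n_workers) {
            my @vals = @{ $combos[$i] };
            push @out, [join(',', @vals), compute('label', @vals)];
        }
        return \@out;
    });
} 0 .. $n_workers - 1;

# join() waits for a thread and hands back a copy of what its sub returned.
my %result;
for my $thr (@workers) {
    my ($out) = $thr->join;
    $result{ $_->[0] } = $_->[1] for @$out;
}

print "$_ => $result{$_}\n" for sort keys %result;
```

Splitting the work into a few long-lived workers, rather than starting one thread per combination, keeps the cost of creating threads small relative to the work each one does. Whether this helps at all depends on how many cores you have and on the subroutine being free of shared state.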
Regarding possible performance gains, consider the following program:
On my dual core laptop running Windows XP:
Now, compare that to: