Ruby NArray optimisations for downsample and conditional change


I am creating input to ruby-fann, doing as much of the manipulation in NArray as I can for performance reasons. Typically I am manipulating 2D 200x200 arrays of floats, and I need to repeat the processing many thousands of times.

Using just NArray, I get acceptable performance. However, I have hit a couple of manipulations where, as far as I can see, I cannot get NArray to do the work in bulk, which means I end up using Ruby's loop controls to work with individual NArray entries. This has an immediate and detrimental effect on the performance of my code, and I wondered what work-arounds or approaches are available. I am capable of forking NArray and adding features for my work, but I do not wish to; that does not sit well with me, because the features I need are not generic enough to go into that library.

I might consider writing a native extension that uses NArray directly somehow; pointers on how to do that would be welcome, as I am not sure how to reference one natively extended gem from another.

I would also appreciate any insight or feedback on how I might structure the code differently, either to make the Ruby parts faster or to take advantage of any NArray feature or related library.

My two slow bits of code are very similar, hence a single question covering both.

1) Limit a matrix of floats to a range

What I am currently doing (simplified):

  # In reality, nn_input contains real-world data, and I want to 
  # re-normalise it, clipping high values to a maximum of 1.0
  nn_input = NArray.float(200,200).random
  nn_input *= 1.1 
  # The test for "anything needs clipping" is fast, the 200x200 loop is somewhat slower!
  if (nn_input.gt 1.0).sum > 0
    (0...200).each do |x|
      (0...200).each do |y|
        nn_input[x, y] = 1.0 if nn_input[x, y] > 1.0
      end
    end
  end

2) Down-sample a large matrix to a smaller one based on mean values (think "image re-size")

What I am currently doing (simplified):

  # In reality, nn_input contains real-world data, and I want to 
  # downsize it, re-sampling a 200x200 array to a 20x20 one
  large_input = NArray.float(200,200).random
  small_output = NArray.float(20,20) 

  (0...20).each do |x|
    (0...20).each do |y|
      small_output[x, y] = large_input[x*10..x*10+9, y*10..y*10+9].mean
    end
  end

I am using NArray's mean method in the second example, and it is less of an issue than the first example, where I end up executing a small Ruby loop 40,000 times for each input array (thus over 200 million times for the entire data set!).


Following the reply by masa16, here is a very quick irb benchmark showing the difference in speed:

irb
1.9.3-p327 :001 > require 'narray'
 => true
1.9.3-p327 :002 > t0 = Time.now; 250.times { nn_input = NArray.float(200,200).random() * 1.1; (0...200).each {|x| (0...200).each { |y| nn_input[x,y]=1.0 if nn_input[x,y]> 1.0 }} }; Time.now - t0
 => 9.329647
1.9.3-p327 :003 > t0 = Time.now; 250.times { nn_input = NArray.float(200,200).random() * 1.1; nn_input[nn_input.gt 1.0] = 1.0; }; Time.now - t0
 => 0.764973

So for that small code segment it is over 10 times faster, and as I am typically running not 250 times but 50,000 times, it has saved me somewhere between 30 minutes and an hour of run time on something that was taking 3 to 4 hours before.


Accepted answer (by masa16):

1) nn_input[nn_input.gt 1.0] = 1.0

2) small_output = large_input.reshape(10,20,10,20).mean(0,2)
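
In context, a minimal runnable sketch of both one-liners in place of the original loops (assuming the narray gem is installed). The boolean mask returned by gt is used as an index for bulk assignment, and reshape(10,20,10,20) splits each 200-long axis into 20 blocks of 10 (NArray's first index varies fastest), so mean(0,2) averages over the two within-block dimensions:

  require 'narray'

  # 1) Clip values above 1.0 in bulk: index with a boolean mask
  nn_input = NArray.float(200, 200).random * 1.1
  nn_input[nn_input.gt 1.0] = 1.0

  # 2) Down-sample 200x200 -> 20x20 by block means
  large_input  = NArray.float(200, 200).random
  small_output = large_input.reshape(10, 20, 10, 20).mean(0, 2)
  small_output.shape   # => [20, 20]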

Another answer:

Have you benchmarked your code to establish if there are any bottlenecks?
Is there any way you can parallelise the calculations to make use of a multi-core environment? If so, have you considered using JRuby in order to leverage the excellent JVM multithreading support?
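
As a rough sketch of splitting the repeated runs across threads (assuming the runs are independent of each other; on JRuby the threads map to JVM threads, whereas MRI's global lock limits the parallel speedup for this kind of code):

  require 'narray'

  N_RUNS    = 50_000   # total number of independent repetitions (assumed)
  N_THREADS = 4        # tune to the number of available cores

  threads = N_THREADS.times.map do |t|
    Thread.new do
      # Each thread takes every N_THREADS-th run, so the work is
      # split evenly and no two threads share any data.
      (t...N_RUNS).step(N_THREADS) do |i|
        nn_input = NArray.float(200, 200).random * 1.1
        nn_input[nn_input.gt 1.0] = 1.0
        # ... feed nn_input into ruby-fann here ...
      end
    end
  end
  threads.each(&:join)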

These are a few of the things I can think of off the top of my head. It could simply be that you are dealing with so much data that you need to find an alternate way of solving this.