I am using Ruby's StringScanner to normalize some English text.
require 'strscan'

def normalize text
  s = ''
  ss = StringScanner.new text
  while ! ss.eos? do
    s += ' ' if ss.scan(/\s+/)             # multiple whitespace => single space
    s += 'mice' if ss.scan(/\bmouses\b/)   # mouses => mice
    s += '' if ss.scan(/\bthe\b/)          # remove 'the'
    s += "#$1 #$2" if ss.scan(/(\d)(\w+)/) # should split 3blind => 3 blind
  end
  s
end
normalize("3blind the   mouses")  #=> should return "3 blind mice"
Instead I am just getting "  mice".
StringScanner#scan is not capturing the (\d) and (\w+).
 
                        
To access a StringScanner capture group (in Ruby 1.9 and above), use
StringScanner#[], for example ss[1] and ss[2] after a successful scan.
In Ruby 2.1, you should be able to capture by name (see Peter Alfvin's link).
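As a rough sketch of the advice above: StringScanner#scan does not set the global match variables, so `$1` and `$2` stay nil inside the loop, while `ss[1]` and `ss[2]` return the groups of the scanner's last match. The helper name `split_leading_digit`, the `elsif` chain, the `\s*` added after `the`, and the final catch-all branch are my own assumptions for illustration, not part of the original question.

```ruby
require 'strscan'

# Hypothetical helper: shows reading capture groups through StringScanner#[].
def split_leading_digit(text)
  ss = StringScanner.new(text)
  return nil unless ss.scan(/(\d)(\w+)/)
  "#{ss[1]} #{ss[2]}"   # ss[0] would be the whole match
end

# One possible rewrite of normalize using ss[1] / ss[2] instead of $1 / $2.
def normalize(text)
  out = ''
  ss = StringScanner.new(text)
  until ss.eos?
    if ss.scan(/\s+/)
      out << ' '                      # multiple whitespace => single space
    elsif ss.scan(/\bmouses\b/)
      out << 'mice'                   # mouses => mice
    elsif ss.scan(/\bthe\b\s*/)
      # remove 'the'; assumption: also swallow the whitespace that follows it
    elsif ss.scan(/(\d)(\w+)/)
      out << "#{ss[1]} #{ss[2]}"      # 3blind => 3 blind
    else
      out << ss.scan(/\w+|./m)        # assumption: copy anything else unchanged
    end
  end
  out
end

split_leading_digit('3blind')      #=> "3 blind"
normalize('3blind the   mouses')   #=> "3 blind mice"
```

The catch-all branch is there so the loop always advances; without it, input that matches none of the patterns would leave the scanner stuck before `eos?`. For named groups on Ruby 2.1 or later, something like `ss.scan(/(?<digit>\d)(?<word>\w+)/)` followed by `ss[:digit]` and `ss[:word]` should behave the same way.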