Conditionally created array not seen after creation


Suppose I am writing a type of tac in Ruby that will reverse the lines of a file or stream given to it.

So Line 1\nLine 2\nLine 3\n [Ruby script] => Line 3\nLine 2\nLine 1\n

Here are some test files:

printf "f1, Line %s\n" $(seq 3) >f1
printf "f2, Line %s\n" $(seq 5) >f2
printf "f3, Line %s\n" $(seq 7) >f3

A straightforward way to write that is:

ruby -e ' # read each ARGF and reverse it
$<.each_line{|line| 
    lines=Array.new if $<.file.lineno==1 
    lines.unshift(line)
    p lines if $<.eof?
}'

However, with that version, I get the error:

-e:4:in `block in <main>': undefined method `unshift' for nil:NilClass (NoMethodError)

    lines.unshift(line)
         ^^^^^^^^
    from -e:2:in `each_line'
    from -e:2:in `each_line'
    from -e:2:in `<main>'

I can fix that by changing the script to:

ruby -e 'BEGIN{lines=[]}
$<.each_line{|line| 
    lines=Array.new if $<.file.lineno==1 
    lines.unshift(line)
    p lines if $<.eof?
}'

But why is the BEGIN block necessary? Isn't the lines array created on the first pass through the loop? It seems like a throw-away definition of the array...


The final version there does work:

cat f1 | ruby -e 'BEGIN{lines=[]}
$<.each_line{|line| 
    lines=Array.new if $<.file.lineno==1 
    lines.unshift(line)
    p lines if $<.eof?
}' - f2 f3
["f1, Line 3\n", "f1, Line 2\n", "f1, Line 1\n"]
["f2, Line 5\n", "f2, Line 4\n", "f2, Line 3\n", "f2, Line 2\n", "f2, Line 1\n"]
["f3, Line 7\n", "f3, Line 6\n", "f3, Line 5\n", "f3, Line 4\n", "f3, Line 3\n", "f3, Line 2\n", "f3, Line 1\n"]

But why do I have to define lines in the BEGIN block only to define it again in the loop? It does not matter what lines is defined as in the BEGIN block; it can be numerical, boolean, hash, whatever -- but the name has to exist.

Ideas?

% ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin23]

Thanks for the comments and answers. Please see this Python, which I think is why my muscle memory got confused:

def f():
    # local li is created first iteration and used on subsequent... 
    # Similar to Ruby, li is local to this scope of f()
    for x in [1,2,3,4]:
        if x==1: li=[] 
        li.append(x)

    return li 

There are 2 best solutions below

Best answer:

First of all, it has nothing to do with Unix pipes or input / output. You get the same error in a Ruby-only variant, e.g.

[1, 2, 3].each do |i|
  ary = [] if i == 1 
  ary.unshift(i)
end
# undefined method `unshift' for nil:NilClass

The exception is raised because ary is only assigned on the 1st iteration. On every subsequent iteration the block starts with a fresh ary that has never been assigned, so it is nil. In Ruby, a block creates a new local variable scope and

[...] any local variables created inside it do not leak to the surrounding scope.

This also applies to calling the same block multiple times:

def foo
  yield
  yield
end

foo do
  p before: defined? a
  a = 1
  p after: defined? a
end

Output:

{:before=>nil}
{:after=>"local-variable"}
{:before=>nil}
{:after=>"local-variable"}

As you can see, the variable scope is not retained between block invocations. The same applies to each which also calls the block multiple times.
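
For comparison with the Python snippet in the question: Ruby's for loop is not implemented with a block and does not open a new variable scope, so a variable assigned in its body survives from one iteration to the next. A minimal sketch, with the names f and li simply borrowed from that Python snippet:

```ruby
# Sketch only: `f` and `li` mirror the names in the question's Python example.
def f
  # `for` shares the scope of the enclosing method, unlike `each { ... }`.
  for x in [1, 2, 3, 4]
    li = [] if x == 1   # assigned once, on the first iteration
    li << x             # still visible on later iterations
  end
  li                    # and still visible after the loop
end

p f
```

This is shown only to illustrate the scoping difference; each with a variable declared outside the block remains the more common Ruby idiom.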

To get the desired behavior, you can simply create the variable outside the block, e.g.:

ary = []
[1, 2, 3].each do |i|
  ary.unshift(i)
end
ary #=> [3, 2, 1]
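
This is also why the BEGIN version in the question works: once the name exists outside the block, the block closes over it, and an assignment inside the block rebinds that outer variable instead of creating a new block-local one. A rough sketch of the per-file reset under that assumption; reverse_each_file is a hypothetical helper, and nested arrays stand in for the files that ARGF would read:

```ruby
# Hypothetical helper: reverses the lines of each "file" separately.
# `files` is an array of arrays of lines, standing in for ARGF's inputs.
def reverse_each_file(files)
  results = []
  lines = nil  # declared outside the blocks, so they all share this variable
  files.each do |file_lines|
    file_lines.each_with_index do |line, index|
      lines = [] if index == 0  # rebinds the outer variable at each new file
      lines.unshift(line)       # sees the array assigned on an earlier call
      results << lines if index == file_lines.size - 1
    end
  end
  results
end

p reverse_each_file([%w[a b c], %w[x y]])
```

The initial value of lines is never used, which matches the observation in the question that the BEGIN block can assign anything at all.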

Second answer:

The problem is right in your title [bold italic emphasis mine]:

Conditionally created array not seen after creation

You only create the array conditionally, which means there are conditions under which the array isn't created. To be precise, the array is only created if $<.file.lineno equals 1. In every other case nothing is assigned, and because lines is local to the block, each new invocation of the block sees it as nil.

You have to make sure the array is created in any case.

The simplest way to do that is to move the assignment out of the block:

lines = []

$<.each_line { |line| 
  lines.unshift(line)
  p lines if $<.eof?
}

You can do the same thing with the printing as well:

lines = []

$<.each_line { |line| 
  lines.unshift(line)
}

p lines

This now allows you to rewrite the loop in point-free style:

lines = []

$<.each_line(&lines.method(:unshift))

p lines

However, I don't see the point in lazily iterating over the input if you are going to end up building the whole array in memory anyway. Why not just do something like this instead:

p $<.readlines.reverse
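
Note that $<.readlines returns the lines of all inputs as one array, so the one-liner above reverses the whole concatenated stream, whereas the script in the question printed one reversed array per file. If the per-file output matters, something along these lines might do. It is only a sketch: reverse_each_input is a hypothetical helper, and StringIO objects stand in for the real files.

```ruby
require "stringio"

# Hypothetical helper: reverses each input separately.
# `inputs` is any list of IO-like objects that respond to #readlines.
def reverse_each_input(inputs)
  inputs.map { |io| io.readlines.reverse }
end

f1 = StringIO.new("f1, Line 1\nf1, Line 2\n")
f2 = StringIO.new("f2, Line 1\nf2, Line 2\nf2, Line 3\n")

# Prints one reversed array per input, as the original script did.
reverse_each_input([f1, f2]).each { |lines| p lines }
```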