Handling multiple possible return types in recursive function

497 Views Asked by At

I am using Racket ISL+ to write a recursion that is counting a series of structures. If the structures fails some parameters, I want to return a value of #false. However, at one point during recursion, I know the computer is getting 1 + 1 + 1 + 1 + 1 + 1 + #false, which is giving me an error.

Is there a way for me to state that if an error is found to just return a value of #false from the original call?

1

There are 1 best solutions below

0
ad absurdum On

When a value is returned from some procedure deep in a recursive process, the procedure that receives that value must be able to handle it. If a recursive procedure can return either of two distinct types, e.g., a boolean or a number, then that procedure will need to test the returned value before processing it with a procedure that requires a single type.

Using A Continuation

It is possible to get a continuation and then to use that to escape a recursive process. This is probably not the solution OP is looking for; I don't believe that the HTDP Student Languages provide a means of getting continuations. In plain Racket you could use call/cc to grab a continuation:

(define (count-numbers-or-f xs)
  (call/cc
   (lambda (break)
     (define (loop xs)
       (cond ((null? xs) 0)
             ((number? (car xs))
              (+ 1 (loop (cdr xs))))
             (else (break #f))))
     (loop xs))))

This is not the place to develop the details of continuations, but briefly, call/cc arranges it so that break in the above code is an escape procedure that, when called, returns its argument (#f here) to the continuation of the computation with respect to the call to call/cc. Here, when the first element of the input is a number?, 1 is added to the result of counting the rest of the input; but, when the first element of the input is not a number? (and the input is not an empty list) the break procedure is called. When break is invoked in the recursive process described by loop control jumps to that continuation, which is after the call to call/cc, thereby escaping the recursive process; the value #f is given to the continuation, so that #f is then returned by count-numbers-or-f.

Doing Without Continuations

Continuations are a classic way to achieve this sort of a non-local exit from a loop, but you can get a similar result, with some careful design, using less exotic means:

(define (count-numbers-or-f-1 xs)
  (cond ((null? xs) 0)
        ((not (number? (car xs))) #f)
        (else
         (let ((r (count-numbers-or-f-1 (cdr xs))))
           (if (number? r)
               (+ 1 r)
               #f)))))

Here, if the car of xs is not a number (and xs is not an empty list), then #f is returned to the previous caller. Otherwise r represents the result of calling count-numbers-of-f-1 on the cdr of xs. If r is not a number (because the subsequent caller encountered a non-numeric element and returned #f), then #f is returned. Otherwise the additive process grows.

The upshot of this is that if an element that is not a number is encountered, #f is immediately passed back through all previous stack frames to the originating call, otherwise the summation is carried out. It is easy to make a mistake designing procedures like this so that they appear to work, but end up doing a lot of unnecessary work by traversing all of the input (or worse). See the end of the answer for a discussion of what could go wrong here.

Comparison

Either of the above definitions will work:

scratch.rkt> (count-numbers-or-f '(1 2 3 4))
4
scratch.rkt> (count-numbers-or-f '(1 2 x 3 4))
#f
scratch.rkt> (count-numbers-or-f-1 '(1 2 3 4))
4
scratch.rkt> (count-numbers-or-f-1 '(1 2 x 3 4))
#f

Here are a couple of traces of the second version:

scratch.rkt> (count-numbers-or-f-1 '(1 2 3 4 5))
>(count-numbers-or-f-1 '(1 2 3 4 5))
> (count-numbers-or-f-1 '(2 3 4 5))
> >(count-numbers-or-f-1 '(3 4 5))
> > (count-numbers-or-f-1 '(4 5))
> > >(count-numbers-or-f-1 '(5))
> > > (count-numbers-or-f-1 '())
< < < 0
< < <1
< < 2
< <3
< 4
<5
5

scratch.rkt> (count-numbers-or-f-1 '(1 2 3 x 4 5))
>(count-numbers-or-f-1 '(1 2 3 x 4 5))
> (count-numbers-or-f-1 '(2 3 x 4 5))
> >(count-numbers-or-f-1 '(3 x 4 5))
> > (count-numbers-or-f-1 '(x 4 5))
< < #f    ; returned from (count-numbers-or-f-1 '(x 4 5))
< <#f    ; returned from (count-numbers-or-f-1 '(3 x 4 5))
< #f    ; returned from (count-numbers-or-f-1 '(2 3 x 4 5))
<#f    ; returned from (count-numbers-or-f-1 '(1 2 3 x 4 5))
#f    ; the actual value printed to the REPL

You can see that when a failing element is encountered, the recursion is immediately ended, but control still has to pass back through the previous stack frames. With the version using call/cc the control simply escapes from the recursive process, without passing through any previous frames.

Here is a version of the first procedure using call/cc which is modified so that the recursive process can be traced:

(define (count-numbers-or-f xs)
  (call/cc
   (lambda (k)
     (loop xs k))))

(define (loop xs break)
  (cond ((null? xs) 0)
        ((number? (car xs))
         (+ 1 (loop (cdr xs) break)))
        (else (break #f))))

Here is a trace showing that when #f is returned, it is returned directly without passing through the previous stack frames. The loop procedure does not return at all when #f is encountered; instead loop calls the break procedure, passing #f to it. The break procedure jumps to the continuation of the computation, returning #f to that point. Since the continuation picks up the computation at the end of the count-numbers-or-f procedure, #f is just returned from count-numbers-or-f.

scratch.rkt> (count-numbers-or-f '(1 2 3 x 4 5))
>(count-numbers-or-f '(1 2 3 x 4 5))
>(loop '(1 2 3 x 4 5) #<procedure>)
> (loop '(2 3 x 4 5) #<procedure>)
> >(loop '(3 x 4 5) #<procedure>)
> > (loop '(x 4 5) #<procedure>)  ; loop does not return
<#f  ; the value returned by (count-numbers-or-f '(1 2 3 x 4 5))
#f  ; the actual value printed to the REPL

What Could Go Wrong?

A well-designed recursive procedure can do pretty well here, but it is easy to implement a poorly-designed solution that can have serious performance problems.

This section is about showing how things can go off the rails with a poorly thought-out design. All of the procedures below work in that they return correct results, but they have different performance characteristics, and the worst of them is practically unusable. The takeaway should be that it is important to, at a minimum, test new procedures over a range of anticipated inputs to check for surprises. Possibly a second takeaway should be that multiple recursive calls in a procedure definition should be thought about carefully. A great tool to assist in testing is trace; in Racket you can (require racket/trace) and then (trace my-procedure) to instrument my-procedure so that any calls to my-procedure are annotated with a procedure trace. This is how the traces in this answer have been generated.

We have already seen a trace of count-numbers-or-f-1 above, showing that when a failing element is encountered #f is passed back to the previous callers until the original call is reached.

Consider this poorly implemented version:

(define (count-numbers-or-f-2 xs)
  (if (null? xs)
      0
      (let ((x (car xs))
            (r (count-numbers-or-f-2 (cdr xs))))
        (if (and (number? x) (number? r))
            (+ 1 r)
            #f))))

Here, since count-numbers-or-f-2 is called before checking whether the result of the remainder of the computation is a number?, the recursive process continues until reaching the base case (null? xs). This does not help at all when failing input is encountered:

scratch.rkt> (count-numbers-or-f-2 '(1 2 3 x 4 5))
>(count-numbers-or-f-2 '(1 2 3 x 4 5))
> (count-numbers-or-f-2 '(2 3 x 4 5))
> >(count-numbers-or-f-2 '(3 x 4 5))
> > (count-numbers-or-f-2 '(x 4 5))
> > >(count-numbers-or-f-2 '(4 5))
> > > (count-numbers-or-f-2 '(5))
> > > >(count-numbers-or-f-2 '())
< < < <0
< < < 1
< < <2
< < #f
< <#f
< #f
<#f
#f

Now consider this truly terrible implementation that attempts to solve the last problem. Here the result of the rest of the computation is checked before adding it to the final result. This works when the input contains a failing element; but in cases where the entire input is counted count-numbers-or-f-3 is called twice, leading to exponential time complexity.

(define (count-numbers-or-f-3 xs)
  (if (null? xs)
      0
      (if (and (number? (car xs))
               (number? (count-numbers-or-f-3 (cdr xs))))
          (+ 1 (count-numbers-or-f-3 (cdr xs)))
          #f)))

With failing input things look fine:

scratch.rkt> (count-numbers-or-f-3 '(1 2 3 x 4 5))
>(count-numbers-or-f-3 '(1 2 3 x 4 5))
> (count-numbers-or-f-3 '(2 3 x 4 5))
> >(count-numbers-or-f-3 '(3 x 4 5))
> > (count-numbers-or-f-3 '(x 4 5))
< < #f
< <#f
< #f
<#f
#f

But with input that should be counted, things go horribly wrong. So wrong, in fact, that I had to use smaller input to illustrate the process in a reasonable amount of space:

>(count-numbers-or-f-3 (1 2 3))
> (count-numbers-or-f-3 '(2 3))
> >(count-numbers-or-f-3 '(3))
> > (count-numbers-or-f-3 '())
< < 0
> > (count-numbers-or-f-3 '())
< < 0
< <1
> >(count-numbers-or-f-3 '(3))
> > (count-numbers-or-f-3 '())
< < 0
> > (count-numbers-or-f-3 '())
< < 0
< <1
< 2
> (count-numbers-or-f-3 '(2 3))
> >(count-numbers-or-f-3 '(3))
> > (count-numbers-or-f-3 '())
< < 0
> > (count-numbers-or-f-3 '())
< < 0
< <1
> >(count-numbers-or-f-3 '(3))
> > (count-numbers-or-f-3 '())
< < 0
> > (count-numbers-or-f-3 '())
< < 0
< <1
< 2
<3
3

Here is a procedure that can be used to time the above procedures:

(define (time-test f xs)
  (collect-garbage 'major)
  (time (f xs)))

I have used a list of 800,001 numbers for testing, with an 'x placed in the middle of the list. The count-numbers-or-f using continuations performs the best:

scratch.rkt> (time-test count-numbers-or-f
                        (append (build-list 400000 values)
                                '(x)
                                (build-list 400000 values)))
cpu time: 1 real time: 1 gc time: 0
#f

Compared with count-numbers-or-f-1 which ends the recursion when failing input is encountered, the first version is a bit faster due to the immediate escape using a continuation:

scratch.rkt> (time-test count-numbers-or-f-1
                        (append (build-list 400000 values)
                                '(x)
                                (build-list 400000 values)))
cpu time: 5 real time: 5 gc time: 0
#f

The bad version which traverses all of the input even when failing input is encountered does worse yet:

scratch.rkt> (time-test count-numbers-or-f-2
                        (append (build-list 400000 values)
                                '(x)
                                (build-list 400000 values)))
cpu time: 28 real time: 28 gc time: 9
#f

It looks like the better non-escaping implementation, count-numbers-or-f-1, is about 5 times slower than count-numbers-or-f (using continuations), and the poorly implemented count-numbers-of-f-2 is about 28 times slower than count-numbers-or-f. But at least all of these implementations are roughly O(n) in time complexity. The worst version, count-numbers-or-f-3 is O(2ⁿ). There isn't enough time in the universe to wait for this procedure to operate on a list of 800,001 elements. We will have to time using much smaller inputs:

scratch.rkt> (time-test count-numbers-or-f-6 (build-list 25 values))
cpu time: 576 real time: 576 gc time: 0
25
scratch.rkt> (time-test count-numbers-or-f-6 (build-list 26 values))
cpu time: 1159 real time: 1159 gc time: 0
26
scratch.rkt> (time-test count-numbers-or-f-6 (build-list 27 values))
cpu time: 2314 real time: 2314 gc time: 0
27

Here you can see that increasing the size of the input list by 1 element roughly doubles the time to complete the computation in the worst cases, and those worst cases are precisely those cases in which you want the computation to complete, i.e., when you want to get a count of how many numbers are in a list.