Performance Impact of Creating Enclosed Procedure in Scheme During Recursion

105 Views Asked by At

I'm making my way through the book The Little Schemer to start to learn to think in Lisp. As you get into it and really cover the use of lambdas, the 'remove' procedure is written in the following general form, which returns a remove procedure for arbitrary test test?:

(define rember-f
  (lambda (test?)
    (lambda (a l)
      (cond
        ((null? l) (quote ()))
        ((test? (car l) a) (cdr l))
        (else (cons (car l)
                ((rember-f test?) a (cdr l))))))))

I understand how this works just fine, but a plain reading of it suggests that at each recursive step, it is the procedure rember-f that is called again to generate a new enclosed procedure. This would mean when you call your returned procedure on a list, it calls rember-f to generate the same procedure again anew and then that new one is what is called for recursion (if that is not clear see my fix below). I understand that this may be optimized away, but in lieu of not knowing whether it is (and also in attempting to get my head around this syntax anyway), I managed after some experimentation to move the recursion to the procedure itself rather than the enclosing procedure as follows:

(define rember-f
  (lambda (test?)
    (define retfun
      (lambda (a l)
        (cond
          ((null? l) (quote ()))
          ((test? (car l) a) (cdr l))
          (else (cons (car l) (retfun a (cdr l)))))))
    retfun))

I have verified that this works as expected. The return value is a procedure that removes the first element of a list (arg 2) matching a value (arg 1). It looks to me like this one only calls rember-f once, which guarantees it only generates one enclosed procedure (this time with a name, retfun).

This is actually interesting to me because unlike the usual tail call optimization, which is about not consuming space on the call stack and so making recursion about as efficient as iteration, in this case the compiler would have to determine that (rember-f test?) is the enclosing procedure scope unmodified and so replace it with the same return value, which is the anonymous (lambda (a l) ...). It would not surprise me at all to learn that the interpreter / compiler does not catch this.

Yes, I know that scheme is a specification and there are many implementations, which get the various functional programming optimizations right to differing degrees. I am currently learning by experimenting in the guile REPL, but would be interested in how different implementations compare on this issue.

Does anyone know how Scheme is supposed to behave in this instance?

3

There are 3 best solutions below

3
Mulan On BEST ANSWER

You are right to be concerned about the additional repeated lambda abstractions. For example you wouldn't write this, would you?

(cond ((> (some-expensive-computation x) 0) ...)
      ((< (some-expensive-computation x) 0) ...)
      (else ...))

Instead we bind the result of some-expensive-computation to an identifier so we can check multiple conditions on the same value -

(let ((result (some-expensive-computation x)))
 (cond ((> result 0) ...)
       ((< result 0) ...)
       (else ...)))

You discovered the essential purpose of so-called "named let" expressions. Here's your program -

(define rember-f
  (lambda (test?)
    (define retfun
      (lambda (a l)
        (cond
          ((null? l) (quote ()))
          ((test? (car l) a) (cdr l))
          (else (cons (car l) (retfun a (cdr l)))))))
    retfun))

And its equivalent using a named-let expression. Below we bind the let body to loop, which is a callable procedure allowing recursion of the body. Notice how the lambda abstractions are used just once, and the inner lambda can be repeated without creating/evaluating additional lambdas -

(define rember-f
  (lambda (test?)
    (lambda (a l)
      (let loop ; name, "loop", or anything of your choice
       ((l l))  ; bindings, here we shadow l, or could rename it
       (cond
         ((null? l) (quote ()))
         ((test? (car l) a) (cdr l))
         (else (cons (car l) (loop (cdr l))))))))) ; apply "loop" with args

Let's run it -

((rember-f eq?) 'c '(a b c d e f))
'(a b d e f)

The syntax for named-let is -

(let proc-identifier ((arg-identifier initial-expr) ...)
  body ...)

Named-let is a syntax sugar of a letrec binding -

(define rember-f
  (lambda (test?)
    (lambda (a l)
      (letrec ((loop (lambda (l)
                       (cond
                         ((null? l) (quote ()))
                         ((test? (car l) a) (cdr l))
                         (else (cons (car l) (loop (cdr l))))))))
        (loop l)))))
((rember-f eq?) 'c '(a b c d e f))
'(a b d e f)

Similarly, you could imagine using a nested define -

(define rember-f
  (lambda (test?)
    (lambda (a l)
      (define (loop l)
        (cond
          ((null? l) (quote ()))
          ((test? (car l) a) (cdr l))
          (else (cons (car l) (loop (cdr l))))))
      (loop l))))
((rember-f eq?) 'c '(a b c d e f))
'(a b d e f)

PS, you you can write '() in place of (quote ())

8
Mark Saving On

Both procedures have the same asymptotic time complexity. Let's consider the evaluation of ((rember-f =) 1 '(5 4 3 2 1 0)).

A partial evaluation proceeds as follows:

((rember-f =) 1 '(5 4 3 2 1 0))
((lambda (a l)
      (cond
        ((null? l) (quote ()))
        ((= (car l) a) (cdr l))
        (else (cons (car l)
                ((rember-f =) a (cdr l)))))) 1 '(5 4 3 2 1 0))
(cons 5 ((rember-f = 1 '(4 3 2 1 0))))

Note that the creation of the temporary lambda procedure takes O(1) time and space. So it doesn't actually add any substantial overhead to the cost of calling the function. At best, factoring out the function will lead to a constant-factor speedup and the use of a constant amount less of memory.

But how much memory does it really take to make a closure? It turns out it takes very little memory. A closure consists of a pointer to the environment and a pointer to compiled code. Basically, creating the closure requires as much time and space as making a cons cell. So even though it looks like we're using a lot of memory when I show the evaluation, very little memory and very little time is actually used to make and store the lambda.

So essentially, by factoring out the recursive function, you've allocated a single cons cell rather than writing code which allocates that cons cell one time per recursive call.

For more information on this, see Lambda is cheap, and Closures are Fast.

5
alinsoar On

to start to learn to think in Lisp

That book is not about thinking in lisp, but about recursive thinking, which is one of the ways of computation discovered in the 20th century by Goedel, Herbrand, Rozsa Peter.

Does anyone know how Scheme is supposed to behave in this instance?

After you finish the little lisper you should take the SICP, which will make you understand what kind of decisions an implementation of a language can make. You mean, how different implementations act. To understand their implementation decision, the best step to do is to learn it from SICP. Take care, unless you are already a certified computer science graduate, this texbook will take you a few years to master, if you study it each day. If you are already a graduate, it will take you only about 1 year to master.