When does a program "intertwine definition and use"?

232 Views Asked by At

Footnote #28 of SICP says the following:

Embedded definitions must come first in a procedure body. The management is not responsible for the consequences of running programs that intertwine definition and use.

What exactly does this mean? I understand:

  • "definitions" to be referring to both procedure creations and value assignments.
  • "a procedure body" to be as defined in section 1.1.4. It's the code that comes after the list of parameters in a procedure's definition.
  • "Embedded definitions must come first in a procedure body" to mean 'any definitions that are made and used in a procedure's body must come before anything else'.
  • "intertwine definition and use" to mean 'calling a procedure or assigned value before it has been defined;'

However, this understanding seems to contradict the answers to this question, which has answers that I can summarise as 'the error that your quote is referring to is about using definitions at the start of a procedure's body that rely on definitions that are also at the start of that body'. This has me triply confused:

  1. That interpretation clearly contradicts what I've said above, but seems to have strong evidence - compiler rules - behind it.
  2. SICP seems happy to put definition in a body with other definitions that use them. Just look at the sqrt procedure just above the footnote!
  3. At a glance, it looks to me that the linked question's author's real error was treating num-prod like a value in their definition of num rather than as a procedure. However, the author clearly got it working, so I'm probably wrong.

So what's really going on? Where is the misunderstanding?

3

There are 3 best solutions below

11
On BEST ANSWER

In a given procedure's definition / code,

  • "internal definitions" are forms starting with define.
  • "a procedure body" is all the other forms after the forms which start with define.
  • "embedded definitions must come first in a procedure body" means all the internal define forms must go first, then all the other internal forms. Once a non-define internal form appears, there can not appear an internal define form after that.
  • "no intertwined use" means no use of any name before it's defined. Imagine all internal define forms are gathered into one equivalent letrec, and follow its rules.

Namely, we have,

(define (proc args ...)
   ;; internal, or "embedded", definitions
   (define a1 ...init1...)
   (define a2 ...init2...)
   ......
   (define an ...initn...)
   ;; procedure body
   exp1 exp2 .... )

Any ai can be used in any of the initj expressions, but only inside a lambda expression.(*) Otherwise it would refer to the value of ai while aj is being defined, and that is forbidden, because any ai names are considered not yet defined while any of the initj expressions are being evaluated.


(*) Remember that (define (foo x) ...x...) is the same as (define foo (lambda (x) ...x...)). That's why the definitions in that sqrt procedure in the book you mention are OK -- they are all lambda expressions, and any name's use inside a lambda expression will only actually refer to that name's value when the lambda expression's value -- a lambda function -- will be called, not when that lambda expression is being evaluated, producing that lambda function which is its value.


The book is a bit vague at first with the precise semantics of its language but in Scheme the above code is equivalent to

(define proc 
   (lambda (args ...)
      ;; internal, or "embedded", definitions
      (letrec ( (a1 ...init1...)
                (a2 ...init2...)
                ......
                (an ...initn...) )
        ;; procedure body
        exp1 exp2 .... 
        )))

as can be seen explained, for instance, here, here or here.


For example,

                                   ;; or, equivalently,
(define (my-proc x)                (define my-proc
                                     (lambda (x)
   (define (foo) a)                     (letrec ( (foo (lambda () a))
   (define a x)                                   (a x) )
   ;; my-proc's body                       ;; letrec's body
   (foo))                                  (foo))))

first evaluates the lambda expression, (lambda () a), and binds the name foo to the result, a function; that function will refer to the value of a when (foo) will be called, so it's OK that there's a reference to a inside that lambda expression -- because when that lambda expression is evaluated, no value of a is immediately needed, just the reference to its future value, under the name a, is present there; i.e. the value that a will have after all the names in that letrec are initialized, and the body of letrec is entered. Or in other words, when all the internal defines are completed and the body proper of the procedure my-proc is entered.

So we see that foo is defined, but is not used during the initializations; a is defined but is not used during the initializations; thus all is well. But if we had e.g.

(define (my-proc x)
   (define (foo) x)    ; `foo` is defined as a function
   (define a (foo))    ; `foo` is used by being called as a function
   a)

then here foo is both defined and used during the initializations of the internal, or "embedded", defines; this is forbidden in Scheme. This is what the book is warning about: the internal definitions are only allowed to define stuff, but its use should be delayed for later, when we're done with the internal defines and enter the full procedure's body.

11
On

You discovered one of the difficulties of scheme. And of lisp. Due to this kind of chicken and egg issue there is no single lisp, but lots of lisps appeared.

If the binding forms are not present in code in a letrec-logic in R5RS and letrec*-logic in R6RS, the semantics is undefined. Which means, everything depend on the will of the implementor of scheme.

See the paper Fixing Letrec: A Faithful Yet Efficient Implementation of Scheme’s Recursive Binding Construct.

Also, you can read the discussions from the mailing list from 1986, when there was no general consensus among different implementors of scheme.

Also, at MIT there were developed 2 versions of scheme -- the student version and the researchers' development version, and they behaved differently concerning the order of define forms.

1
On

This is subtle, and as the footnote and the question you reference imply, the subtlties can vary depending on a particular language's implementation.

These issues will be covered in far more detail later in the book (Chapters 3 and 4) and, generally, the text avoids using internal definitions so that these issues can be avoided until they are examined in detail.

A key difference between between the code above the footnote:

(define (sqrt x)
  (define (good-enough? guess)
    (< (abs (- (square guess) x)) 0.001))
  (define (improve guess)
    (average guess (/ x guess)))
  (define (sqrt-iter guess)
    (if (good-enough? guess)
        guess
        (sqrt-iter (improve guess))))
  (sqrt-iter 1.0))

and the code in the other question:

(define (pi-approx n)
  (define (square x) (* x x))

  (define (num-prod ind) (* (* 2 ind) (* 2 (+ ind 1)))) ; calculates the product in the numerator for a certain term
  (define (denom-prod ind) (square (+ (* ind 2 ) 1))) ;Denominator product at index ind

  (define num (product num-prod 1 inc n))
  (define denom (product denom-prod 1 inc n))

is that all the defintions in former are procedure definitions, whereas num and denom are value defintions. The body of a procedure is not evaluated until that procedures is called. But a value definition is evaluated when the value is assigned.

With a value definiton:

(define sum (add 2 2))

(add 2 2) will evaluated when the definition is evaluated, if so add must already be defined. But with a procedure definition:

(define (sum n m) (add n m))

a procedure object will be assigned to sum but the procedure body is not yet evaluated so add does not need be defined when sum is defined, but must be by the time sum is called:

(sum 2 2)

As I say there is a lot sublty and a lot of variation so I'm not sure if the following is always true for every variation of scheme, but within 'SICP scheme' you can say..

Valid (order of evaluation of defines not significant):

;procedure body
(define (sum m n) (add m n))
(define (add m n) (+ m n))
(sum 2 2)

Also valid:

;procedure body
(define (sum) (add 2 2))
(define (add m n) (+ m n))
(sum)

Usually invalid (order of evaluation of defines is significant):

;procedure body
(define sum (add 2 2))
(define (add m n) (+ m n))

Whether the following is valid depends on the implementation:

;procedure body
(define (add m n) (+ m n))
(define sum (add 2 2))

And finally an example of intertwining definition and use, whether this works also depends on the implementation. IIRC, this would work with Scheme described in Chapter 4 of the book if the scanning out has been implemented.

;procedure body
(sum)
(define (sum) (add 2 2))
(define (add m n) (+ m n))

It is complicated and subtle, but the key points are:

  1. value definitions are evaluated differently from procedure definitions,
  2. behaviour inside blocks may be different from outside blocks,
  3. there is variation between scheme implementations,
  4. the book is designed so you don't have to worry much about this until Chapter 3,
  5. the book will cover this in detail in Chapter 4.