I am getting stuck with the Wikipedia description of the predecessor function in lambda calculus.
What Wikipedia says is the following:
PRED := λn.λf.λx. n (λg.λh. h (g f)) (λu.x) (λu.u)
Can someone explain reduction processes step-by-step?
Thanks.
Ok, so the idea of Church numerals is to encode "data" using functions, right? The way that works is by representing a value by some generic operation you'd perform with it. We can therefore go in the other direction as well, which can sometimes make things clearer.
Church numerals are a unary representation of the natural numbers. So, let's use
Zto mean zero andSnto represent the successor ofn. Now we can count like this:Z,SZ,SSZ,SSSZ... The equivalent Church numeral takes two arguments--the first corresponding toS, and second toZ--then uses them to construct the above pattern. So given argumentsfandx, we can count like this:x,f x,f (f x),f (f (f x))...Let's look at what PRED does.
First, it creates a lambda taking three arguments--
nis the Church numeral whose predecessor we want, of course, which means thatfandxare the arguments to the resulting numeral, which thus means that the body of that lambda will befapplied toxone time fewer thannwould.Next, it applies
nto three arguments. This is the tricky part.The second argument, that corresponds to
Zfrom earlier, isλu.x--a constant function that ignores one argument and returnsx.The first argument, that corresponds to
Sfrom earlier, isλgh.h (g f). We can rewrite this asλg. (λh.h (g f))to reflect the fact that only the outermost lambda is being appliedntimes. What this function does is take the accumulated result so far asgand return a new function taking one argument, which applies that argument togapplied tof. Which is absolutely baffling, of course.So... what's going on here? Consider the direct substitution with
SandZ. In a non-zero numberSn, thencorresponds to the argument bound tog. So, remembering thatfandxare bound in an outside scope, we can count like this:λu.x,λh. h ((λu.x) f),λh'. h' ((λh. h ((λu.x) f)) f)... Performing the obvious reductions, we get this:λu.x,λh. h x,λh'. h' (f x)... The pattern here is that a function is being passed "inward" one layer, at which point anSwill apply it, while aZwill ignore it. So we get one application offfor eachSexcept the outermost.The third argument is simply the identity function, which is dutifully applied by the outermost
S, returning the final result--fapplied one fewer times than the number ofSlayersncorresponds to.