This is a follow-up to an old answer to a question about the necessity of functools.partial: while that answer explains the phenomenon and the basic reason for it very clearly, some points are still unclear to me.
To recap, the following Python code
myfuns = [lambda arg: str(arg) + str(clo) for clo in range(4)]
try:
    clo
except NameError:
    print("there is no clo")
for arg in range(4):
    print(myfuns[arg](arg), end=", ")
gives 03, 13, 23, 33, , while the similar OCaml code
let myfuns = Array.map (fun clo -> fun arg -> (string_of_int arg) ^ (string_of_int clo)) [|0;1;2;3|];;
(* there is obviously no clo variable here *)
for arg = 0 to 3 do
print_string (myfuns.(arg) arg); print_string ", "
done;;
gives 00, 11, 22, 33, .
I understand this is related to a different notion of closure applied to lambda arg: str(arg) + str(clo) and its correspondent fun arg -> (string_of_int arg) ^ (string_of_int clo).
In OCaml, the closure maps the identifier clo to the value of the variable clo in the outer scope at the time of creation of the closure. In Python, the closure somehow contains the variable clo per se, which explains that it gets affected by the incrementation caused by the for generator.
Is this correct?
How is this done? The clo variable does not exist in the global scope, as evidenced by my try/except construct. Generally, I would assume that the variable of a generator is local to it and so does not survive it. So, again, where is clo? This answer gives insight about __closure__, but I still do not completely grasp how it manages to refer to the clo variable per se during the generation.
Also, besides this strange behaviour (strange for people used to statically binding languages), are there other caveats one should be aware of?
When Python creates a closure, it collects all free variables into a tuple of cells. Since each cell is mutable, and Python passes a reference to the cell into the closure, you will see the last value of the induction variable in your loop. Let's look under the hood: take a function, make_closures, with i occurring free in a lambda expression, and the disassembly of this function.
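The function and its disassembly did not survive in this copy of the answer, so the following is only a sketch of what such a function might look like. The body of make_closures is an assumption (a for loop that appends lambdas reading i), chosen to match the names used in the text. Opcode names and offsets depend on the interpreter version, so the offset 16 and the MAKE_CLOSURE instruction mentioned here may not appear in the output of a recent Python.

```python
import dis


def make_closures():
    # Hypothetical reconstruction: `i` is assigned by the loop
    # and occurs free in the lambda expression.
    result = []
    for i in range(4):
        result.append(lambda x: str(x) + str(i))
    return result


closures = make_closures()
# Every closure reads `i` when it is called, after the loop has finished.
outputs = [f(n) for n, f in enumerate(closures)]
print(outputs)

# Print the bytecode; the exact listing varies between Python versions.
dis.dis(make_closures)
```

Calling the closures only after the loop has ended is what makes the shared final value of i visible.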
We can see that STORE_DEREF on 16 takes a normal integer value from the top of the stack (TOS) and stores it in a cell. The next three commands prepare the closure structure on the stack, and finally MAKE_CLOSURE packs everything into the closure, which is represented as a tuple (in our case, a 1-tuple) of cells, so it is a tuple with a cell containing an int.
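The tuple of cells can be inspected directly through the function's __closure__ attribute. This is a minimal sketch that again assumes a hypothetical make_closures built around a for loop; the variable names cells and contents are illustrative only.

```python
def make_closures():
    # Hypothetical function: `i` occurs free in the lambda.
    result = []
    for i in range(4):
        result.append(lambda x: str(x) + str(i))
    return result


closures = make_closures()

# __closure__ is a tuple of cell objects, one per free variable.
cells = closures[0].__closure__
print(type(cells).__name__, len(cells))

# The single cell holds the current value of `i`.
contents = cells[0].cell_contents
print(type(cells[0]).__name__, contents)
```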
The crucial point for understanding this is that free variables are shared by all closures.
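One way to check the sharing is to compare the cell objects by identity and then rebind the variable through a helper. The sketch below is an illustration under assumptions: make_closures_with_setter and set_i are hypothetical names that do not come from the original answer.

```python
def make_closures_with_setter():
    closures = []
    for i in range(4):
        closures.append(lambda x: str(x) + str(i))

    def set_i(value):
        # Rebinds the same `i` that the lambdas refer to.
        nonlocal i
        i = value

    return closures, set_i


closures, set_i = make_closures_with_setter()

# All closures hold the very same cell object for `i`.
first_cell = closures[0].__closure__[0]
same_cell = all(f.__closure__[0] is first_cell for f in closures)

before = [f(0) for f in closures]
set_i(9)  # one assignment is seen by every closure
after = [f(0) for f in closures]
print(same_cell, before, after)
```

Since there is only one cell, a single rebinding of i changes the result of every closure at once.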
Each cell is a reference to a local variable in the enclosing function scope; indeed, we can find the i variable in the make_closures function, in the cellvars attribute (co_cellvars on the function's code object). Therefore, we have the slightly surprising effect of an integer value being passed by reference and becoming mutable. The main surprises in Python are the way variables are packed and the fact that the for loop does not have its own scope.
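Both observations can be checked from the code objects. As before, this is a sketch with a hypothetical make_closures written as a plain for loop; with a list comprehension, the variable may be attributed to a different code object depending on the Python version. The extra return value last_i is added only for illustration.

```python
def make_closures():
    result = []
    for i in range(4):
        result.append(lambda x: str(x) + str(i))
    # The for loop has no scope of its own: `i` is still bound here.
    last_i = i
    return result, last_i


closures, last_i = make_closures()

# `i` is a cell variable of the enclosing function...
cellvars = make_closures.__code__.co_cellvars
# ...and a free variable of each lambda.
freevars = closures[0].__code__.co_freevars
print(cellvars, freevars, last_i)
```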
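To be fair, you can achieve the same result in OCaml if you manually create a reference and capture it in a closure: if every closure captures the same ref and the loop updates it, each closure sees the last value stored in it when it is finally called.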
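Conversely, the OCaml behaviour from the question can be sketched in Python by creating a fresh binding per element through a function call, which mirrors what fun clo -> fun arg -> ... does under Array.map. The helper name make_fun is hypothetical and introduced only for this illustration.

```python
def make_fun(clo):
    # Each call creates a new binding of `clo`, so each returned
    # lambda gets its own cell instead of sharing one.
    return lambda arg: str(arg) + str(clo)


myfuns = [make_fun(clo) for clo in range(4)]
outputs = [myfuns[arg](arg) for arg in range(4)]
print(", ".join(outputs))
```

Here the difference comes from where the variable is bound (once per call rather than once for the whole loop), not from a different notion of closure.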
Historical References
Both OCaml and Python are influenced by Lisp, and both employ the same technique for implementing closures. Surprisingly, the results differ, not because of different interpretations of lexical scoping or of the closure environment, but because of the different object (data) models of the two languages.
The OCaml data model is not only simpler to understand but is also well defined by the rigorous type system. Python, due to its dynamic structure, leaves a lot of freedom in the interpretation of objects and their representation. Therefore, the Python designers decided to make variables bound in the lexical context of a closure mutable by default (even if they are integers). See also PEP 227 for more context.