The following code:
func pow4(n) -> (m : felt):
alloc_locals
local x
jmp body if n != 0
[ap] = 0; ap++
ret
body:
x = n * n
[ap] = x * x; ap++
ret
end
func main():
pow4(n=5)
ret
end
is declared inefficient in the doc because of non-continuous memory.
I ran it and could not see any hole in the memory table:
Addr Value
-----------
⋮
###
⋮
1:0 2:0
1:1 3:0
1:2 5
1:3 1:2
1:4 0:21
1:5 25
1:6 625
so I don't understand where the problem is. I do see a hole with n=0
though:
Addr Value
-----------
⋮
###
⋮
1:0 2:0
1:1 3:0
1:2 0
1:3 1:2
1:4 0:21
⋮
1:6 0
that I can fix using:
jmp body if n != 0
x = 0
ret
but it's not what's suggested:
- Move the instruction alloc_locals.
- or Use tempvar instead of local.
You are correct that the inefficiency is due to the memory hole, and correct that it appears only for n=0. Holes do not cause an inefficiency just by existing, but rather, their existence usually means that an equivalent code could have executed using fewer memory cells in total (in this case, 6 instead of 7, in the memory section "1:"). To make it more efficient, one should aim to remove the hole (so that the parts before and after it become continuous), whereas your suggested solution just fills it (which the prover does anyway). So your solution would still use 7 memory cells in the memory segment; and will in fact also use 2 additional cells in the program section (for the
x=0
command), so it is actually less efficient than leaving the hole empty: compile and check!The inefficiency in the original code arises from the local variable x being assigned a memory cell even in the case n=0, despite not being used. To make it more efficient we would simply not want to declare and allocate x in this case. This can be done by moving the
alloc_locals
command inside "body", so that it isn't executed (and the locals aren't allocated) in the n=0 case, as in the first suggestion: this saves one memory cell in the n=0 case and doesn't affect the n!=0 case. Note thatalloc_locals
does not have to appear right at the beginning of the code.The second suggestion is to make x a tempvar instead of a local (and remove
alloc_locals
entirely). This again means no local variable is allocated in the n=0 case, thus again only 6 memory cells will be used instead of 7. And in fact, because the code is so simple, using the tempvar is also more efficient that declaring it as a local, at least when used astempvar x = n * n
, as it skips merges theap += 1
command (which is whatalloc_locals
does in this case) with thex = n * n
command, rather than run them separately.Beyond discussing the theory, you should compile each of these options and compare the lengths of the code and memory segments obtained, and see what is really the most efficient and by how much: this is always true when optimizing, you should always check the actual performance.