The dis.dis() function in the dis module allows one to disassemble raw bytecode from a code object into a human-readable form. The documentation for dis() states what may be passed into the function:
Disassemble the bytesource object. bytesource can denote either a module, a class, a method, a function, or a code object. For a module, it disassembles all functions. For a class, it disassembles all methods. For a single code sequence, it prints one line per bytecode instruction. If no object is provided, it disassembles the last traceback.
When I experimented with the module, everything worked as expected. However, I was surprised when I passed a str object into the function and it ran fine:
>>> import dis
>>> dis.dis('1 + 2')
0 <49>
1 SLICE+2
2 STORE_SLICE+3
3 SLICE+2
4 DELETE_SLICE+0
>>>
The documentation for dis() specifically lists what may be passed to the function, and as far as I can tell a str does not satisfy any of those requirements. Per the documentation for code objects:
Code objects represent byte-compiled executable Python code, or bytecode. [...]
But what surprised me even more was the bytecode that dis() generated. What does this bytecode mean? I decided to check what each opcode meant by looking at 32.12.1. Python Bytecode Instructions on the documentation page for dis. This, however, confused me even more. Some opcodes were not even documented (<49>).
What exactly is Python trying to do with the string I passed in? Is it treating my string as literal source code? Or is it attempting to construct a string from what I've passed in?
I would assume my former guess to be correct, after looking over the source code for dis.disassemble_string() (which is what dis.dis() calls when a str object is passed in). But if that is true, then why does the bytecode look so strange? If I pass in a function with the same expression, the bytecode makes perfect sense:
>>> def func():
... 1 + 2
...
>>> dis.dis(func)
2 0 LOAD_CONST 3 (3)
3 POP_TOP
4 LOAD_CONST 0 (None)
7 RETURN_VALUE
>>>
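For comparison, here is a minimal sketch of what I expected dis() to do with the string: compile it into a code object first, which is one of the documented input types. It assumes the built-in compile() in 'eval' mode; the '<string>' filename and the name code_obj are placeholders of my own.

```python
import dis

# Turn the source text into a code object explicitly.
# '<string>' is just a placeholder filename for the compiled source.
code_obj = compile('1 + 2', '<string>', 'eval')

# A code object is one of the input types the documentation lists.
dis.dis(code_obj)
```

This should produce a short, readable listing much like the one for func, so the odd output only seems to appear when the raw string itself is passed.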
This behavior doesn't seem to be limited to expressions, though. I tried several other statements, and they all generated the same weird-looking bytecode:
>>> dis.dis('foo = 10')
0 BUILD_TUPLE 28527
3 SLICE+2
4 DELETE_SUBSCR
5 SLICE+2
6 <49>
7 <48>
>>> dis.dis('print 0')
0 JUMP_IF_TRUE_OR_POP 26994
3 JUMP_FORWARD 8308 (to 8314)
6 <48>
>>> # etc...
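One pattern I did notice: in the first listing the offsets run from 0 to 4, one per character of '1 + 2', and the undocumented <49> happens to match the character code of '1'. A quick sketch to check this, assuming (and it is only my assumption) that dis is looking at the raw characters of the string; the names source and byte_values are my own:

```python
source = '1 + 2'

# Pair each character's position in the string with its numeric code.
byte_values = [(offset, ord(ch)) for offset, ch in enumerate(source)]

for offset, value in byte_values:
    print('%d %d' % (offset, value))
```

The value at offset 0 is 49, which lines up with the <49> shown at offset 0 above. I have not confirmed how the remaining values relate to the opcode names in the listing.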
Is this behavior documented somewhere? I looked over the entire documentation page for the dis module, but I didn't find any relevant information.