What is the dis function doing when I pass in a string?


The dis.dis() function in the dis module allows one to disassemble raw bytecode from a code object into a human-readable form. The documentation for dis() states what may be passed into the function:

Disassemble the bytesource object. bytesource can denote either a module, a class, a method, a function, or a code object. For a module, it disassembles all functions. For a class, it disassembles all methods. For a single code sequence, it prints one line per bytecode instruction. If no object is provided, it disassembles the last traceback.

When experimenting with the module, everything works as expected. However, I was surprised when I passed a str object into the function and it ran fine:

>>> import dis
>>> dis.dis('1 + 2')
          0 <49>           
          1 SLICE+2        
          2 STORE_SLICE+3  
          3 SLICE+2        
          4 DELETE_SLICE+0 
>>> 

The documentation for dis() specifically states what may be passed to the function, and as far as I can tell, a str does not satisfy any of those requirements. Per the documentation for code objects:

Code objects represent byte-compiled executable Python code, or bytecode. [...]

But what surprised me even more was the bytecode that dis() generated. What does this bytecode mean? I decided to check what each opcode meant by looking at 32.12.1. Python Bytecode Instructions on the documentation page for dis. This, however, confused me even more. Some opcodes were not even documented (<49>).

What exactly is Python trying to do with the string I passed in? Is it treating my string as literal source code? Or is it attempting to construct a string from what I've passed in?

I would assume my former guess to be correct, after looking over the source code for dis.disassemble_string() (which is what dis.dis() calls when a str object is passed in). But if that is true, then why does the bytecode look so strange? If I pass in a function with the same expression, the bytecode makes perfect sense:

>>> def func():
...     1 + 2
... 
>>> dis.dis(func)
  2           0 LOAD_CONST               3 (3)
              3 POP_TOP             
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE        
>>> 
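For comparison, here is a minimal sketch that compiles the same string into a code object explicitly before disassembling it, since a code object is one of the input types the documentation lists. The filename '<string>' and the 'eval' mode are illustrative choices, not something taken from the dis source, and the exact listing printed depends on the interpreter version.

```python
import dis

source = '1 + 2'

# compile() turns source text into a code object.
# '<string>' is a placeholder filename; 'eval' mode is used because
# the source is a single expression (an assumption for this sketch).
code_obj = compile(source, '<string>', 'eval')

# dis.dis() is documented to accept code objects, so this call does
# not rely on any special handling of str arguments.
dis.dis(code_obj)
```

If this listing resembles the one for func() rather than the one for the bare string, that would suggest the string was not being compiled as source in the earlier call.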

This behavior doesn't seem to be limited to expressions, though. I tried several other statements, and they all generated the same weird-looking bytecode:

>>> dis.dis('foo = 10')
          0 BUILD_TUPLE     28527
          3 SLICE+2        
          4 DELETE_SUBSCR  
          5 SLICE+2        
          6 <49>           
          7 <48>           
>>> dis.dis('print 0')
          0 JUMP_IF_TRUE_OR_POP 26994
          3 JUMP_FORWARD     8308 (to 8314)
          6 <48>           
>>> # etc...
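One check that may help interpret these listings is to print the numeric value of each character in the string next to its offset. The helper name char_codes below is hypothetical and only for illustration. This is a sketch for gathering evidence, not a confirmed explanation of what dis.dis() does internally.

```python
def char_codes(text):
    """Return (offset, character, ordinal) for each character in text."""
    return [(i, ch, ord(ch)) for i, ch in enumerate(text)]


# Print one line per character, mirroring the one-line-per-offset
# layout of the dis output above.
for offset, ch, code in char_codes('1 + 2'):
    print('%d %r %d' % (offset, ch, code))
```

The first character, '1', has the value 49, which matches the undocumented <49> at offset 0, and the five offsets 0 to 4 line up with the five characters of '1 + 2'. That would be consistent with the string being read as raw bytes rather than compiled as source, though the documentation quoted above does not confirm it.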

Is this behavior documented somewhere? I looked over the entire documentation page for the dis module, but I didn't find any relevant information.
