I am trying to profile an x86 assembly program on Ubuntu 12.04 and would like to use the rdtsc instruction. According to a comment, the number of cycles should end up in rdx, but with the following code I get a number that is far too high:
SECTION .bss
SECTION .data
SECTION .text
global main
main:
nop
cpuid
rdtsc
shl rdx, 32
or rdx, rax
mov r8, rdx
xor esi,esi
mov esi,19 ; instructions to be monitored
cpuid
rdtsc
shl rdx, 32
or rdx, rax
sub rdx, r8
Running it in a debugger, I get the following register values after the sub instruction:
rax 0xd88102bc
rbx 0x0
rcx 0xf0
rdx 0x44f3914a0
rsi 0x13
rdi 0x1
rbp 0x0
rsp 0x7fffffffdf38
r8 0x11828947ee1c
I can't figure out why the number of cycles in rdx is so high for such simple instructions. Is the right number in rcx? Isn't that too high as well?
Thanks in advance
I'm not sure what's happening, but when you call C functions from assembler, some platforms require a leading underscore on the name, for example call _clock. This is because the C compiler on those platforms (macOS and 32-bit Windows, for instance) adds this prefix to every function it generates; on Linux ELF targets such as Ubuntu, no prefix is added.
Additionally, as you're on a 64-bit architecture, the 64-bit result should end up in rax. You should make sure you're looking at that, not eax and ebx.
Finally, rather than using clock, I'd suggest the assembler instruction rdtsc. It returns a 64-bit result in edx:eax. It's relative rather than absolute, and is measured in cycles rather than fractions of a second, but it should be exactly what you need for profiling.
Example:
This will leave the number of ticks that elapsed in rdx. The cpuid instructions are there to prevent the processor from reordering instructions around the profiling points.
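The measurement pattern described above can be sketched as a complete routine. This is only a sketch under stated assumptions: x86-64 Linux, NASM syntax, and linking through gcc so that main is the entry point. The file name tsc.asm and the build commands in the comments are illustrative, not taken from the original posts. It also saves rbx and loads eax before each cpuid, since cpuid reads eax as its leaf number and overwrites eax, ebx, ecx, and edx.

```nasm
; Sketch only: assumes x86-64 Linux, NASM syntax, linked with gcc.
; Assumed build: nasm -f elf64 tsc.asm -o tsc.o && gcc tsc.o -o tsc
        global  main
        section .text
main:
        push    rbx             ; cpuid overwrites rbx, which is callee-saved

        xor     eax, eax        ; select cpuid leaf 0
        cpuid                   ; serialize: earlier instructions complete first
        rdtsc                   ; edx:eax = time-stamp counter
        shl     rdx, 32
        or      rdx, rax        ; rdx = full 64-bit start value
        mov     r8, rdx         ; keep the start value (cpuid does not touch r8)

        mov     esi, 19         ; instruction(s) being measured

        xor     eax, eax        ; select cpuid leaf 0 again
        cpuid                   ; serialize before the second read
        rdtsc
        shl     rdx, 32
        or      rdx, rax        ; rdx = full 64-bit end value
        sub     rdx, r8         ; rdx = elapsed ticks, cpuid overhead included

        xor     eax, eax        ; return 0 from main; the count stays in rdx
        pop     rbx
        ret
```

The time-stamp counter keeps running while a debugger has the program stopped, so single-stepping between the two rdtsc reads inflates the difference. To read rdx, it is safer to run straight to a breakpoint placed after the sub instruction.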