Marking segments of generated SPARC assembly code for inspection


Does anybody know how to insert recognizable code sequences using the Sun Studio compiler, without horribly messing up optimization?

I'd like to see what the Sun Studio (12.1) compiler does with a bit of code in a number of instances, and was looking for a way to mark the generated code with a recognizable sequence of no-op instructions so I could find my fragments. My first try used:

asm volatile ("nop ; nop ; nop ") ;
// ... <stuff I want to look at here> ...
asm volatile ("nop ; nop ; nop ; nop ; nop") ;

However, when I use this, the compiler generates unoptimized-looking code between the nop blocks. Example:

nop
nop
nop
ld        [%sp + 0x8bf], %g2
srl       %g2, 0x0, %g3
sllx      %g3, 0x2, %g4
ld        [%sp + 0x8c3], %g5
ldx       [%sp + 0x8c7], %o2
st        %g5, [%o2 + %g4]
ld        [%sp + 0x8b7], %o3
ldx       [%sp + 0x8c7], %o4
st        %o3, [%o4 + 0x28]
nop
nop
nop
nop
nop

The code in question is just two stores. I don't really know SPARC assembly, but this looks like the compiler has completely given up on optimizing the code between the nop blocks. Why, for example, would it generate a new load, the ldx [%sp + 0x8c7], %o4, reloading the base address for the second store when it already had that value in %o2?
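For readers matching the listing to source, here is a rough sketch of the kind of C++ that would produce those two stores. It is a reconstruction from the disassembly only: the function name, parameter names, and types are assumptions, not the original code. The shape is inferred from the 64-bit ldx of the base pointer, the srl by 0 (zero-extension of a 32-bit unsigned index), the sllx by 2 (scaling by a 4-byte element), and the 0x28 offset (element 10 of a 4-byte array).

```cpp
#include <cassert>

// Hypothetical reconstruction: every name and type here is a guess
// made from the disassembly, not taken from the original source.
void two_stores(int *p, unsigned i, int a, int b)
{
    p[i]  = a;   // ld i; srl (zero-extend); sllx by 2; ld a; ldx p; st
    p[10] = b;   // ld b; ldx p again from the stack; st at p + 0x28
}

// Small self-check so the sketch can be compiled and exercised on its own.
bool two_stores_selfcheck()
{
    int buf[16] = {0};
    two_stores(buf, 3u, 7, 9);
    assert(buf[3] == 7);
    assert(buf[10] == 9);
    return true;
}
```

In optimized code one would expect p to be loaded once and kept in a register for both stores; reloading every operand from its stack slot is what unoptimized output typically looks like.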

At a glance at the surrounding code, it may very well be unoptimized anywhere in the vicinity of the asm volatiles used.

I tried the following instead, creating a .il file with this inline asm:

.inline DO_Nop3,0
   nop
   nop
   nop
.end
.inline DO_Nop5,0
   nop
   nop
   nop
   nop
   nop
.end

with the following in my source:

extern "C" void DO_Nop3() ;
extern "C" void DO_Nop5() ;

Using this, I've got the opposite problem: the compiler is now too smart and eliminates my nop instructions completely. I'm guessing it looks at the side effects of the instructions in the .inline blocks, later decides (rightly) that they don't do anything, and tosses that bit of code.

Any better ways?

There is 1 solution below.

The problem is that the compiler is normally free to reorder instructions; the asm volatile blocks stop it from moving code across them, which can inhibit optimization.

Debugging symbols should give you a mapping from instruction addresses to source lines. I'm not aware of any good tools for conveniently reading DWARF 2 or stabs data, though.
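In the same spirit of relying on symbols rather than marker instructions, one option is to move the fragment into a small function with a distinctive, unmangled name and search the disassembly for that symbol. This is only a sketch under assumptions: the Record type, the MARK_two_stores name, and the stores themselves are hypothetical stand-ins for the real fragment.

```cpp
#include <cassert>

// Hypothetical stand-in for whatever data the real fragment touches.
struct Record { int slots[16]; };

// extern "C" keeps the symbol name unmangled, so it is easy to search for
// in the symbol table or in a disassembly listing.
extern "C" void MARK_two_stores(Record *r, unsigned i, int a, int b)
{
    r->slots[i]  = a;   // fragment under inspection, store 1
    r->slots[10] = b;   // fragment under inspection, store 2
}

// Small self-check so the sketch can be compiled and exercised on its own.
bool mark_selfcheck()
{
    Record r = {};
    MARK_two_stores(&r, 3u, 7, 9);
    assert(r.slots[3] == 7);
    assert(r.slots[10] == 9);
    return true;
}
```

The caveat is that the out-of-line copy is optimized in isolation. If the compiler inlines the call at the original site, the code generated there can differ from what appears under the symbol, so this shows how the fragment compiles on its own rather than in context.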