I'm writing an RPC library for AVR and need to pass a function address to some inline assembler code and call the function from within the assembler code. However the assembler complains when I try to call the function directly.
This minimal example test.cpp illustrates the issue (in the actual case I'm passing args and the function is an instantiation of a static member of a templated class):
void bar () {
    return;
}
void foo() {
    asm volatile (
        "call %0" "\n"
        :
        : "p" (bar)
    );
}
Compiling with avr-gcc -S test.cpp -o test.S -mmcu=atmega328p works fine but when I try to assemble with avr-gcc -c test.S -o test.o -mmcu=atmega328p avr-as complains:
test.c: Assembler messages:
test.c:38: Error: garbage at end of line
I have no idea why it writes "test.c", the file it is referring to is test.S, which contains this on line 38:
call gs(_Z3barv)
I have tried all even remotely sensible constraints on the parameter to the inline assembler that I could find here, but none of those I tried worked.
I imagine if the gs() part was removed, everything should work, but all constraints seem to add it. I have no idea what it does.
The odd thing is that doing an indirect call like this assembles just fine:
void bar () {
    return;
}
void foo() {
    asm volatile (
        "ldi r30, lo8(%0)" "\n"
        "ldi r31, hi8(%0)" "\n"
        "icall" "\n"
        :
        : "p" (bar)
    );
}
The assembler produced looks like this:
ldi r30, lo8(gs(_Z3barv))
ldi r31, hi8(gs(_Z3barv))
icall
And avr-as doesn't complain about any garbage.
There are several issues with the code:
Issue 1: Wrong Constraint
The correct constraint for a call target is "i", i.e. an immediate value that is known at link-time.
Issue 2: Wrong % print-modifier
In order to print an address suitable for a call, use %x which will print a plain symbol without gs(). Generating a linker stub at this place by means of gs() is not valid syntax, hence "garbage at end of line". Apart from that, as you are calling bar directly, there is no need for a linker stub (at least not for this kind of symbol usage).
Issue 3: call instruction might not be available
To factor out whether a device supports call or just rcall, there is %~ which prints a single "r" if just rcall is available, and nothing if call is available.
Issue 4: The Call might clobber Registers or have other Side-Effects
It's unlikely that the call has no effects on registers or on memory whatsoever. If your description of the inline asm does not match the side-effects of the code, it's likely that you will get wrong code sooner or later.
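Applied to the minimal example from the question, the four points above could look like the following sketch. It assumes that bar follows the regular avr-gcc calling convention, so the clobber list names the call-clobbered registers R18 to R27, R30 and R31 plus "memory". The counter bar_calls and the non-AVR fallback branch are hypothetical additions that only exist so that the snippet can also be compiled and checked on a host machine.

```cpp
#include <assert.h>   // only needed for assert() in a host-side check

// Hypothetical counter, added so that the call has an observable effect.
unsigned bar_calls = 0;

void bar ()
{
    bar_calls = bar_calls + 1;
}

void foo ()
{
#if defined (__AVR__)
    asm volatile (
        // %~ prints "r" on devices that only have rcall, nothing otherwise.
        // %x0 prints the plain symbol without gs().
        "%~call %x0" "\n"
        : /* no outputs */
        : "i" (bar)
        // Assumption: bar obeys the avr-gcc ABI, so it may change the
        // call-clobbered registers and memory.
        : "r18", "r19", "r20", "r21", "r22", "r23", "r24", "r25",
          "r26", "r27", "r30", "r31", "memory"
    );
#else
    // Host fallback so that the sketch builds without an AVR toolchain.
    bar ();
#endif
}
```

If the callee does not follow the ABI, the operand and clobber lists have to describe exactly what that function reads and changes instead.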
Taking it all together
Let's assume you have a function bar written in assembly that takes two 16-bit operands in R22 and R26, and computes a result in R22. This function does not obey the avr-gcc C/C++ calling convention, so inline assembly is one way to interface to such a function. For bar we cannot write a correct prototype anyway, so we just provide a prototype so that we can use the symbol bar. Register X has constraint "x", but R22 has no register constraint of its own, and therefore we have to use a local asm register:
Generated code for ATmega32 + optimization:
So what's that "generate stub" gs() thing?
Suppose the C/C++ code is taking the address of a function. The only sensible thing to do with it is to call that function, which will be an indirect call in general. Now an indirect call can target 64KiW = 128KiB at most, so that on devices with > 128KiB of code memory, special means must be taken to indirectly call a function beyond the 128KiB boundary. The AVR hardware features an SFR named EIND for that purpose, but problems using it are obvious. You'd have to set it prior to a call and then reset it somehow somewhere; all evil things would be necessary.
avr-gcc takes a different approach: For each such address taken, the compiler generates gs(func). This will just resolve to func if the address is in the 128KiB range. If not, gs() resolves to an address in section .trampolines which is located close to the beginning of flash, i.e. in the lower 128KiB. .trampolines contains a list of direct JMPs to targets beyond the 128KiB range.
Take for example the following C code:
The __asm is used to keep the compiler from optimizing the indirect call to a direct one. Then run
For the sake of brevity, we just define symbol far_func per command line. The assembly dump in main.s shows that far_func might require a linker stub:
The final executable listing in main.lst then shows that the stub is actually generated and used:
main loads Z=0x0072 which is a word address for byte address 0x00e4, i.e. the code is indirectly jumping to 0x00e4, and from there it jumps directly to 0x24680.
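The address arithmetic in this example can be double-checked with a few lines of host-side C++. This is only a sketch: the helper names word_to_byte_address and needs_trampoline are made up for illustration and are not part of avr-gcc or Binutils, and the second helper is a simplified model of the rule described above (targets at or beyond the 128KiB boundary are reached through .trampolines).

```cpp
#include <assert.h>
#include <stdint.h>

// Indirect calls and jumps take a word address in Z; one word is two bytes.
constexpr uint32_t word_to_byte_address (uint32_t word_address)
{
    return word_address * 2;
}

// The 16-bit Z register covers 64KiW, i.e. byte addresses below 128KiB.
constexpr uint32_t indirect_call_limit = word_to_byte_address (UINT32_C (0x10000));

// Simplified model (an assumption): a target needs a stub in .trampolines
// exactly when its byte address lies at or beyond the 128KiB boundary.
constexpr bool needs_trampoline (uint32_t byte_address)
{
    return byte_address >= indirect_call_limit;
}

// Z = 0x0072 from the listing is byte address 0x00e4, the stub location.
static_assert (word_to_byte_address (0x0072) == 0x00e4, "stub address");
// The stub itself lies in the low 128KiB, the final target 0x24680 does not.
static_assert (! needs_trampoline (0x00e4), "stub is directly reachable");
static_assert (needs_trampoline (0x24680), "far_func needs a stub");
```

The three static_asserts restate the numbers from the listing: the stub at 0x00e4 is reachable through Z, while 0x24680 is not and is therefore reached through the direct jump in the stub.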