LLVM Trampoline causing SIGSEGV?

After reading up on generating closures in LLVM using trampolines, I tried compiling some of the trampoline examples that are floating around the internet (specifically this one). The LLVM IR given in the gist is as follows:

declare void @llvm.init.trampoline(i8*, i8*, i8*);
declare i8* @llvm.adjust.trampoline(i8*);

define i32 @foo(i32* nest %ptr, i32 %val) {
    %x = load i32* %ptr
    %sum = add i32 %x, %val
    ret i32 %sum
}

define i32 @main(i32, i8**) {
    %closure = alloca i32
    store i32 13, i32* %closure
    %closure_ptr = bitcast i32* %closure to i8*

    %tramp_buf = alloca [32 x i8], align 4
    %tramp_ptr = getelementptr [32 x i8]* %tramp_buf, i32 0, i32 0
    call void @llvm.init.trampoline(
            i8* %tramp_ptr,
            i8* bitcast (i32 (i32*, i32)* @foo to i8*),
            i8* %closure_ptr)
    %ptr = call i8* @llvm.adjust.trampoline(i8* %tramp_ptr)
    %fp = bitcast i8* %ptr to i32(i32)*
    %res = call i32 %fp (i32 13)

    ret i32 %res
}

Compiling this with clang trampolines.ll and executing the result, however, ends in a SIGSEGV (the exact error that fish gives is fish: Job 1, './a.out ' terminated by signal SIGSEGV (Address boundary error)).

After some testing, it turned out that the call to the "trampolined" function is the instruction causing the SIGSEGV: with that call commented out (and a dummy value returned), the program runs fine.

The problem does not seem to lie with the clang driver, because manually running llvm-as, llc and the like produces the same crash. Compiling on another machine fails in the same way. This leads me to believe that either my setup or LLVM is doing something wrong.

My clang version:

Apple LLVM version 6.1.0 (clang-602.0.49) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.3.0
Thread model: posix

There are 3 best solutions below

Best answer

Alright, more than a year later, and with the help of @user855, I finally have a working example.

As user855 noted in the comments, the code fails because the memory used to store the trampoline is not executable. This can be circumvented by using mmap to allocate executable memory instead (note that this is not memory on the stack, as opposed to before).

The code:

declare void @llvm.init.trampoline(i8*, i8*, i8*)
declare i8* @llvm.adjust.trampoline(i8*)
declare i8* @"\01_mmap"(i8*, i64, i32, i32, i32, i64)

define i32 @foo(i32* nest %ptr, i32 %val) {
    %x = load i32, i32* %ptr
    %sum = add i32 %x, %val
    ret i32 %sum
}

define i32 @main(i32, i8**) {
    %closure = alloca i32
    store i32 13, i32* %closure
    %closure_ptr = bitcast i32* %closure to i8*

    %mmap_ptr = call i8* @"\01_mmap"(i8* null, i64 72, i32 7, i32 4098, i32 0, i64 0)

    call void @llvm.init.trampoline(
            i8* %mmap_ptr,
            i8* bitcast (i32 (i32*, i32)* @foo to i8*),
            i8* %closure_ptr)

    %ptr = call i8* @llvm.adjust.trampoline(i8* %mmap_ptr)
    %fp = bitcast i8* %ptr to i32(i32)*
    %res = call i32 %fp (i32 13)

    ret i32 %res
}

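The mmap call corresponds to mmap(NULL, 72, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0). The integers 7 and 4098 are what those constants expand to on my platform; other systems may define the MAP_* flags differently. The function name "\01_mmap" is likewise specific to my platform and may differ on yours. To check both, compile some C code that calls mmap using clang -S -emit-llvm and look at the generated mmap call.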

Also note that the trampoline memory is no longer on the stack, so it has to be released explicitly once it is no longer needed, using munmap(ptr, 72).
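For reference, here is a minimal C sketch of the same allocate/release pair, which can also be used to look up the integer values to hard-code in the IR. The helper names (trampoline_prot, trampoline_flags, trampoline_alloc, trampoline_free) are made up for illustration. Unlike the IR above, the sketch passes -1 as the file descriptor, which is the conventional value for anonymous mappings. Some hardened systems refuse writable and executable mappings, in which case trampoline_alloc returns NULL.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Some platforms spell the flag MAP_ANON instead of MAP_ANONYMOUS. */
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Hypothetical helper: the integer to pass as the `prot` argument in the IR. */
int trampoline_prot(void) {
    return PROT_READ | PROT_WRITE | PROT_EXEC;
}

/* Hypothetical helper: the integer to pass as the `flags` argument in the IR.
 * The value is platform-specific, so print it on the target system. */
int trampoline_flags(void) {
    return MAP_ANONYMOUS | MAP_PRIVATE;
}

/* Map `size` bytes of readable, writable and executable memory.
 * Returns NULL if the mapping is refused. */
void *trampoline_alloc(size_t size) {
    void *p = mmap(NULL, size, trampoline_prot(), trampoline_flags(), -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release a buffer obtained from trampoline_alloc. Returns 0 on success. */
int trampoline_free(void *buf, size_t size) {
    assert(buf != NULL);
    return munmap(buf, size);
}
```

Printing trampoline_prot() and trampoline_flags() with printf on the target machine gives the two integers for the mmap call in the IR, and trampoline_free mirrors the munmap(ptr, 72) call mentioned above.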

Another answer

This is completely expected. The LLVM trampoline intrinsics are not really for random frontend use.

The tramp argument must point to a sufficiently large and sufficiently aligned block of memory; this memory is written to by the intrinsic. Note that the size and the alignment are target-specific - LLVM currently provides no portable way of determining them, so a front-end that generates this intrinsic needs to have some target-specific knowledge.

This basically implies that you have no way of writing a use of the trampoline intrinsics that is guaranteed to work everywhere. You can't just take a random sample from the Internet. You need in-depth knowledge of how trampolines are implemented for your specific target.

That sample does not even say what target it's supposed to be for, let alone how things may have changed since whatever LLVM version it was written against, etc.

Another answer

Your observation that the memory was mapped without execute permission is indeed correct. I ran into a SIGSEGV caused by the same issue while migrating from LLVM 3.5 to LLVM 3.7.

Turns out, LLVM JIT's SectionMemoryManager strategy is to initially mmap regions in allocateCodeSection() with MF_READ and MF_WRITE permissions. The MF_EXEC is applied later (jointly with revoking MF_WRITE) in SectionMemoryManager::finalizeMemory(), right at the point when the concerned function is about to be called.

So finalizeMemory() seals the region for execution only when it is actually required, I guess for security reasons. The user is not supposed to seal the memory themselves; instead, sealing is triggered by MCJIT::getFunctionAddress(), the function that should be used to get a function out of a JIT-ed module.
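The two-stage idea (map writable first, then swap write permission for execute) can be sketched in plain C with mmap and mprotect. This is only an illustration of the pattern, not LLVM's actual implementation; the names code_region_alloc and code_region_seal are made up for the example. The sketch never executes the copied bytes, since that would additionally require valid machine code for the target (and, on some architectures, an instruction cache flush).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Some platforms spell the flag MAP_ANON instead of MAP_ANONYMOUS. */
#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Stage 1 (hypothetical helper): map a readable and writable region
 * that is not yet executable. Returns NULL on failure. */
void *code_region_alloc(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Stage 2 (hypothetical helper): copy the bytes in, then seal the region
 * by revoking write permission and granting execute permission.
 * `region` must be the page-aligned pointer returned by code_region_alloc.
 * Returns 0 on success, -1 if mprotect fails. */
int code_region_seal(void *region, size_t size,
                     const void *code, size_t code_len) {
    assert(code_len <= size);
    memcpy(region, code, code_len);
    return mprotect(region, size, PROT_READ | PROT_EXEC);
}
```

Calling into the region before the second stage has run faults in the same way as the trampoline on the non-executable stack did, which matches the crash described here.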

The problem with the LLVM 3.5 -> 3.7 upgrade is that code written for 3.5 often still calls MCJIT::getPointerToFunction() instead of MCJIT::getFunctionAddress(), and getPointerToFunction() does NOT perform the finalizeMemory() step described above.

Solution: replace all calls to MCJIT::getPointerToFunction() with MCJIT::getFunctionAddress(), as suggested by the deprecation comment in the MCJIT source (lib/ExecutionEngine/MCJIT/MCJIT.cpp) below:

uint64_t MCJIT::getFunctionAddress(const std::string &Name) {
  MutexGuard locked(lock);
  uint64_t Result = getSymbolAddress(Name, true);
  if (Result != 0)
    finalizeLoadedModules();
  return Result;
}

// Deprecated.  Use getFunctionAddress instead.
void *MCJIT::getPointerToFunction(Function *F) {
...