valgrind: obtain address of uninitialized memory

I'm debugging a problem which occurs only in the PPC64 port of my program.

I have a test case where the C library qsort is given a libffi-generated closure as a string comparison callback. Strings are passed to the callback properly, and the return value is stored precisely into the return value buffer passed by libffi to the closure function.

However, the array is not correctly sorted by qsort. Moreover, Valgrind reports that the C library qsort code is accessing uninitialized memory, and --track-origins=yes reveals that this memory was stack-allocated by libffi. I strongly suspect that this is the return value, and so the sorting isn't correct due to garbage comparisons.

That is, libffi allocated the buffer for the return value and propagates its contents to the callback's caller; but my closure dispatch function is being given the wrong pointer, and so is not putting the return value in the right place.

For some bizarre reason, Valgrind doesn't report the address of the uninitialized memory, only where in the code the use occurred and where it was allocated.

I simply want to compare the address of that location to the pointer that is passed to the closure function: are they even remotely close?

Is there some way to get that information out of Valgrind?


UPDATE: I'm working on a GCC Compile Farm machine where I don't have root; the installed libffi has no debug info. It is version 3.0.13.

However, the issue reproduces with the libffi git head which I just built.

I have confirmed it is the return value area that is uninitialized.

I added an instruction to the closure dispatching assembly code ffi_closure_LINUX64 to initialize a double-word-sized area at the bottom of the RETVAL part of the closure dispatch stack frame. This makes the Valgrind error go away; but of course the return value is garbage. It also confirms a basic piece of sanity: that the code before the call to the closure dispatch helper and the code after are referring to the same area for the return value. (The stack pointer hasn't moved unexpectedly and the frame references are correct.) Just whatever address the user code is ultimately getting isn't pointing to that return value.

Next, I moved the initialization of the return area down into the C function called ffi_closure_helper_LINUX64, near the entry into the function. This also still makes the uninitialized error go away, confirming the helper is getting the correct return value area address through %r6 (argument 4).



Valgrind has no feature to report the address of the uninitialised memory, because in most cases this would not help the user: a bare stack or heap address does not indicate much.

You might get more information by setting a breakpoint in the frame reported by Valgrind and marking various pieces of the stack as initialised, using gdb+vgdb+memcheck monitor commands. Once the faulty location is marked as initialised, Valgrind should no longer report the error. You might need several runs, each time marking a different variable or zone of the stack.

See http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.monitor-commands and the GDB user manual for how to write (sophisticated) commands that run when a breakpoint is reached.
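
If rebuilding the code is an option, the same probing can be done from inside the program with Memcheck client requests. The sketch below assumes the Valgrind development header valgrind/memcheck.h is installed (it falls back to a no-op when the header is missing); the helper name first_undefined_byte is hypothetical. VALGRIND_CHECK_MEM_IS_DEFINED makes Memcheck print an error and return the address of the first undefined byte in the given range; it returns 0 when the whole range is defined or when the program is not running under Valgrind. It only answers for a range you already suspect, so it complements the approach above rather than replacing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use the real client-request macro when the header exists (assumption:
   Valgrind's development headers are installed); otherwise a no-op. */
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#  endif
#endif
#ifndef VALGRIND_CHECK_MEM_IS_DEFINED
#  define VALGRIND_CHECK_MEM_IS_DEFINED(addr, len) ((void)(addr), (void)(len), 0)
#endif

/* Hypothetical helper: returns the address of the first byte in
   [buf, buf + len) that Memcheck considers undefined, or NULL if every
   byte is defined (or if the program is not running under Valgrind). */
const void *first_undefined_byte(const void *buf, size_t len)
{
    assert(buf != NULL);
    uintptr_t bad = (uintptr_t)VALGRIND_CHECK_MEM_IS_DEFINED(buf, len);
    return bad ? (const void *)bad : NULL;
}
```

For example, in a locally built libffi, calling it on the 8-byte return area right after the user callback returns would report the first byte the callback left unwritten, and that address can then be compared with the pointer the callback received.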


"For some bizarre reason, Valgrind doesn't report the address of the uninitialized memory, only where in the code the use occurred and where it was allocated."

This is documented behavior of Valgrind's Memcheck tool; see the part of the manual about --track-origins=yes:

For uninitialised values originating from a stack allocation, Memcheck can tell you which function allocated the value, but no more than that -- typically it shows you the source location of the opening brace of the function. So you should carefully check that all of the function's local variables are initialised properly.


Okay, I debugged the problem.

The issue is that the PPC64 code in LibFFI contains cases for big endian which don't match my expectations.

I applied this test patch:

--- a/src/powerpc/linux64_closure.S
+++ b/src/powerpc/linux64_closure.S
@@ -27,7 +27,8 @@
 #define LIBFFI_ASM
 #include <fficonfig.h>
 #include <ffi.h>
-
+#undef __LITTLE_ENDIAN__
+#define __LITTLE_ENDIAN__ 1
        .file   "linux64_closure.S"

 #ifdef POWERPC64

and all my tests pass. What __LITTLE_ENDIAN__ controls is conditionally included code blocks like this:

# case FFI_TYPE_INT
# ifdef __LITTLE_ENDIAN__
        lwa %r3, RETVAL+0(%r1)
# else
        lwa %r3, RETVAL+4(%r1)
# endif
        mtlr %r0
        addi %r1, %r1, STACKFRAME
        .cfi_def_cfa_offset 0
        blr
        .cfi_def_cfa_offset STACKFRAME

The client code, on big endian, is expected to displace the return value being stored, so that it is aligned with the top of an 8 byte word.

So to store an int (four bytes), the code is expected to do *(int *)(retptr+4) = val and not simply *(int *)retptr = val as my code is doing.

It seems that the expectation is that the application is supposed to store an 8 byte word into the return value regardless of the FFI type: be it a char, short, int or (64 bit) long. That is to say:

*(int64_t *)retptr = val; /* val is char, short, whatever */

This way the least significant byte of the value is at retptr + 7, and so that address is used if the actual type is char; retptr + 6 is used if it is short and so on. The FFI code makes sense this way. The problem is that it is inconvenient and inconsistent; the FFI arguments are not required to be treated that way.

For instance, the int argument in the following call isn't displaced by 4 bytes; it is just written to the base address of the buffer given to libffi:

This is the TXR Lisp interactive listener of TXR 176.
Use the :quit command or type Ctrl-D on empty line to exit.
1> (with-dyn-lib nil (deffi printf "printf" int (str : int)))
#:lib-0185
2> (printf "foo %d\n" 1)
foo 1 
0

But, oh look; the return value is bogus! Foreign function call return values have a similar problem.
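
The displacement rule described above can be checked in plain C without libffi. This is only a sketch under stated assumptions: int is 4 bytes, the 8-byte buffer stands in for the return-value slot, and all function names are hypothetical. It contrasts storing an int at the base of the slot with storing a full sign-extended 8-byte word, and reads the int back at the offset the PPC64 closure code uses (RETVAL+4 on big endian, RETVAL+0 on little endian).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host, 0 on little-endian (runtime probe). */
int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Byte offset, within an 8-byte slot, of the low 32 bits of a widened value. */
size_t int_offset_in_slot(void)
{
    return host_is_big_endian() ? 4 : 0;
}

/* What the answer says its code was doing: a 4-byte store at the slot base. */
void store_narrow(void *slot, int val)
{
    assert(slot != NULL);
    memcpy(slot, &val, sizeof val);
}

/* What the PPC64 closure code expects: a full sign-extended 8-byte word. */
void store_widened(void *slot, int val)
{
    assert(slot != NULL);
    const int64_t wide = val;
    memcpy(slot, &wide, sizeof wide);
}

/* Read a 4-byte int at a byte offset, in the spirit of `lwa RETVAL+off`. */
int load_int_at(const void *slot, size_t off)
{
    int val;
    memcpy(&val, (const unsigned char *)slot + off, sizeof val);
    return val;
}
```

After store_widened, load_int_at(slot, int_offset_in_slot()) returns the stored value on either byte order. After store_narrow on a big-endian host, the same read returns whatever bytes were in the upper half of the slot beforehand, which is consistent with the garbage comparisons seen by qsort.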

It looks like I was fooled by an example in some libffi documentation, namely this one:

 #include <stdio.h>
 #include <ffi.h>

 int main()
 {
   ffi_cif cif;
   ffi_type *args[1];
   void *values[1];
   char *s;
   int rc;

   /* ... abbreviated ... */
       s = "This is cool!";
       ffi_call(&cif, puts, &rc, values);
       /* rc now holds the result of the call to puts */

   /* ... */
 }

Turns out, this is not correct; some other libffi documentation says that return values must be captured using the type ffi_arg (which, confusingly, is not used for arguments). So the above sample should, I think, be doing something like this:

ffi_arg rc_buf;
int rc;
/*...*/
s = "Turned out uncool, but we promise this is really cool now!";
ffi_call(&cif, puts, &rc_buf, values);
rc = (int) rc_buf;
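
The same rule appears to apply on the closure side, which is where the original qsort failure came from. Below is a minimal sketch of a closure handler for a string-comparison callback, assuming the usual libffi handler signature (cif, ret, args, user_data). The name str_cmp_handler is hypothetical, and the stand-in typedefs exist only so that the fragment compiles where ffi.h is not installed; they are not libffi's real definitions. Registering the handler through libffi's closure API is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__has_include)
#  if __has_include(<ffi.h>)
#    include <ffi.h>
#    define SKETCH_HAVE_FFI_H 1
#  endif
#endif
#ifndef SKETCH_HAVE_FFI_H
/* Stand-ins (assumption) so the sketch compiles without libffi. */
typedef unsigned long ffi_arg;
typedef struct sketch_cif ffi_cif;
#endif

/* Hypothetical closure handler for a qsort-style comparator over an
   array of C strings: int cmp(const void *, const void *). */
void str_cmp_handler(ffi_cif *cif, void *ret, void **args, void *user_data)
{
    (void)cif;
    (void)user_data;
    assert(ret != NULL && args != NULL);

    /* Each args[i] points at one argument value; qsort passes pointers
       to the array elements, which are themselves char pointers. */
    const void *elem_a = *(const void **)args[0];
    const void *elem_b = *(const void **)args[1];
    const char *a = *(const char *const *)elem_a;
    const char *b = *(const char *const *)elem_b;

    int result = strcmp(a, b);

    /* Store a full ffi_arg rather than *(int *)ret = result, so the low
       bytes land where the big-endian PPC64 code reads them. Converting
       a negative int to the unsigned ffi_arg keeps the sign-extended bits. */
    *(ffi_arg *)ret = (ffi_arg)result;
}
```

A caller that invokes the handler directly can recover the comparison result with (int) on the ffi_arg buffer, the same conversion used for rc_buf above.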