How to use memory allocated by C malloc in Python (ctypes)

I expected the following code to work, but it does not. Could someone explain how to use this kind of pointer? Of course, this is only sample code. What I actually need is to use a pointer that comes from an external library (it has its own allocator 'library_allocete_memory', which internally uses malloc). I need to copy data into that memory from Python code.

import ctypes
libc = ctypes.cdll.msvcrt
data = b'test data'
size = len(data)
ptr = libc.malloc(size)
ptr = ctypes.cast(ptr, ctypes.c_char_p)
# ptr = ctypes.create_string_buffer(size) # this is working, but I cannot allocate memory this way in my case
ctypes.memmove(ptr, data, size) # access violation is thrown here

CristiFati On

There are 2 things here that (might) generate Undefined Behavior:

  1. According to [Python.Docs]: ctypes - Loading dynamic link libraries (emphasis is mine):

    Note:      Accessing the standard C library through cdll.msvcrt will use an outdated version of the library that may be incompatible with the one being used by Python. Where possible, use native Python functionality, or else import and use the msvcrt module.

    Also mentioning [Python.Docs]: msvcrt - Useful routines from the MS VC++ runtime

  2. [SO]: C function called from Python via ctypes returns incorrect value (@CristiFati's answer). Not specifying argtypes and restype is a common pitfall when working with CTypes (calling functions)

Here's a piece of code that works (in a slightly different scenario).

code00.py:

#!/usr/bin/env python

import ctypes as cts
import sys
from ctypes.util import find_msvcrt


def main(*argv):
    data = b"test data"
    size = len(data)
    ptr = None
    use_msvcrt = bool(argv)  # If an argument (any value) is given
    if use_msvcrt:
        print(f"UCRT (found by CTypes): {find_msvcrt()}")
        size += 1  # + 1 for the NUL terminating char (whoever uses the buffer might rely on it)
        procexp = 0  # Enable this to inspect the (current) Python process (loaded .dlls) via Process Explorer tool
        if procexp:
            input("Press <Enter> to load UCRT...")
        libc = cts.cdll.msvcrt
        if procexp:
            input("UCRT loaded, press <Enter> to move forward...")
        malloc = libc.malloc
        malloc.argtypes = (cts.c_size_t,)
        malloc.restype = cts.POINTER(cts.c_char)  # Rigorously should be cts.c_void_p, but choosing this form for convenience
        free = libc.free
        free.argtypes = (cts.c_void_p,)
        free.restype = None
        ptr = libc.malloc(size)
    else:
        ptr = cts.create_string_buffer(size)
    if ptr is None:
        print("Uninitialized pointer")
        return -1
    if use_msvcrt and not ptr:  # Check for NULL before writing to the buffer
        print("NULL pointer")
        return -2
    print(f"Ptr: {ptr} ({type(ptr)})")
    cts.memmove(ptr, data, size)
    if use_msvcrt:
        print(f"Ptr data: {b''.join(ptr[i] for i in range(size - 1))}")
        free(ptr)
    else:
        print(f"Ptr data: {bytes(ptr)}")


if __name__ == "__main__":
    print(
        "Python {:s} {:03d}bit on {:s}\n".format(
            " ".join(elem.strip() for elem in sys.version.split("\n")),
            64 if sys.maxsize > 0x100000000 else 32,
            sys.platform,
        )
    )
    rc = main(*sys.argv[1:])
    print("\nDone.\n")
    sys.exit(rc)

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackExchange\StackOverflow\q078205030]> sopr.bat
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[prompt]>
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test0\Scripts\python.exe" ./code00.py
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] 064bit on win32

Ptr: <ctypes.c_char_Array_9 object at 0x00000239067AB6C0> (<class 'ctypes.c_char_Array_9'>)
Ptr data: b'test data'

Done.


[prompt]>
[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.10_test0\Scripts\python.exe" ./code00.py dummyarg
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] 064bit on win32

UCRT (found by CTypes): None
Ptr: <ctypes.LP_c_char object at 0x000002143B88B6C0> (<class 'ctypes.LP_c_char'>)
Ptr data: b'test data'

Done.

Notes:

amirhm On

The code crashes because malloc, called like that, does not return a correct address. You can check this as follows:

import ctypes
import numpy as np
libc = ctypes.CDLL(None)
data = b'test data'
size = len(data)

## malloc does not return the correct address
ptr = libc.malloc(size)
print(f"malloc return address (not good) 0x{ptr:x}")
ptr = ctypes.cast(ptr, ctypes.c_char_p)

If you look at the printed address, you will see that it is not a valid heap address, as would be expected. You could allocate on the heap some other way, though. Here I do it with numpy and successfully copy into the allocated space:

buffer = np.zeros(size, dtype=np.uint8)
print(f"numpy allocate return address (good) 0x{buffer.ctypes.data:x}")
ptr = ctypes.cast(buffer.ctypes.data, ctypes.c_char_p)
 
ctypes.memmove(ptr, data, size) # successfully copies, since dest adr is valid
print(data)
print(buffer.view("c"))

UPDATE:

The return type of malloc was not declared, so ctypes assumed its default, a 32-bit c_int, and the pointer was truncated. You can fix this by setting the correct return type:

libc.malloc.restype = ctypes.c_void_p

Now malloc returns the correct pointer as well:

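libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = (ctypes.c_size_t,)
libc.free.argtypes = (ctypes.c_void_p,)
libc.free.restype = None
ptr = libc.malloc(size)
ctypes.memmove(ptr, data, size)
# The buffer has no NUL terminator, so read exactly size bytes
print(f"{data=} {ctypes.string_at(ptr, size)=}")
libc.free(ptr)  # free only after the last access to the memory

Mark Tolonen On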

In your current example, libc.malloc.restype = ctypes.c_void_p is needed to make sure the return type is pointer-sized and not default c_int-sized (32-bit). On a 64-bit OS the returned 64-bit pointer will be truncated to 32-bit without it. libc.malloc.argtypes = ctypes.c_size_t, is also a good idea. Best to always declare .argtypes and .restype for functions called with ctypes.
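The truncation can be illustrated without calling malloc at all. The following is only a sketch: the address is a made-up value, and it assumes a platform where c_int is 32 bits wide.

```python
import ctypes as ct

# Hypothetical 64-bit heap address, chosen only for illustration.
address = 0x00007FF812345678

# ctypes integer types do no overflow checking, so storing the address in a
# c_int keeps only the low bits, which is what the default restype does.
as_default_int = ct.c_int(address).value

# c_void_p is pointer-sized; on a 64-bit Python it preserves the full value.
as_void_p = ct.c_void_p(address).value

print(hex(as_default_int))
print(hex(as_void_p))
```

Casting the truncated value back to a pointer gives an address that was never allocated, which is why the later memmove faults.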

Working example:

import ctypes as ct

libc = ct.CDLL('msvcrt')
libc.malloc.argtypes = ct.c_size_t,
libc.malloc.restype = ct.c_void_p
libc.free.argtypes = ct.c_void_p,
libc.free.restype = None

data = b'test data'
size = len(data)
ptr = libc.malloc(size)
ct.memmove(ptr, data, size)
print(ct.string_at(ptr, size))
libc.free(ptr)

Note also that msvcrt is a C runtime DLL, but is not usually the one linked to by apps. Memory allocated by a C runtime must be released (freed) by that same runtime. As the OP mentioned:

What I need to solve is to use a pointer that comes from an external library (it has its own allocator 'library_allocete_memory'(sic) which is internally using malloc.

Make sure to only access that "pointer...from an external library" and send it back to that library to free it ('library_free_memory'??).
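One way to keep the allocate/free pair together is a small context manager. This is a rough sketch, not the library's API: the C runtime's malloc and free stand in for the external library's own functions, and library_buffer and roundtrip are hypothetical helper names.

```python
import contextlib
import ctypes as ct
import sys

# Stand-in for the external library (assumption): the C runtime's malloc/free
# play the role of the library's own allocate/free functions.
_lib = ct.CDLL("msvcrt") if sys.platform == "win32" else ct.CDLL(None)
_lib.malloc.argtypes = (ct.c_size_t,)
_lib.malloc.restype = ct.c_void_p
_lib.free.argtypes = (ct.c_void_p,)
_lib.free.restype = None


@contextlib.contextmanager
def library_buffer(size):
    """Allocate size bytes and always release them with the matching free."""
    ptr = _lib.malloc(size)
    if not ptr:  # a c_void_p restype yields None for a NULL pointer
        raise MemoryError(f"allocation of {size} bytes failed")
    try:
        yield ptr
    finally:
        _lib.free(ptr)


def roundtrip(data):
    """Copy data into the allocated memory and read it back."""
    with library_buffer(len(data)) as ptr:
        ct.memmove(ptr, data, len(data))
        return ct.string_at(ptr, len(data))


print(roundtrip(b"test data"))
```

With the real library, the two function objects would be replaced by its allocator and its matching release function, declared the same way.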

For a callback, be careful not to use ctypes.c_char_p or ctypes.c_wchar_p for the pointer type. Those are special ctypes types that convert the value to a Python bytes or str object, respectively, and the original pointer address is lost. Use a ctypes.c_void_p (which you may have to cast depending on the data written to it) or a ctypes.POINTER type instead, and then the pointer is usable by ctypes.memmove or any other method of writing to a pointer.
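The difference can be seen with two small callbacks. This is a minimal sketch in which a create_string_buffer stands in for memory owned by the library, and the callback names are made up for illustration.

```python
import ctypes as ct

# Writable memory standing in for a buffer handed over by the library.
buf = ct.create_string_buffer(16)
seen = {}


@ct.CFUNCTYPE(None, ct.c_char_p)
def cb_char_p(p):
    # p arrives as a Python bytes copy; the address is gone.
    seen["char_p"] = p


@ct.CFUNCTYPE(None, ct.c_void_p)
def cb_void_p(p):
    # p arrives as an integer address that memmove can write to.
    seen["void_p"] = p
    ct.memmove(p, b"test data", 9)


cb_char_p(buf)
cb_void_p(buf)

print(type(seen["char_p"]), type(seen["void_p"]))
print(buf.value)
```

Only the c_void_p version can modify the original buffer, because it still knows where the memory is.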

masaj On

Thank you all! What was missing was just setting the return type for malloc (or webui_malloc in my case) via the restype attribute:

libc.malloc.restype = ct.c_void_p