CLinker.toCString replacement in Java 18


Java 16, as part of the incubating package jdk.incubator.foreign, provided a convenient way to convert Java Strings to C strings in an arbitrary Charset: MemorySegment CLinker.toCString(String str, Charset charset, NativeScope scope). That method was removed in Java 17. Is there currently a convenient way to convert a Java String to a C string in a chosen Charset?

Java 18 has void MemorySegment.setUtf8String(long offset, String str). However, that obviously only supports UTF-8.


There are 2 best solutions below


On JDK 18 I encode (s + "\0"), which typically appends a 1-, 2- or 4-byte null terminator to the end of the MemorySegment holding the C string, depending on the character set used:

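import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import jdk.incubator.foreign.MemorySegment;
import jdk.incubator.foreign.SegmentAllocator;
import jdk.incubator.foreign.ValueLayout;

static MemorySegment toCString(SegmentAllocator allocator, String s, Charset charset) {
    // UTF-8 has a dedicated allocator method that appends the terminator itself.
    if (StandardCharsets.UTF_8.equals(charset))
        return allocator.allocateUtf8String(s);

    // Otherwise encode with a trailing NUL so the terminator has the charset's own width.
    return allocator.allocateArray(ValueLayout.JAVA_BYTE, (s + "\0").getBytes(charset));
}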

On Windows, converting a Java String to a wide string is then: toCString(allocator, s, StandardCharsets.UTF_16LE)
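To see where the 1, 2 or 4 bytes come from, the following minimal sketch measures how many bytes a trailing "\0" adds for a few charsets. The class and method names are hypothetical, and it assumes the runtime provides UTF-32LE (OpenJDK does, but it is not one of the guaranteed StandardCharsets).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TerminatorWidth {
    /** Bytes added by a trailing "\0"; the subtraction cancels out any byte order mark. */
    static int terminatorWidth(Charset charset) {
        return "a\0".getBytes(charset).length - "a".getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16LE,
            Charset.forName("UTF-32LE") // assumption: available on this runtime
        };
        for (Charset cs : charsets) {
            System.out.println(cs + " terminator: " + terminatorWidth(cs) + " byte(s)");
        }
    }
}
```

Note that StandardCharsets.UTF_16 (without the LE/BE suffix) writes a byte order mark before the text, which native code expecting a plain wide string will not skip, so UTF_16LE is the safer choice for Windows APIs.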

Hopefully someone can offer a more efficient or more robust conversion. The above works in the round-trip tests I've done on a small group of character sets (Windows + WSL), but I'm not confident it is reliable in all situations.
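One robustness gap in the getBytes approach is that String.getBytes silently substitutes a replacement byte (for example '?') for characters the charset cannot represent, and an embedded "\0" in the input would cut the C string short. Below is a hedged sketch of a stricter encoder built on the standard CharsetEncoder with CodingErrorAction.REPORT. The class and method names are made up for illustration, and the resulting byte[] could be handed to allocateArray in place of the getBytes result.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictCString {
    /** Encodes s plus a trailing NUL, failing instead of substituting characters. */
    static byte[] encode(String s, Charset charset) {
        if (s.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("embedded NUL would truncate the C string");
        }
        try {
            ByteBuffer buf = charset.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(s + "\0"));
            byte[] bytes = new byte[buf.remaining()];
            buf.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Rethrown unchecked so callers need no throws clause.
            throw new IllegalArgumentException("string not representable in " + charset, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("hi", StandardCharsets.UTF_16LE).length + " bytes");
        try {
            encode("caf\u00e9", StandardCharsets.US_ASCII);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether failing is preferable to substitution depends on the native API being called; for file names or identifiers a hard error is usually easier to debug than a silently altered string.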


I use this snippet to convert strings to UTF-16:

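import java.nio.charset.StandardCharsets;
import jdk.incubator.foreign.MemoryAddress;
import jdk.incubator.foreign.MemorySegment;
import jdk.incubator.foreign.ResourceScope;

private static MemoryAddress string(String s, ResourceScope scope) {
    if (s == null) {
        return MemoryAddress.NULL;
    }
    byte[] data = s.getBytes(StandardCharsets.UTF_16LE);
    // Native segments are zero-initialized, so the 2 extra bytes form the UTF-16 terminator.
    MemorySegment seg = MemorySegment.allocateNative(data.length + 2, scope);
    seg.copyFrom(MemorySegment.ofArray(data));
    return seg.address();
}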

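Note that the trailing null character takes 2 bytes in UTF-16, which is why the segment is allocated 2 bytes larger than the encoded data. If you use a different encoding, you may need to append the terminator to the string before encoding it (s + '\000').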

UTF-16 is good enough for my purposes - calling the Windows API.
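The same idea of leaving zeroed space after the encoded data can be written for any charset whose terminator consists of all-zero bytes (UTF-8, UTF-16 and UTF-32 qualify). A minimal sketch on plain byte arrays follows; the class name and the terminatorBytes parameter are assumptions for illustration, and the caller must pass the code unit size of the chosen charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PaddedCString {
    /**
     * Encodes s and appends terminatorBytes zero bytes.
     * Only valid when the charset encodes NUL as all-zero bytes.
     */
    static byte[] encode(String s, Charset charset, int terminatorBytes) {
        byte[] data = s.getBytes(charset);
        // Arrays.copyOf pads the extra length with zeros.
        return Arrays.copyOf(data, data.length + terminatorBytes);
    }

    public static void main(String[] args) {
        byte[] wide = encode("Hi", StandardCharsets.UTF_16LE, 2);
        System.out.println(Arrays.toString(wide));
    }
}
```

Compared with encoding (s + "\0"), this avoids building a second String but moves the responsibility for the terminator width to the caller.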