Safety of using reflect.StringHeader in Go?


I have a small function which passes the pointer of Go string data to C (Lua library):

func (L *C.lua_State) pushLString(s string) {
    gostr := (*reflect.StringHeader)(unsafe.Pointer(&s))
    C.lua_pushlstring(L, (*C.char)(unsafe.Pointer(gostr.Data)), C.ulong(gostr.Len))
    // lua_pushlstring copies the given string, not keeping the original pointer.
}

It works in simple tests, but from the documentation it's unclear whether this is safe at all.

According to the Go documentation, the memory of the reflect.StringHeader should be pinned for gostr, but StringHeader.Data is already a uintptr, "an integer value with no pointer semantics". That is itself odd: if it has no pointer semantics, wouldn't the field be completely useless, since the memory may be moved right after the value is read? Or is the field treated specially, like reflect.Value.Pointer? Or perhaps there is a different way of getting a C pointer from a string?


There are 1 best solutions below


it's unclear whether this is safe at all.

Q4 2022: Tapir Liu (https://twitter.com/TapirLiu/) of Go101 (https://github.com/go101/go101) gives a clue as to the "safety" of reflect.StringHeader in this tweet:

Since Go 1.20, the reflect.StringHeader and reflect.SliceHeader types will be deprecated and not recommended to be used.

Accordingly, two functions, unsafe.StringData and unsafe.SliceData, will be introduced in Go 1.20 to take over the use cases of two old reflect types.

That was initially discussed in CL 401434, then in issue 53003.
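
To illustrate what that replacement might look like for the function in the question, here is a minimal sketch (it needs Go 1.20 or later). To keep it runnable without cgo or Lua, copyFromPointer is a hypothetical stand-in for a C function such as lua_pushlstring, which copies the bytes before returning. In the real cgo call, the same pointer would presumably be passed as (*C.char)(unsafe.Pointer(unsafe.StringData(s))), with C.size_t(len(s)) as the length.

```go
package main

import (
	"fmt"
	"unsafe"
)

// copyFromPointer is a hypothetical stand-in for a C function like
// lua_pushlstring: it receives a raw pointer plus a length and copies
// the bytes before returning, so it never keeps the pointer.
func copyFromPointer(p *byte, n int) []byte {
	if n == 0 {
		// unsafe.StringData("") returns an unspecified pointer,
		// so do not touch p for empty input.
		return nil
	}
	out := make([]byte, n)
	copy(out, unsafe.Slice(p, n))
	return out
}

// pushString hands the string's backing bytes to copyFromPointer
// without going through reflect.StringHeader or a uintptr.
func pushString(s string) []byte {
	return copyFromPointer(unsafe.StringData(s), len(s))
}

func main() {
	fmt.Printf("%s\n", pushString("hello"))
}
```

Because unsafe.StringData returns a *byte rather than a uintptr, the value keeps pointer semantics until the moment it is handed over, which is the part the StringHeader.Data field could not guarantee.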

The reason for deprecation is that reflect.SliceHeader and reflect.StringHeader are commonly misused.
As well, the types have always been documented as unstable and not to be relied upon.

We can see in GitHub code search that usage of these types is ubiquitous.
The most common use cases I've seen are:

  • converting []byte to string:
    Equivalent to *(*string)(unsafe.Pointer(&mySlice)), which is never actually officially documented anywhere as something that can be relied upon.
    Under the hood, the shape of a string is less than a slice, so this seems valid per unsafe rule.
  • converting string to []byte:
    commonly seen as *(*[]byte)(unsafe.Pointer(&string)), which is by-default broken because the Cap field can be past the end of a page boundary (example here, in widely used code) -- this violates unsafe rule.
  • grabbing the Data pointer field for ffi or some other niche use
  • converting a slice of one type to a slice of another type

Ian Lance Taylor adds:

One of the main use cases of unsafe.Slice is to create a slice whose backing array is a memory buffer returned from C code or from a call such as syscall.MMap.
I agree that it can be used to (unsafely) convert from a slice of one type to a slice of a different type.
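
As a rough sketch of that last point (not code from the issue), the following reinterprets a []uint32 as a []byte view over the same memory using unsafe.Slice and unsafe.SliceData. The helper name uint32sAsBytes is made up for this example, and the byte order seen through the view depends on the platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// uint32sAsBytes returns a []byte that aliases the memory of xs.
// Hypothetical helper for illustration: no copy is made, so writes
// through the result are visible in xs and vice versa.
func uint32sAsBytes(xs []uint32) []byte {
	if len(xs) == 0 {
		return nil
	}
	n := len(xs) * int(unsafe.Sizeof(uint32(0)))
	p := (*byte)(unsafe.Pointer(unsafe.SliceData(xs)))
	return unsafe.Slice(p, n)
}

func main() {
	xs := []uint32{1, 2}
	b := uint32sAsBytes(xs)
	fmt.Println(len(b)) // 8 bytes: two uint32 values of 4 bytes each

	// Zeroing the byte view also zeroes the original slice.
	for i := range b {
		b[i] = 0
	}
	fmt.Println(xs[0], xs[1])
}
```

Going in this direction (wider element to byte) avoids alignment problems; the reverse conversion would also need the byte buffer to be suitably aligned for the target type.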


Q4 2023: issue 53003 has since been closed and removed from the proposal process. It concludes with this comment:

Why originally proposed functions StringToBytes and BytesToString are not added to unsafe package, so people stop writing various implementations with questionable correctness again and again is beyond my comprehension.

With the reply:

Because:

  • StringToBytes is now just unsafe.Slice(unsafe.StringData(s), len(s)) and
  • BytesToString is just unsafe.String(unsafe.SliceData(b), len(b)),

which are both very easy and there shouldn't be any implementations with questionable correctness.
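
Wrapped as functions, the two one-liners from that reply could look like the sketch below (Go 1.20 or later). The function names follow the proposal's wording and are not part of the unsafe package. Both results share memory with their input, so the usual caveats apply: the bytes behind a string must never be written to, and a []byte must not be modified while a string built from it is still in use.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringToBytes returns a []byte that shares memory with s.
// The result must be treated as read-only.
func StringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// BytesToString returns a string that shares memory with b.
// b must not be modified while the string is in use.
func BytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := StringToBytes("hello")
	fmt.Println(len(b), cap(b)) // length and capacity both equal len(s)

	s := BytesToString([]byte("world"))
	fmt.Println(s)
}
```

Unlike the *(*[]byte)(unsafe.Pointer(&s)) cast, unsafe.Slice sets the capacity explicitly to the given length, so there is no stray Cap value pointing past the end of the string.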