In Go, when I use json.Marshal on a []byte and then json.Unmarshal into a []byte, I get back the same []byte that I used as input.
But when I json.Unmarshal into an interface{}, I get a string.
Example here: https://goplay.tools/snippet/5BfFZ-Uq507
I've read json.Unmarshal documentation (https://pkg.go.dev/encoding/json#Unmarshal) & this issue https://github.com/golang/go/issues/16815.
I understand that []byte and string are not the same type, and that it's logical to get a result different from string([]byte("BOOKS")) if I json.Unmarshal into a string.
But since I unmarshaled into interface{}, I expected the type to be []byte and to have my original []byte back, not a string.
This is a problem for me because, when unmarshalling data into map[string]interface{}, I can't tell the difference between what was originally a string and what was originally a []byte.
Example: https://goplay.tools/snippet/MVSR7_MvSv-
Is there any way to solve my issue?
I initially left a comment because this seemed like a trivial issue, although the questions you're asking and things you mention suggest that there's actually a fair few things to unpack.
> I expected the type to be `[]byte` and to have my original `[]byte` back not a string.

What are these types? Let's start with that. As per the spec, the `byte` type is an alias for `uint8`.
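To make the alias concrete, here is a minimal sketch. The helper name `sameType` is made up for illustration; it only shows that the compiler and the `reflect` package treat `byte` and `uint8` as one and the same type.

```go
package main

import (
	"fmt"
	"reflect"
)

// sameType (hypothetical helper) reports whether []byte and []uint8
// are one and the same type as far as reflection is concerned.
func sameType() bool {
	return reflect.TypeOf([]byte(nil)) == reflect.TypeOf([]uint8(nil))
}

func main() {
	var b byte = 'B'
	var u uint8 = b // no conversion needed: byte is an alias for uint8

	fmt.Println(u)          // 66
	fmt.Println(sameType()) // true
}
```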
A string is effectively a sequence of bytes, so a string is a sequence of `uint8` values. It is its own type, but a closely related one: a string can be copied and converted safely to a `[]byte`. The main difference is that a `string` is immutable, whereas a `[]byte` is not.

This is all to say that, for the purposes of marshalling something, the input is immutable, and therefore there is no difference between `[]byte` and `string`.

Cool, but didn't I just say that `[]byte` is an alias for `[]uint8`? Correct, so at this point you'd still expect `[]byte` to be encoded as `[1, 2, 3, 4, ...]`. So let's take a look at the source code of the `encoding/json` package. In particular, the branch carrying the comment "Byte slices get special treatment" stands out: it returns `encodeByteSlice` as an `encoderFunc`. Clearly, we are returning a different encoder callback when dealing with a slice of bytes.

Looking at that encoder function, we have our answer: a byte slice is handled specifically, with its contents written to the buffer base64-encoded and delimited by `"`, meaning the values will be encoded as a JSON string. Just like that, we can explain the behaviour you've observed: a byte slice (`[]byte`, which is a valid `[]uint8`) is treated as a special case.

Now when it comes to unmarshalling, what's going on with your `var dataAny any` case? Well, let's look at the source code for the unmarshalling, specifically the part that handles string literals. It covers both of your unmarshal cases quite nicely. The JSON encoded input starts with a `"`, so we enter the case that deals with unmarshalling strings. We get the data from the input minus the quotes as a slice of bytes (`unquoteBytes()`). Next, we check what type the destination (`v`) for the unmarshalled data is. We accept 3 types:

- A slice: if the element type is not `uint8`, we return an error (meaning we only really accept `[]byte`)
- A string
- `any`, or interface type

If the destination is of type `any`, we do a quick check to make sure that the underlying type truly is an empty interface (i.e. we're not trying to write data to something other than a literal empty interface), and if so, we set it to a string built from the unquoted bytes. We explicitly set its value to a string, because we are unmarshalling a string.
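The behaviour described above can be reproduced with a short sketch. The helper `roundTrip` is a hypothetical name introduced here; it marshals a `[]byte` once and unmarshals the result into both an `any` and a `[]byte`, so the two destination cases can be compared side by side.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTrip (hypothetical helper) marshals a []byte, then unmarshals the
// JSON twice: once into an empty interface and once into a []byte.
func roundTrip(in []byte) (encoded string, asAny any, asBytes []byte, err error) {
	raw, err := json.Marshal(in)
	if err != nil {
		return "", nil, nil, err
	}
	if err = json.Unmarshal(raw, &asAny); err != nil {
		return "", nil, nil, err
	}
	if err = json.Unmarshal(raw, &asBytes); err != nil {
		return "", nil, nil, err
	}
	return string(raw), asAny, asBytes, nil
}

func main() {
	encoded, asAny, asBytes, err := roundTrip([]byte("BOOKS"))
	if err != nil {
		panic(err)
	}
	fmt.Println(encoded)                    // "Qk9PS1M=" (a JSON string holding base64)
	fmt.Printf("%T %v\n", asAny, asAny)     // string Qk9PS1M=
	fmt.Printf("%T %s\n", asBytes, asBytes) // []uint8 BOOKS
}
```

Note that the `any` destination does not even receive the original text: it receives the base64 form, because nothing tells the decoder that the JSON string once was a byte slice.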
When the destination is a `[]byte`, the string is base64-decoded and we end up using `v.SetBytes(b[:n])`, so the decoded values are copied over to a byte slice. Simple as can be.

Now what you're actually looking for is a way to ensure that what is marshalled as a `[]byte` is unmarshalled as a `[]byte`. From the above, it should be fairly obvious by now that this can't be done when the destination is `any`. You can force something like this by converting your `[]byte` to an `[]int`, but that makes the marshalled data very silly. It's only really useful if both parties involved in the data exchange know what to do with slices/arrays of numbers, and there are no cases where you actually want to send a slice of numeric values that shouldn't be interpreted as a string.
This all leads into the last point of note: you mentioned unmarshalling the data into a `map[string]any`. This makes me think we're dealing with an X-Y problem here.

Sometimes, you need to unmarshal data whose type you can't know (usually data you need to pass on to some other process that will be able to identify what the data means, and how to process it). In those (rare) cases, using a `map[string]any` can be a useful validation step to make sure you're not sending malformed payloads to that other process.

However, the fact that you're trying to force a string to be represented as a `[]byte` suggests you're very much aware of what data you're dealing with, how it ought to be represented, what it means, and which fields need to be handled in this particular way. If that is the case: why bother with the whole `map[string]any` mess? For that to work, you'll have to litter your code with hard-coded map keys to extract the bits of data you need. Just create a type that implements the `MarshalJSON` and `UnmarshalJSON` methods, and you can handle specific fields in specific ways. You could, even though I'm still finding it impossible to think of a valid reason for it, convert strings to int slices and back again in the marshalling process.