Serialization of a Java object to a String


Definition of Serialization

Serialization is a mechanism of converting the state of an object into a byte stream.

Is converting a Java object to a string still called serialization?

 String json = objectMapper.writeValueAsString(request);

We usually refer to the process in the code above as serialization, even though the end result is a string.

Can someone help me understand what exactly serialization is?


There are 2 best solutions below

rzwitserloot On

A string can be trivially converted to a byte stream - simply pick an encoding and off you go:

byte[] data = "Hello, World!".getBytes(StandardCharsets.UTF_8);

This turns a string into bytes. If you have a series of strings that you need to write to an output stream, it is just as simple:

OutputStream out = ....; // some place you need to send bytes, but you have strings.
OutputStreamWriter writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
writer.write("Hello!");
writer.write("World!");
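
For anyone who wants to run the snippet above, here is a minimal self-contained sketch. It assumes an in-memory ByteArrayOutputStream as the destination, since the real target stream is not specified, and the class and method names (WriterDemo, writeAll) are made up for illustration. Note that the writer has to be flushed or closed before the bytes are guaranteed to reach the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class WriterDemo {
    // Writes each string through a UTF-8 writer and returns the bytes
    // that ended up in the underlying stream.
    static byte[] writeAll(String... parts) {
        // Assumption: an in-memory stream stands in for "some place you need to send bytes".
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            for (String part : parts) {
                writer.write(part);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the writer (end of the try block) flushed its buffer into 'out'.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = writeAll("Hello!", "World!");
        System.out.println(data.length); // 12, because every character here is ASCII
    }
}
```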

In fact, most languages blur the distinction and either treat byte[] and String as the same data type, or accept one where the other would be more appropriate. Python and JavaScript do this, for example. Java itself does not, but loads of byte-based APIs have overloaded methods that accept a string and even let you omit the charset encoding. In that case (as converting a string to bytes or vice versa always involves a charset encoding) Java picks a default: the platform default before Java 18, UTF-8 since. I suggest you never use these methods. It's a lot easier to understand what is happening if you include the charset explicitly, as all examples in this answer do (they all use UTF_8).
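
To illustrate why the charset matters, here is a small sketch (the class and method names are invented for this example). The same string produces different bytes under different encodings, and decoding with the wrong charset silently yields garbage rather than an error.

```java
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies when encoded as ISO-8859-1 (Latin-1).
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Encodes as UTF-8 but decodes as Latin-1: a deliberate mismatch.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"
        System.out.println(utf8Length(text));   // 5: the é takes two bytes in UTF-8
        System.out.println(latin1Length(text)); // 4: the é takes one byte in Latin-1
        System.out.println(misdecode(text).length()); // 5 chars: the é became two wrong characters
    }
}
```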

In other words, although it is an oversimplification that leads to a ton of problems, loads of programmers hear 'string' and think 'bytes' or vice versa, and consider the two interchangeable. Conflating the two is a stupid idea, but it's so common that I'm not at all surprised you've read something that shows how data becomes a string and then just concludes: "Voila - serialized!".

There is a separate and equally sufficient explanation: the ambiguity of English.

Code is highly specific. Some code does a thing, and you can confirm it by running it. Its meaning is entirely unambiguous. You are entitled to opinions about what it does, but those are trivially shown to be just wrong. For example, given:

System.out.println("Hello!");

Suppose someone says about this code: "In my view, a simple print statement should not print a newline, so I consider the above code as not adding a newline." Okay, that might be your opinion, but it is trivially provable as just plain wrong, because println does, obviously, print that newline. In contrast, a statement like:

"In java, classes can contain functions"

is not nearly that clear cut. One could make a plausible argument that the above statement is false, as Java does not have functions (it has methods). But we don't ask a computer to compile and execute the statement "In java, classes can contain functions". We therefore cannot resort to saying: that is incorrect, because when I run this code, the thing you said would happen didn't happen.

English serves to communicate ideas, and "serialization" is an idea. One book defines serialization as turning arbitrary objects into byte arrays. In the words of a great sage: "Yeah, well, you know, that's just like, uh, your opinion, man!" Somebody else says that to them 'serialization' means "converting arbitrary objects into a form that is trivially sent through channels in universal terms", and strings (as well as ints, longs, bytes of course, and a few other types) qualify for that definition in most folks' estimations.

So, the author of the first statement has defined 'serialization' in terms of byte arrays, while the second uses a slightly different definition of the word and considers strings to be included. Serialization is not defined in a spec that is universally accepted by every person on the planet, even if you restrict it to the programming world. Even if you restrict it to just the domain of Java, it's not a word that is so universally well defined that we can safely state 'if somebody uses that word, you may safely assume they are specifically talking about ending up at a stream of byte values'.

nik0x1 On

I would start with the problem we are trying to solve. The problem is to represent an object in a form suitable for storage or transfer, in such a way that the object can later be restored.

The process of translating an object into this form is called serialization. The process of restoring an object from a serialized representation is called deserialization.

The format of the serialized object can be JSON, XML, Protobuf, your own format, and so on.
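
As a rough sketch of that round trip, the example below serializes the same object into two different forms and restores it from each: once with Java's built-in object serialization (a byte array) and once with a tiny hand-rolled text format (a String). The Point class, the "x,y" text format, and all method names are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripDemo {
    // Hypothetical value type used only for this illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Serialization to bytes using Java's built-in mechanism.
    static byte[] toBytes(Point p) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(p);
            out.flush(); // push buffered data into 'buffer' before reading it
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deserialization from bytes.
    static Point fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Point) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialization to a String using a made-up "x,y" format.
    static String toText(Point p) {
        return p.x + "," + p.y;
    }

    // Deserialization from that made-up text format.
    static Point fromText(String text) {
        String[] parts = text.split(",");
        return new Point(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    public static void main(String[] args) {
        Point original = new Point(3, 4);
        Point viaBytes = fromBytes(toBytes(original));
        Point viaText = fromText(toText(original));
        System.out.println(viaBytes.x + " " + viaBytes.y); // 3 4
        System.out.println(viaText.x + " " + viaText.y);   // 3 4
    }
}
```

Both paths satisfy the definition above: the object is turned into a storable or transferable form and can be restored from it. Only the format differs.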