In the book Java Concurrency In Practice by Brian Goetz, there is a section about safe construction practices. The author covers how letting the "this" reference escape during construction can lead to an "incompletely constructed object".
But an object is in a predictable, consistent state only after its constructor returns, so publishing an object from within its constructor can publish an incompletely constructed object. This is true even if the publication is the last statement in the constructor. If the this reference escapes during construction, the object is considered not properly constructed. [8]
[8] More specifically, the this reference should not escape from the thread until after the constructor returns.
I'm looking for an example program that deliberately lets the this reference escape to another thread. The example should give an "unexpected" output because an incompletely constructed object is accessed. Importantly, the publication of this should be the last statement in the constructor!
Can anyone provide an example program which prints an "unexpected" value in practice?
I've tried this program:
    public class ThisEscape {
        private int state;

        public ThisEscape() {
            state = 42;
            try {
                Thread t = new Thread(() -> System.out.println(state)); // this escapes here
                t.start();
                t.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            new ThisEscape();
        }
    }
Here the this reference escapes to the anonymous class that implements Runnable (created using a lambda expression). The program prints 42 when I run it, but if my understanding is correct, it could theoretically print 0 or perhaps another value != 42.
Incorrect. The JMM talks about observability and allows the JVM to cache (or not cache) any variable. It also allows for shearing, which you will never be able to observe on any 64-bit JVM implementation I'm aware of; you'd need to download a 32-bit JVM to try to observe it. Shearing is seeing half of an update to a 64-bit variable. The only types in Java that are defined as 64-bit are `double` and `long`. Other types may also occupy 64 bits in practice. However, it is not valid for a JVM to allow observation of shearing on anything but `long` and `double` values. Other than shearing, a JVM is not allowed to make up numbers. In other words, for a field like the one in your example, the only values that can ever be observed are 0 and 42. Nothing else can be observed (not even if the field were a `long`, because 0 and 42 both have all zero bits in their upper 32-bit block). Or rather, if you manage to observe any other value, your JVM is broken: it violates the specification (the JMM is chapter 17 of the Java Language Specification). In your exact program even 0 is ruled out: `state = 42` comes before `t.start()`, and starting a thread happens-before everything the started thread does, so 42 is the only legal output.

Shearing
As I said, it's mostly irrelevant, in that no current JVM implementation will ever do it, but the spec still legally allows a JVM to shear `long` and `double` values. Not any other type of value (not even references, even though on 64-bit hardware those tend to be 64 bits too).

Given:
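A minimal sketch of the kind of class meant here, reconstructed from the description that follows. Only `x`, `run`, `describe` and `HUH??` come from the text; the class name `Flipper`, the other two strings, and the bounded loop (the text describes flipping continually) are assumptions made so the sketch terminates.

```java
class Flipper {
    int x; // plain int field, deliberately neither volatile nor synchronized

    // Flip x between 0 and -1. Bounded (an assumption) so the sketch terminates.
    void run(long flips) {
        for (long i = 0; i < flips; i++) {
            x = 0;
            x = -1;
        }
    }

    // Read x exactly once, then classify the value that was read.
    String describe() {
        int seen = x;
        if (seen == 0) return "zero";
        if (seen == -1) return "minus one";
        return "HUH??"; // only reachable if a torn or invented value was read
    }

    public static void main(String[] args) throws InterruptedException {
        Flipper f = new Flipper();
        Thread writer = new Thread(() -> f.run(5_000_000L));
        writer.start();
        boolean huh = false;
        while (writer.isAlive()) {
            if (f.describe().equals("HUH??")) huh = true;
        }
        writer.join();
        System.out.println(huh ? "saw HUH??" : "never saw HUH??");
    }
}
```

The single read into `seen` matters: testing `x` twice would let the field change between the two tests and reach the last line without any shearing at all.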
And having `run` continually flip `x` from 0 to -1, calling `describe` CANNOT return HUH??. If you have a JVM that does that, it is broken.

However, make that `int` a `long` and a JVM is legally allowed to print HUH?? here. Specifically, two more values are now allowed to be observed. -1 is 64 one-bits and 0 is 64 zero-bits; the 2 extra observable values are the sheared variants: 32 zero-bits followed by 32 one-bits (which, as a long, is 4294967295), or 32 one-bits followed by 32 zero-bits (which, as a long, is -4294967296).

The reason should be obvious: `x = 0L;` where `x` is a long (a 64-bit value) on 32-bit hardware requires two write ops: one to write 32 zero bits to the 'top' and another to write 32 zero bits to the 'bottom'. The JVM gives itself the freedom to let other code see those bits halfway through that operation, and to order those writes (top first, or bottom first) either way, whichever is more efficient on the hardware your JVM runs on. That's the reason for all these 'the JMM may or may not do X' rules: they let a JVM pick whatever is most efficient on a given architecture/OS combo.

A snippet that shows the danger
... probably doesn't exist.
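What can be shown is the arithmetic behind the two sheared values quoted earlier. The sketch below does not demonstrate any race; it only glues 32-bit halves together by hand. The class name `ShearedValues` and the helper `combine` are hypothetical.

```java
class ShearedValues {
    // Glue two 32-bit halves into one 64-bit long, high half first.
    static long combine(int high, int low) {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        int zeros = 0;  // 32 zero bits
        int ones = -1;  // 32 one bits
        System.out.println(combine(zeros, zeros)); // 0, a complete write of 0L
        System.out.println(combine(ones, ones));   // -1, a complete write of -1L
        System.out.println(combine(zeros, ones));  // 4294967295, sheared
        System.out.println(combine(ones, zeros));  // -4294967296, sheared
    }
}
```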
I explained the shearing situation to hopefully make clear what's happening here:
- The JMM allows shearing because forbidding it would make writes to `long` or `double` on 32-bit JVM implementations incredibly expensive - the JVM would have to lock on something for every write, regardless of whether it will end up mattering or not, just to adhere to a hypothetical 'no shearing can be observed' rule.
- On 64-bit hardware there is nothing to shear in a `long`. CPUs update registers and memory positions atomically. You observe pre-update or post-update; you can't observe sheared values because the CPU just does not work that way.
- So there is a rule in the spec (`long` values can be sheared) that is literally impossible to trigger on most JVM implementations.

Similar situations occur with `strictfp`. You will not be able to write code where you can observe that keyword making any difference for the vast, vast majority of arch+JVM-impl tuples.

I wouldn't be surprised if it isn't possible to write code that lets you observe a problem with letting constructors leak their `this` reference. The JMM merely reserves a right: if some day, on some architecture, under some phase of the moon, code can be executed faster by forgoing any guarantee that the writes a constructor did are synced to other threads when that constructor leaked its ref, then a JVM implementor is free to do that. If that breaks your app, your app is bugged. It always was - it was just a bug that until that moment was not possible to trigger. You can file bugs with that JVM implementor all day; they'll close them as 'nope, you messed up, because the JVM is working as designed'.

Repercussions
The way the JMM section is written, where a JVM may or may not do a certain thing, means multicore field access is 'here be dragons' territory. Ordinarily, you'd write some tests and these will ensure your code does what you want. Ordinarily, if you're trying to figure out how stuff works, you write some experimental stuff and play around to learn.
Neither plan works, at all, in multicore. A JVM that says "I may or may not do X" is nothing like a coin - that doesn't mean that randomly it'll do it one run and do the other the next run. It tends to mean that it'll do the same thing every run today and tomorrow, but then next week it might do the other thing. Or even do one thing reliably on computer A and the other thing reliably on computer B. Testing / playing around is overwhelmingly unlikely to find these differences. A select few things really do let you observe both things more or less at 50%/50% chance but that's rare.
Given that it's untestable and you can't learn by playing, how is one to learn how to write proper, safe, reliable, bug-free multicore code? Mostly you can't - this is why you should think thrice before doing it.
Instead, use mechanisms that are much, much easier to test and understand. For example, instead of having 2 threads communicate with each other by way of field writes (which requires synchronization and is impossible to properly test), communicate through a database. Transactions are much easier to understand. Or use a message queue perhaps.
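As a small in-process illustration of the message-passing idea, here is a sketch that uses `java.util.concurrent.BlockingQueue` as a stand-in for a real message queue. No field is shared between the two threads; everything goes through the queue. The class name `QueueHandoff`, the method `sumViaQueue`, and the `DONE` marker are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueueHandoff {
    private static final int DONE = -1; // hypothetical end-of-stream marker

    // A producer thread sends 1..count through the queue; the caller consumes and sums.
    static int sumViaQueue(int count) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int sum = 0;
        // take() blocks until the producer has handed over the next value
        for (int v = queue.take(); v != DONE; v = queue.take()) {
            sum += v;
        }
        producer.join();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sumViaQueue(100)); // 5050
    }
}
```

Unlike the field-flipping case, this can be tested the ordinary way: the result is the same on every run, because the queue does the synchronization.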
Alternatively, just eliminate all need for communication. Take, for example, a web server. It can trivially serve 50 connections simultaneously on 50 different threads, and if these threads don't need to communicate with each other (and why would they? It's a web server - unless it is powering some sort of chat platform, generally there's no need for that), then trivially there are no synchronization issues and these 'a JVM may or may not, dealer's choice' things are irrelevant to you.
The thing to remember is: any variable that is written to and accessed by more than one thread is a ticking time bomb. Avoid it like the plague. Whatever you can do not to have that - do that. The maintenance cost of writing code that needs to work like that is literally 10x to 100x higher, because the only real way to 'test' it is to have someone extremely experienced at this go through it 'by hand' to ascertain it works properly. A simple test cannot do this, and you can't turn into an experienced hand simply by playing around with such code. You need to be able to quote chapter and verse of the JMM (chapter 17 of the Java Language Specification) to be capable of doing this.
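For the constructor from the question, the usual way to stay on the safe side is to keep `this` inside the constructor and hand the finished object to the other thread afterwards, for example with a private constructor and a factory method. A minimal sketch, assuming a hypothetical class `SafeStart` whose factory takes a callback so the value the thread sees can be captured:

```java
import java.util.function.IntConsumer;

class SafeStart {
    private final int state;

    // The constructor only initializes fields; `this` goes nowhere.
    private SafeStart() {
        state = 42;
    }

    int state() {
        return state;
    }

    // Factory method: the object reaches the other thread only after the constructor returned.
    static SafeStart create(IntConsumer sink) throws InterruptedException {
        SafeStart s = new SafeStart();
        Thread t = new Thread(() -> sink.accept(s.state()));
        t.start();
        t.join();
        return s;
    }

    public static void main(String[] args) throws InterruptedException {
        SafeStart.create(System.out::println); // prints 42
    }
}
```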