My employer has a business need to make Java builds byte-for-byte reproducible. I am aware of the difficulties in making JAR files reproducible (due to archiving order and time stamps), but at this point I’m talking about class files.
I have builds of the same code using Java 8u65, on both Mac and Linux. The class files differ at the byte level. Both decompile back to the same source; the difference only shows up in the output of the javap disassembler.
The source code seems to be:
final TrustStrategy acceptingTrustStrategy =
(X509Certificate[] chain, String authType) -> true;
On one build, the result is:
private static boolean lambda$restTemplate$38(java.security.cert.X509Certificate[], java.lang.String) throws java.security.cert.CertificateException;
Code:
0: iconst_1
1: ireturn
On the other, it is:
private static boolean lambda$restTemplate$15(java.security.cert.X509Certificate[], java.lang.String) throws java.security.cert.CertificateException;
Code:
0: iconst_1
1: ireturn
Anonymous lambdas are getting names with different numbers in them (lambda$restTemplate$15 versus lambda$restTemplate$38).
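For anyone who wants to check these names without reading javap output, here is a minimal sketch that lists the synthetic lambda methods of a loaded class via reflection. The class names `LambdaNames` and `Sample` are hypothetical, and the `lambda$` prefix is just what javac emitted in the output above, not something the language guarantees.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BiPredicate;

public class LambdaNames {

    /** Hypothetical sample class containing one non-capturing lambda. */
    static class Sample {
        static BiPredicate<Object[], String> make() {
            return (chain, authType) -> true;
        }
    }

    /**
     * Returns the sorted names of compiler-generated lambda methods declared
     * directly in the given class. Assumes the javac convention of naming
     * them with a "lambda$" prefix, as seen in the javap output.
     */
    static List<String> lambdaMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Print the generated names so two builds can be compared with diff.
        for (String name : lambdaMethodNames(Sample.class)) {
            System.out.println(name);
        }
    }
}
```

Running this against the same class from each build and diffing the output should show which lambdas received different numbers.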
It appears that, when I rebuild on the same host, I get the same bytes. When the host differs, the numbers change; two Linux hosts produced different bytes.
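One way to confirm "same bytes" across hosts is to hash each class file and compare the digests. The following is only a sketch using the standard `MessageDigest` API; the class name `ClassFileDigest` and the command-line usage are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ClassFileDigest {

    /** Returns the SHA-256 digest of the given bytes as lowercase hex. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to a .class file (hypothetical usage).
        for (String path : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(path));
            System.out.println(sha256Hex(bytes) + "  " + path);
        }
    }
}
```

Running it over the build output on each host and diffing the two listings narrows the problem down to the specific class files that differ.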
What determines these numbers? Is there a way to force every compilation to use the same numbers here, and thus produce the same class files? Or is Java 8 class file compilation nondeterministic?
I haven't looked into it too much, but this article talks about reproducible builds in Java, and reproducible-builds has some tools to help make builds (and classes) reproducible.
The link you're probably looking for is the Reproducible Build Maven Plugin, made specifically for Java to try to "strip non-reproducible data from the generated artifacts".