I have set up a Linux workstation where I plan to use Spark ML with Scala and XGBoost (0.90). Until now I have been working with a similar configuration on Windows, with the same versions of Spark, Java, etc., except that for XGBoost I have been using the Criteo fork, version 0.81.
Both configurations work. The problem is that, for the exact same data set, XGBoost on Windows takes only 2-3 minutes to train on the training set, whereas on Linux it takes 20 minutes.
The strange thing is that when I run XGBoost on Windows the CPU load is at 100%, whereas on the Linux workstation it is only at 5-10%.
I would stick with the Windows workstation, but it crashes from time to time, so I thought I would be on the "safe" side with Linux.
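To help narrow down the CPU load difference, here is a minimal diagnostic sketch that can be run on both workstations to compare what the JVM sees. It uses only the JDK, and the object name `CpuCheck` is made up for illustration. It assumes the gap might come from how many cores are available to the JVM, which is only a guess and not something confirmed above.

```scala
// Diagnostic sketch (hypothetical helper, not part of Spark or XGBoost).
// Run it on both machines and compare the printed values.
object CpuCheck {
  // Number of cores the JVM is allowed to schedule threads on.
  def visibleCores: Int = Runtime.getRuntime.availableProcessors()

  // Maximum heap the JVM will try to use, in megabytes.
  def maxHeapMb: Long = Runtime.getRuntime.maxMemory() / (1024L * 1024L)

  def main(args: Array[String]): Unit = {
    println(s"OS: ${System.getProperty("os.name")}")
    println(s"Java version: ${System.getProperty("java.version")}")
    println(s"Cores visible to the JVM: $visibleCores")
    println(s"Max heap (MB): $maxHeapMb")
  }
}
```

If both machines report the same core count, the next things worth comparing are the Spark master setting (for example `local[*]`) and the XGBoost `num_workers` and `nthread` parameters, none of which are shown in this post.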