I am running Jobserver 0.7.0. It has 4 GB of RAM available and the context has 10 GB; the system has another 3 GB free. The context had been running for a while when it received a request and failed without logging any error. The request is the same as others that were processed successfully while the context was up; it is not a special one. The following is the jobserver log. As you can see, the last successful job finished at 03:08:23,341, and when the next request was received the driver commanded a shutdown:
[2017-05-16 03:08:23,340] INFO output.FileOutputCommitter [] [] - Saved output of task 'attempt_201705160308_0321_m_000199_0' to file:/value_iq/spark-warehouse/spark_cube_users_v/tenant_id=7/_temporary/0/task_201705160308_0321_m_000199
[2017-05-16 03:08:23,340] INFO pred.SparkHadoopMapRedUtil [] [] - attempt_201705160308_0321_m_000199_0: Committed
[2017-05-16 03:08:23,341] INFO he.spark.executor.Executor [] [] - Finished task 199.0 in stage 321.0 (TID 49474). 2738 bytes result sent to driver
[2017-05-16 03:39:02,195] INFO arseGrainedExecutorBackend [] [] - Driver commanded a shutdown
[2017-05-16 03:39:02,239] INFO storage.memory.MemoryStore [] [] - MemoryStore cleared
[2017-05-16 03:39:02,254] INFO spark.storage.BlockManager [] [] - BlockManager stopped
[2017-05-16 03:39:02,363] ERROR arseGrainedExecutorBackend [] [] - RECEIVED SIGNAL TERM
[2017-05-16 03:39:02,404] INFO k.util.ShutdownHookManager [] [] - Shutdown hook called
[2017-05-16 03:39:02,412] INFO k.util.ShutdownHookManager [] [] - Deleting directory /tmp/spark-556033e2-c456-49d6-a43c-ef2cd3494b71/executor-b3ceaf84-e66a-45ed-acfe-1052ab1de2f8/spark-87671e4f-54da-47d7-a077-eb5f75d07e39
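To make the timing explicit, here is a minimal sketch that computes how long the executor was idle between the last finished task and the shutdown command. It only uses the two timestamps copied from the log above; the names `LOG_TS_FORMAT`, `parse_ts`, and `idle_gap` are just illustrative.

```python
from datetime import datetime

# Timestamp layout of the jobserver log lines above
# (a comma separates seconds from milliseconds).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(text):
    """Parse a jobserver log timestamp such as '2017-05-16 03:08:23,341'."""
    return datetime.strptime(text, LOG_TS_FORMAT)


# "Finished task 199.0 in stage 321.0" line
last_task_finished = parse_ts("2017-05-16 03:08:23,341")
# "Driver commanded a shutdown" line
shutdown_commanded = parse_ts("2017-05-16 03:39:02,195")

idle_gap = shutdown_commanded - last_task_finished
print(idle_gap)  # 0:30:38.854000
```

So the executor logged nothing for a little over 30 minutes before the shutdown was commanded.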
The Spark Worker log shows only the following:
17/05/15 19:25:54 INFO ExternalShuffleBlockResolver: Registered executor AppExecId{appId=app-20170515192550-0004, execId=0} with ExecutorShuffleInfo{localDirs=[/tmp/spark-556033e2-c456-49d6-a43c-ef2cd3494b71/executor-b3ceaf84-e66a-45ed-acfe-1052ab1de2f8/blockmgr-eca888c0-4e63-421c-9e61-d959ee45f8e9], subDirsPerLocalDir=64, shuffleManager=org.apache.spark.shuffle.sort.SortShuffleManager}
17/05/16 03:39:02 INFO Worker: Asked to kill executor app-20170515192550-0004/0
17/05/16 03:39:02 INFO ExecutorRunner: Runner thread for executor app-20170515192550-0004/0 interrupted
17/05/16 03:39:02 INFO ExecutorRunner: Killing process!
17/05/16 03:39:02 INFO Worker: Executor app-20170515192550-0004/0 finished with state KILLED exitStatus 0
17/05/16 03:39:02 INFO Worker: Cleaning up local directories for application app-20170515192550-0004
17/05/16 03:39:07 INFO ExternalShuffleBlockResolver: Application app-20170515192550-0004 removed, cleanupLocalDirs = true
17/05/16 03:39:07 INFO ExternalShuffleBlockResolver: Cleaning up executor AppExecId{appId=app-20170515192550-0004, execId=0}'s 1 local dirs
And the Master log:
17/05/16 03:39:02 INFO Master: Received unregister request from application app-20170515192550-0004
17/05/16 03:39:02 INFO Master: Removing app app-20170515192550-0004
17/05/16 03:39:02 INFO Master: 157.97.107.150:33928 got disassociated, removing it.
17/05/16 03:39:02 INFO Master: 157.97.107.150:55444 got disassociated, removing it.
17/05/16 03:39:02 WARN Master: Got status update for unknown executor app-20170515192550-0004/0
Before receiving this request, Spark wasn't executing any other job. The context was using 5.3 GB of its 10 GB and the driver 1.3 GB of its 4 GB.
What does "Driver commanded a shutdown" mean?
Is there any logging property that can be changed to see more detail in the logs?
How can a simple request break the context?