Spark 3.2 on Kubernetes keeps throwing okhttp3/okio EOFException

231 Views Asked by At

I'm using Spark 3.2.1 image that was built from the official distribution via `docker-image-tool.sh', on Kubernetes 1.18 cluster. Everything works fine, except for this error message every 90 seconds:

 WARN WatcherWebSocketListener: Exec Failure
 java.io.EOFException
    at okio.RealBufferedSource.require(RealBufferedSource.java:61)
    at okio.RealBufferedSource.readByte(RealBufferedSource.java:74)
    at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
    at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

This error message does not effect the application, but it's really annoying, especially for Jupyter users, and the lack of details makes it very hard to debug.

It appears on any submit variation - spark-submit, pyspark, spark-shell, and regardless to dynamic execution enabled or disabled.

I've found traces of it on the internet, but all occurrences were from older versions of Spark and resolved by using "newer" version of fabric8 (4.x). Spark 3.2.1 already use fabric8 5.4.1.

I wonder if anyone else still sees this error in Spark 3.x, and has a resolution.

Thanks.


Update: This seems to be related to the Kubernetes cluster itself. After migrating to a new cluster this error was gone.

0

There are 0 best solutions below