ExecutionSetupException: One or more nodes lost connectivity during query


While running a query on Dremio 4.6.1 installed on Kubernetes, we get the following error message in the Dremio UI:

ExecutionSetupException: One or more nodes lost connectivity during query. Identified nodes were [dremio-executor-2.dremio-cluster-pod.dremio.svc.cluster.local:0].

The dremio-env config has the following settings:

DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=13384
DREMIO_MAX_HEAP_MEMORY_SIZE_MB is not set

We are using workers of 16G / 8c (10 workers in total), 1 master coordinator with the same config, and ZooKeeper with 1G / 1c.
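As a rough sanity check on the memory settings above, the arithmetic can be sketched in Python. This is only an illustration, not a diagnosis: it assumes "16G" means 16 * 1024 MB, and because the effective heap size is not stated in the question, the heap values passed to `fits` are hypothetical placeholders.

```python
# Values taken from the question. Reading "16G" as 16 * 1024 MB is an assumption.
CONTAINER_MEMORY_MB = 16 * 1024
DIRECT_MEMORY_MB = 13384  # DREMIO_MAX_DIRECT_MEMORY_SIZE_MB


def remaining_after_direct(container_mb: int, direct_mb: int) -> int:
    """Memory left over for heap, thread stacks, metaspace and other JVM overhead."""
    return container_mb - direct_mb


def fits(container_mb: int, direct_mb: int, heap_mb: int, overhead_mb: int = 0) -> bool:
    """True if direct memory + heap + extra overhead stay within the container size."""
    return direct_mb + heap_mb + overhead_mb <= container_mb


headroom = remaining_after_direct(CONTAINER_MEMORY_MB, DIRECT_MEMORY_MB)
print(f"Left after direct memory: {headroom} MB")

# Hypothetical heap sizes, since the question does not give the effective value.
for heap_mb in (2048, 4096):
    ok = fits(CONTAINER_MEMORY_MB, DIRECT_MEMORY_MB, heap_mb)
    print(f"heap={heap_mb} MB fits in container: {ok}")
```

Under these assumptions, whatever heap the JVM actually uses has to share the remaining memory with all other native allocations, so it may be worth confirming the effective heap size on the executors.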

Any idea what is causing this behavior?

Tailing the worker's logs live just before it crashes shows the following:

An irrecoverable stack overflow has occurred.
Please check if any of your loaded .so files has enabled executable stack (see man page execstack(8))
# A fatal error has been detected by the Java Runtime Environment:
#  SIGSEGV (0xb) at pc=0x00007f41cdac4fa8, pid=1, tid=0x00007f41dc2ed700
# JRE version: OpenJDK Runtime Environment (8.0_262-b10) (build 1.8.0_262-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.262-b10 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  0x00007f41cdac4fa8
# Core dump written. Default location: /opt/dremio/core or core.1
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.

[error occurred during error reporting , id 0xb]

