spark-submit job fails in Airflow but works in container


I'm running a spark-submit job with Airflow and it fails, but when I docker exec into my Spark worker container and run it there, it works perfectly fine.

I have set the connection to the Spark master in the Airflow UI, but when I trigger the DAG, it fails while reading a Parquet file (I get an error along the lines of UNABLE_TO_INFER_SCHEMA). When I docker exec into the container, the file does not even exist.
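
To compare the two environments, a check like the one below could be run both from an Airflow task and from inside the container. This is only a minimal sketch: `describe_path` and the example path are hypothetical names chosen for illustration, not taken from the job above. It reports which host the process runs on and what that process sees at the given path.

```python
import os
import socket


def describe_path(path):
    """Return a small report on how `path` looks from the current process."""
    is_dir = os.path.isdir(path)
    return {
        "host": socket.gethostname(),      # which machine/container is running this
        "path": path,
        "exists": os.path.exists(path),    # False if the path is not visible here
        "is_dir": is_dir,
        # A Parquet dataset is usually a directory of part files; list them if present.
        "entries": sorted(os.listdir(path)) if is_dir else [],
    }


if __name__ == "__main__":
    # Hypothetical path; replace with the Parquet location the job reads.
    print(describe_path("/opt/spark/data/input.parquet"))
```

If the two runs print different hosts, or one shows the path as missing or empty while the other does not, the processes are not looking at the same filesystem.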

Do you have an explanation for why it behaves differently when run through Airflow versus directly in the container?
