I'm running a spark-submit job through Airflow and it fails, but when I docker exec into my Spark worker container and run the same job there, it works perfectly fine.
I have set up the connection to the Spark master in the Airflow UI, but when I trigger the DAG, it fails while reading a parquet file (I get an UNABLE_TO_INFER_SCHEMA kind of error). When I docker exec into the container, the file does not even exist.
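To double-check the file visibility, I ran a check like this inside the container (the path here is just a placeholder for my actual parquet path):

```python
import os

# Hypothetical path standing in for my real parquet location
path = "/data/events.parquet"

# False when the file isn't present/mounted in this container,
# even though it exists in the Spark worker container
print(os.path.exists(path))
```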
Do you have an explanation for why it behaves differently when running through Airflow versus running directly inside the container?