I have been trying to deploy the Git Sync DAG (v3.4.0) to my instance of Airflow (v2.4.1 with Helm chart version 1.7.0) running on a Kubernetes cluster (v1.23.7+rke2r2).
I followed the deployment instructions from the Airflow documentation which can be found here.
My override_values.yaml is the following.
dags:
  gitSync:
    enabled: true
    repo: [email protected]/MY_COMPANY_NAME/MY_COMPANY-dags.git
    branch: main
    subPath: ""
    sshKeySecret: airflow-ssh-secret
extraSecrets:
  airflow-ssh-secret:
    data: |
      gitSshKey: 'MY_PRIVATE_KEY_IN_B64'
Once Airflow is stable, I use the following helm command to update my Airflow deployment.
helm upgrade --install airflow apache-airflow/airflow --namespace airflow -f override-values.yaml
This succeeds, but the deployment never reaches a new stable state with the git-sync containers. The git-sync-init container repeatedly fails to complete. I have previously used this approach to deploy git-sync and it worked for months; however, it suddenly stopped working. When I attempt to check the logs for the git-sync-init container, they are empty, and there doesn't seem to be a verbosity attribute I can enable.
After reading through GitHub issues on the git-sync repo, I also attempted to prepend the ssh:// scheme to the repo URL, but that did not fix the issue.
Is there an alternative way for me to deploy a git-sync sidecar container to my Airflow deployment so that I can access code from private repos?
EDIT:
It appears that the issue was actually with the Rancher GUI. Whenever I used the GUI, the container logs and shell would not load or show anything. However, I was able to open a kubectl shell, list the Airflow pods with kubectl get pods -n airflow, and fetch the logs for the specific init container with kubectl logs airflow-scheduler-65fcdbb58d-4pnzf git-sync -n airflow.
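For anyone whose GUI shows empty logs, the same lookup can be scripted. This is only a rough sketch: the function name init_container_logs is made up for illustration, and it assumes kubectl is already pointed at the right cluster. It lists the pod's init containers by name and prints the logs of each one.

```shell
# Print the logs of every init container in a pod.
# Usage: init_container_logs <namespace> <pod>
init_container_logs() {
  local ns="$1" pod="$2" name
  # jsonpath returns the init container names separated by spaces.
  for name in $(kubectl get pod "$pod" -n "$ns" \
      -o jsonpath='{.spec.initContainers[*].name}'); do
    printf '== %s ==\n' "$name"
    kubectl logs "$pod" -c "$name" -n "$ns"
  done
}
```

In my case the call would have been init_container_logs airflow airflow-scheduler-65fcdbb58d-4pnzf.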
This yielded the following error.
"msg"="unexpected error syncing repo, will retry" "error"="Run(git submodule update --init --recursive --depth 2): exit status 128: { stdout: "", stderr: "fatal: No url found for submodule path 'COMPANY_NAME/PACKAGE_PATH/PACKAGE' in .gitmodules\n" }"
This pointed to a misconfigured .gitmodules file that was not updated when the structure of our DAG repo was changed.
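To catch this kind of mismatch before git-sync does, something like the following could be run in a checkout of the DAG repo. It is a minimal sketch, not a complete validator: the function name find_unmapped_submodules is hypothetical, and it only checks that every submodule entry recorded in the index (mode 160000) has a matching path in .gitmodules. It does not verify that each entry also has a url.

```shell
# List submodule paths recorded in the index that have no matching
# "path" entry in .gitmodules.
# Usage: find_unmapped_submodules [repo_dir]
find_unmapped_submodules() {
  local repo="${1:-.}"
  local path
  # Gitlink entries (submodules) are stored in the index with mode 160000.
  git -C "$repo" ls-files --stage | awk '$1 == "160000"' | cut -f2 |
  while IFS= read -r path; do
    # Compare against the paths declared in .gitmodules; a missing
    # .gitmodules file is treated as having no declared paths.
    if ! git -C "$repo" config -f .gitmodules \
           --get-regexp '^submodule\..*\.path$' 2>/dev/null |
         cut -d' ' -f2- | grep -Fxq -- "$path"; then
      printf '%s\n' "$path"
    fi
  done
}
```

Any path it prints is a candidate for the "No url found for submodule path" error above. Empty output means every recorded submodule path is at least declared in .gitmodules.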