We recently upgraded from Cloud Composer 1 to Composer 2.
Composer version: 2.6.2
Airflow version: 2.5.3
Initially we used the BaseHook library below, together with the Cloud SQL proxy, to connect to our PostgreSQL Cloud SQL instance.
from airflow.hooks.base_hook import BaseHook
def sql_alchemy_engine(conn_id):
    return BaseHook.get_connection(conn_id).get_hook().get_sqlalchemy_engine(engine_kwargs={'echo': False})
Below are our Cloud SQL proxy config files:
**test_edl-cloud-sql-proxy.yml**
apiVersion: apps/v1
kind: Deployment
metadata:
  name: edl-cloud-sql-proxy
  namespace: default
spec:
  selector:
    matchLabels:
      app: edl-cloud-sql-proxy
  template:
    metadata:
      labels:
        app: edl-cloud-sql-proxy
    spec:
      containers:
      - command:
        - /cloud_sql_proxy
        - -dir=/cloudsql
        - -instances=test-project:europe-west1-d:cloud-sql-name=tcp:0.0.0.0:5432
        - -ip_address_types=PRIVATE
        - term_timeout=10s
        image: gcr.io/cloudsql-docker/gce-proxy:1.14
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command:
              - sleep
              - "10"
        name: edl-cloud-sql-proxy
        ports:
        - containerPort: 5432
          name: port-postgres
          protocol: TCP
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          runAsUser: 2
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /cloudsql
          name: cloudsql
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: cloudsql
**test_mapp-sqlproxy-service.yml**
apiVersion: v1
kind: Service
metadata:
  name: mapp-sqlproxy-service
spec:
  type: ClusterIP
  selector:
    app: edl-cloud-sql-proxy
  ports:
  - port: 5432
    protocol: TCP
    targetPort: port-postgres
We then applied these YAML files to create the deployment and the service:
kubectl apply -f test_edl-cloud-sql-proxy.yml
kubectl apply -f test_mapp-sqlproxy-service.yml
xyz@test-cloudbuild-bastion:~$ kubectl get services --all-namespaces
NAMESPACE         NAME                                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                               AGE
composer-system   airflow-monitoring-service           ClusterIP   10.125.54.181    <none>        8125/UDP,8126/UDP,4317/TCP,4318/TCP   6d22h
composer-system   airflow-redis-service                ClusterIP   10.125.31.163    <none>        6379/TCP                              6d22h
composer-system   custom-metrics-stackdriver-adapter   ClusterIP   10.125.9.151     <none>        443/TCP                               6d22h
default           kubernetes                           ClusterIP   10.125.0.1       <none>        443/TCP                               6d23h
default           mapp-sqlproxy-service                ClusterIP   10.125.193.169   <none>        5432/TCP                              3h4m
gke-gmp-system    alertmanager                         ClusterIP   None             <none>        9093/TCP                              6d23h
gke-gmp-system    gmp-operator                         ClusterIP   10.125.36.23     <none>        8443/TCP,443/TCP                      6d23h
kube-system       antrea                               ClusterIP   10.125.50.137    <none>        443/TCP                               6d23h
kube-system       default-http-backend                 NodePort    10.125.222.139   <none>        80:32348/TCP                          6d23h
kube-system       kube-dns                             ClusterIP   10.125.0.10      <none>        53/UDP,53/TCP                         6d23h
kube-system       metrics-server                       ClusterIP   10.125.177.226   <none>        443/TCP                               6d23h
xyz@test-cloudbuild-bastion:~$ psql -h mapp-sqlproxy-service.default.svc.cluster.local --user seed-cloudbuild
psql: could not translate host name "mapp-sqlproxy-service.default.svc.cluster.local" to address: Temporary failure in name resolution
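To narrow down whether the failure is name resolution or the proxy itself, the two steps can be tested separately from inside the cluster (for example from an Airflow task), since `*.svc.cluster.local` names are normally only resolvable through the cluster's DNS and a bastion VM may not use it. Below is a minimal sketch using only the Python standard library; `check_tcp` is a hypothetical helper name, not part of Airflow or the proxy.

```python
import socket
import time


def check_tcp(host, port, timeout=3.0):
    """Resolve host, then try a TCP connect; report how far we got."""
    result = {"host": host, "port": port, "resolved": None,
              "connected": False, "seconds": None, "error": None}
    start = time.perf_counter()
    try:
        # Step 1: DNS resolution (the step psql failed on above).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = infos[0][4][0]
        # Step 2: plain TCP connect to the first resolved address.
        with socket.create_connection((result["resolved"], port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:  # gaierror, refused connections and timeouts are OSError subclasses
        result["error"] = f"{type(exc).__name__}: {exc}"
    result["seconds"] = round(time.perf_counter() - start, 3)
    return result


# Example call (meant to be run from a pod or Airflow task in the cluster):
#   check_tcp("mapp-sqlproxy-service.default.svc.cluster.local", 5432)
```

If `resolved` is empty, the problem is DNS. If the name resolves but `connected` stays false, the service or the proxy pod would be the next thing to look at.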
Our DAG was also unable to establish the connection:
  File "/opt/python3.8/lib/python3.8/site-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
We learned that base_hook has been deprecated in Airflow 2, so we changed our code as follows:
from airflow.hooks.postgres_hook import PostgresHook
def sql_alchemy_engine(conn_id):
    postgres_hook = PostgresHook(conn_id)
    engine = postgres_hook.get_sqlalchemy_engine()
    return engine
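Whichever hook is used, the host stored in the Airflow connection behind `conn_id` decides whether traffic goes through the proxy service or straight to the instance IP. Airflow can take a connection as a URI (for example through an `AIRFLOW_CONN_<CONN_ID>` environment variable), so here is a rough sketch of what a URI pointing at the proxy service could look like. The helper name, user, password and database are placeholders, not our real values.

```python
from urllib.parse import quote


def airflow_postgres_uri(user, password, host, port=5432, database=""):
    """Build a Postgres connection URI in the form Airflow accepts."""
    # Percent-encode credentials so characters like '@' or '/' do not break the URI.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"postgres://{auth}@{host}:{port}/{quote(database, safe='')}"


# Hypothetical values: only the service host name comes from the config above.
proxy_uri = airflow_postgres_uri(
    "example_user",
    "p@ss/word",
    "mapp-sqlproxy-service.default.svc.cluster.local",
    5432,
    "example_db",
)
print(proxy_uri)
```

Swapping the host for the instance's private IP gives the direct variant, which makes it easy to compare the two paths with the same DAG code.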
However, we are still unable to connect via the Cloud SQL proxy, although we can connect when we specify the Cloud SQL IP directly. The connection to Cloud SQL is now extremely slow, and we are not sure why. Is it because we are no longer connecting via the Cloud SQL proxy?
What is the advantage of connecting via the Cloud SQL proxy compared with connecting to the Cloud SQL IP directly?
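To put numbers on "extremely slow", one option is to time connection setup separately from the queries, so the two paths (proxy versus direct IP) can be compared. Below is a minimal standard-library sketch. `time_calls` is a hypothetical helper, and the connection id in the usage comment is a placeholder.

```python
import statistics
import time


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return min/median/max wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Usage idea (not run here): time only opening and closing a connection, e.g.
#   time_calls(lambda: sql_alchemy_engine("my_conn_id").connect().close())
```

If connection setup dominates, the slowness is probably in the network path or authentication and not in the queries themselves.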
