I need to install the AWS CLI on Google Cloud Composer so that I can use it with BashOperator in Airflow DAGs.
The AWS CLI documentation explains how to install it as a package, but Cloud Composer has no supported way to install apt packages on all instances.
My motivation: I need to synchronize a large S3 bucket with another storage system. The aws s3 sync command (link) is a perfect fit for this. Unfortunately, I couldn't find an equivalent among the Airflow Amazon provider operators. It also seems that this command is not supported by boto or boto3 (GitHub issue 1, issue 2).
To install the AWS CLI on Cloud Composer, you can use the following steps:
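The steps can be sketched as a single shell command that a BashOperator task would run. This is only a minimal sketch under assumptions: it supposes that the workers allow user-level pip installs (pip install --user), that the awscli package from PyPI (AWS CLI v1) is acceptable, and that user-level scripts end up in $HOME/.local/bin. The helper name build_install_command is hypothetical.

```python
import shlex


def build_install_command(region: str) -> str:
    """Return a bash command that installs the AWS CLI with pip and sets its default region.

    Assumption: user-level pip installs are permitted on the worker.
    """
    steps = [
        # Install (or upgrade) AWS CLI v1 from PyPI into the user's site-packages.
        "python3 -m pip install --user --upgrade awscli",
        # Make the freshly installed `aws` script visible to the rest of the command.
        'export PATH="$HOME/.local/bin:$PATH"',
        # Store the default region in the AWS CLI configuration.
        f"aws configure set region {shlex.quote(region)}",
        # Fail the task early if the CLI is still not usable.
        "aws --version",
    ]
    return " && ".join(steps)


# The resulting string is meant to be passed as bash_command to a BashOperator.
print(build_install_command("AWS_REGION"))
```

Because the steps are chained with &&, the task fails as soon as one of them fails, which makes a broken install visible in the task log.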
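Replace AWS_REGION with the name of a supported AWS region.
Once the DAG has run, the AWS CLI should be installed on the Airflow worker nodes in your Cloud Composer environment. You can then use the AWS CLI in your Airflow DAGs to interact with AWS services. Keep in mind that a task runs on a single worker, so if a later task cannot find the aws command, run the installation in the same task that uses it.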
Note that you will need to have the pip Python package installed on your Cloud Composer environment in order to use the above steps. You can install pip using the following command:
Here is an example of a complete Airflow DAG that installs the AWS CLI and configures it with your AWS credentials:
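A minimal sketch of such a DAG, assuming Airflow 2.x, user-level pip installs on the workers, and credentials stored in two Airflow Variables named aws_access_key_id and aws_secret_access_key (hypothetical names, as are the DAG id, the task id, and the build_dag helper). The Airflow imports sit inside build_dag so that the module can also be imported where Airflow is not installed.

```python
from datetime import datetime

DAG_ID = "install_aws_cli"  # hypothetical DAG id

# bash_command is a templated field, so the {{ var.value.* }} expressions are
# rendered from Airflow Variables (assumed to exist) when the task runs.
INSTALL_AND_CONFIGURE = """
set -euo pipefail
python3 -m pip install --user --upgrade awscli
export PATH="$HOME/.local/bin:$PATH"
aws configure set aws_access_key_id "{{ var.value.aws_access_key_id }}"
aws configure set aws_secret_access_key "{{ var.value.aws_secret_access_key }}"
aws configure set region "AWS_REGION"
aws --version
"""


def build_dag():
    """Build a manually triggered DAG with one install-and-configure task."""
    # Imported lazily: these modules exist only where Airflow 2.x is installed.
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id=DAG_ID,
        start_date=datetime(2023, 1, 1),
        schedule_interval=None,  # no schedule: run only when triggered
        catchup=False,
    ) as dag:
        BashOperator(
            task_id="install_and_configure_aws_cli",
            bash_command=INSTALL_AND_CONFIGURE,
        )
    return dag
```

To let Airflow discover the DAG, add dag = build_dag() at module level in the file you place in the environment's DAGs folder, and replace AWS_REGION with your region.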
Once you have saved the DAG, you can run it using the following command:
Once the DAG has run, the AWS CLI will be installed on all of the Airflow worker nodes in your Cloud Composer environment. You can then use the AWS CLI in your Airflow DAGs to interact with AWS services.