AWS Data Pipeline EMR cluster with Spark


I have created an AWS Data Pipeline using the EMR template, but it's not installing Spark on the EMR cluster. Do I need to set a special action for that? I see that a bootstrap action is needed to install Spark, but that is not working either.


There is 1 best solution below


The install-spark bootstrap action only works with 3.x AMI versions. If you are using a releaseLabel (emr-4.x or later), the applications to install are specified differently: as a field on the cluster rather than through a bootstrap action.

When you are creating a pipeline, click "Edit in Architect" at the bottom (or open an existing pipeline for editing from the pipelines home page). Then click the EmrCluster node and select Applications from the "Add an optional field..." dropdown. That is where you can add Spark.
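If you define the pipeline outside the console, the same setting should show up as an `applications` field on the EmrCluster object. Below is a rough sketch of what that object might look like, plus a small helper that reshapes it into the id/name/fields form the Data Pipeline API takes. The object id, the release label value, the instance types, and the helper name `to_api_object` are all made up for illustration; check them against your own pipeline before relying on any of it.

```python
import json

# Hypothetical EmrCluster object, written the way it would appear in a
# pipeline definition file. All values here are illustrative assumptions.
emr_cluster = {
    "id": "EmrClusterForSpark",         # hypothetical object id
    "name": "EmrClusterForSpark",
    "type": "EmrCluster",
    "releaseLabel": "emr-4.2.0",        # assumed; any emr-4.x or later label
    "applications": ["spark"],          # replaces the install-spark bootstrap action
    "masterInstanceType": "m3.xlarge",  # assumed instance type
    "coreInstanceType": "m3.xlarge",    # assumed instance type
    "coreInstanceCount": "2",
}


def to_api_object(obj):
    """Reshape a definition-file style object into id/name/fields form.

    List values become one field entry per item. References to other
    pipeline objects ({"ref": ...}) are not handled in this sketch.
    """
    fields = []
    for key, value in obj.items():
        if key in ("id", "name"):
            continue
        values = value if isinstance(value, list) else [value]
        for item in values:
            fields.append({"key": key, "stringValue": str(item)})
    return {"id": obj["id"], "name": obj["name"], "fields": fields}


api_object = to_api_object(emr_cluster)
print(json.dumps(api_object, indent=2))
```

The resulting dictionary is the kind of entry you would pass in the `pipelineObjects` list of boto3's `put_pipeline_definition` call. Note that there is no bootstrap action for Spark anywhere in the object; with a releaseLabel, the `applications` field does that job.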