Is there a Lambda function that can initiate the execution of an EMR Studio notebook containing a PySpark script (.py) performing basic operations such as reading from and writing to an S3 bucket on an EMR cluster?
I tried the following two approaches:
1. Starting a notebook execution directly:

```python
response = emr_client.start_notebook_execution(
    EditorId=notebook_id,
    RelativePath='test_pyspark.py',
    ExecutionEngine={'Id': cluster_id},
    ServiceRole='EMR_Notebooks_DefaultRole'
)
```
2. Submitting a `spark-submit` step to the cluster:

```python
steps = [
    {
        'Name': 'Test_Space',
        'ActionOnFailure': 'CONTINUE',  # Define action on failure
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': [
                'spark-submit',
                '--deploy-mode', 'cluster',
                '--master', 'yarn',
                's3://aws-emr-resources-us-east-1/test_pyspark.py'
            ]
        }
    }
]
```

I expected the script to run and save its results in S3, but no results were saved.
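For reference, a minimal sketch of how the second approach could be wired into a Lambda handler is shown below. The cluster ID is assumed to arrive in the invocation `event`, and the S3 script path is a placeholder; `add_job_flow_steps` is the boto3 EMR call that actually submits the step list to the cluster. `boto3` is imported inside the handler so the step-building helper can be exercised without an AWS environment.

```python
# Hypothetical Lambda handler sketch -- bucket path, step name, and the
# event key 'cluster_id' are placeholder assumptions, not confirmed values.

SCRIPT_S3_PATH = 's3://aws-emr-resources-us-east-1/test_pyspark.py'  # placeholder

def build_spark_step(script_path, name='Test_Space'):
    """Build a spark-submit step definition for add_job_flow_steps."""
    return {
        'Name': name,
        'ActionOnFailure': 'CONTINUE',  # keep the cluster running if the step fails
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': [
                'spark-submit',
                '--deploy-mode', 'cluster',
                '--master', 'yarn',
                script_path,
            ],
        },
    }

def lambda_handler(event, context):
    import boto3  # imported here so build_spark_step stays testable offline
    emr = boto3.client('emr')
    cluster_id = event['cluster_id']  # assumed to be passed in the event payload
    response = emr.add_job_flow_steps(
        JobFlowId=cluster_id,
        Steps=[build_spark_step(SCRIPT_S3_PATH)],
    )
    # The step runs asynchronously on the cluster; the caller can poll
    # emr.describe_step with these IDs to check for COMPLETED or FAILED.
    return {'step_ids': response['StepIds']}
```

Note that the step only *submits* the job; whether anything lands in S3 depends on the script itself succeeding, which is why checking the step's state and its stderr logs (under the cluster's S3 log URI) is usually the first debugging move.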