Is there a Lambda function that can initiate the execution of an EMR Studio notebook containing a PySpark script (.py) performing basic operations such as reading from and writing to an S3 bucket on an EMR cluster?
I tried the following two approaches:
1. Starting a notebook execution directly:

```python
response = emr_client.start_notebook_execution(
    EditorId=notebook_id,
    RelativePath='test_pyspark.py',
    ExecutionEngine={'Id': cluster_id},
    ServiceRole='EMR_Notebooks_DefaultRole'
)
```
2. Submitting a `spark-submit` step to the cluster:

```python
steps = [
    {
        'Name': 'Test_Space',
        'ActionOnFailure': 'CONTINUE',  # Define action on failure
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': [
                'spark-submit',
                '--deploy-mode', 'cluster',
                '--master', 'yarn',
                's3://aws-emr-resources-us-east-1/test_pyspark.py'
            ]
        }
    }
]
```

I expected the script to run and save its results in S3, but no results were saved.
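For reference, a minimal sketch of how the second approach could be wired into a Lambda handler is shown below. The cluster ID is assumed to arrive in the invocation `event`, and the S3 script path is a placeholder; `add_job_flow_steps` is the boto3 EMR call that actually submits the step list to the cluster. `boto3` is imported inside the handler so the step-building helper can be exercised without an AWS environment.

```python
# Hypothetical Lambda handler sketch -- bucket path, step name, and the
# event key 'cluster_id' are placeholder assumptions, not confirmed values.

SCRIPT_S3_PATH = 's3://aws-emr-resources-us-east-1/test_pyspark.py'  # placeholder

def build_spark_step(script_path, name='Test_Space'):
    """Build a spark-submit step definition for add_job_flow_steps."""
    return {
        'Name': name,
        'ActionOnFailure': 'CONTINUE',  # keep the cluster running if the step fails
        'HadoopJarStep': {
            'Jar': 'command-runner.jar',
            'Args': [
                'spark-submit',
                '--deploy-mode', 'cluster',
                '--master', 'yarn',
                script_path,
            ],
        },
    }

def lambda_handler(event, context):
    import boto3  # imported here so build_spark_step stays testable offline
    emr = boto3.client('emr')
    cluster_id = event['cluster_id']  # assumed to be passed in the event payload
    response = emr.add_job_flow_steps(
        JobFlowId=cluster_id,
        Steps=[build_spark_step(SCRIPT_S3_PATH)],
    )
    # The step runs asynchronously on the cluster; the caller can poll
    # emr.describe_step with these IDs to check for COMPLETED or FAILED.
    return {'step_ids': response['StepIds']}
```

Note that the step only *submits* the job; whether anything lands in S3 depends on the script itself succeeding, which is why checking the step's state and its stderr logs (under the cluster's S3 log URI) is usually the first debugging move.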