I am new to Pentaho. I am using the MapR distribution. When I submit a Spark job, I get the error below. Please help me with this. I have done the necessary configuration to integrate Spark and Pentaho. Please find attached screenshots of the Pentaho Spark Submit job.
2017/09/12 12:41:44 - Spoon - Starting job...
2017/09/12 12:41:44 - spark_submit - Start of job execution
2017/09/12 12:41:44 - spark_submit - Starting entry [Spark Submit]
2017/09/12 12:41:44 - Spark Submit - Submitting Spark Script
2017/09/12 12:41:45 - Spark Submit - Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
2017/09/12 12:41:45 - Spark Submit - SLF4J: Class path contains multiple SLF4J bindings.
2017/09/12 12:41:45 - Spark Submit - SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017/09/12 12:41:45 - Spark Submit - SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017/09/12 12:41:45 - Spark Submit - SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2017/09/12 12:41:45 - Spark Submit - SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2017/09/12 12:41:45 - Spark Submit - 17/09/12 12:41:45 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017/09/12 12:41:46 - Spark Submit - 17/09/12 12:41:46 ERROR MapRZKRMFinderUtils: Zookeeper address not configured in Yarn configuration. Please check yarn-site.xml.
2017/09/12 12:41:46 - Spark Submit - 17/09/12 12:41:46 ERROR MapRZKRMFinderUtils: Unable to determine ResourceManager service address from Zookeeper.
2017/09/12 12:41:46 - Spark Submit - 17/09/12 12:41:46 ERROR MapRZKBasedRMFailoverProxyProvider: Unable to create proxy to the ResourceManager null
2017/09/12 12:41:46 - Spark Submit - Exception in thread "main" java.lang.RuntimeException: Unable to create proxy to the ResourceManager null
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.getProxy(MapRZKBasedRMFailoverProxyProvider.java:135)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.io.retry.RetryInvocationHandler$ProxyDescriptor.<init>(RetryInvocationHandler.java:195)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:304)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:298)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:58)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:95)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:73)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:193)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:152)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.yarn.Client.run(Client.scala:1154)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1213)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.yarn.Client.main(Client.scala)
2017/09/12 12:41:46 - Spark Submit - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2017/09/12 12:41:46 - Spark Submit - at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2017/09/12 12:41:46 - Spark Submit - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2017/09/12 12:41:46 - Spark Submit - at java.lang.reflect.Method.invoke(Method.java:498)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:733)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:177)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:202)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:116)
2017/09/12 12:41:46 - Spark Submit - at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2017/09/12 12:41:46 - Spark Submit - Caused by: java.lang.RuntimeException: Zookeeper address not found from MapR Filesystem and is also not configured in Yarn configuration.
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.yarn.client.MapRZKRMFinderUtils.mapRZkBasedRMFinder(MapRZKRMFinderUtils.java:99)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.updateCurrentRMAddress(MapRZKBasedRMFailoverProxyProvider.java:64)
2017/09/12 12:41:46 - Spark Submit - at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.getProxy(MapRZKBasedRMFailoverProxyProvider.java:131)
2017/09/12 12:41:46 - Spark Submit - ... 21 more
2017/09/12 12:41:46 - spark_submit - Finished job entry [Spark Submit] (result=[false])
2017/09/12 12:41:46 - spark_submit - Starting entry [Spark Submit]
2017/09/12 12:41:46 - Spark Submit - Submitting Spark Script
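Note that the fatal exception in the log above is "Zookeeper address not found from MapR Filesystem and is also not configured in Yarn configuration", and the MapR error message explicitly says to check yarn-site.xml. A minimal fragment supplying the Zookeeper quorum might look like the sketch below. This is an assumption, not a verified fix: the property name follows stock YARN, the hosts zk1/zk2/zk3 are placeholders, and 5181 is MapR's default Zookeeper port; verify against the MapR documentation for your release.

```
<!-- Sketch only: zk1/zk2/zk3 are placeholder hostnames; confirm the
     property name and port against your MapR version's documentation. -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1:5181,zk2:5181,zk3:5181</value>
</property>
```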
In my opinion, PDI tells you the problem on line 6:

Class path contains multiple SLF4J bindings

with a reference to a detailed explanation four lines below: http://www.slf4j.org/codes.html#multiple_bindings. The ambiguous class is StaticLoggerBinder, which may be loaded from either

/opt/mapr/lib/slf4j-log4j12-1.7.12.jar

or

/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar

Remove one of them and restart. Be aware, though, that the multiple-bindings message is only a warning; the job actually dies on "Zookeeper address not found from MapR Filesystem and is also not configured in Yarn configuration", so the Zookeeper/ResourceManager settings in yarn-site.xml are worth checking as well.
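The cleanup described above can be sketched as a small shell routine: find every jar that bundles an SLF4J binding, keep one, and move the rest out of the classpath. To keep the snippet runnable anywhere, it simulates the two jars from the log in a temporary directory; on a real cluster you would point it at /opt/mapr/lib and /opt/hadoop/share/hadoop/common/lib instead.

```shell
# Simulated classpath directory standing in for the real lib dirs.
libdir=$(mktemp -d)
touch "$libdir/slf4j-log4j12-1.7.12.jar" "$libdir/slf4j-log4j12-1.7.10.jar"

# Count the competing SLF4J bindings (the log showed exactly two).
count=$(ls "$libdir"/slf4j-log4j12-*.jar | wc -l)
echo "found $count SLF4J bindings"          # prints: found 2 SLF4J bindings

# Keep the newer binding; move the older one out of the classpath
# (backing it up rather than deleting, in case it is needed again).
mkdir -p "$libdir/backup"
mv "$libdir/slf4j-log4j12-1.7.10.jar" "$libdir/backup/"
echo "remaining: $(ls "$libdir"/slf4j-log4j12-*.jar | wc -l)"   # prints: remaining: 1
```

After removing the duplicate jar, restart Spoon so the new classpath takes effect.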