There are a few similar questions on SO, but none of the suggested fixes has worked for me, so I am posting this question.

I am using CDH 6.2.1.

I have an Oozie workflow that has a map-reduce action. The map-reduce job creates a lot of counters (I think about 300).

I have set the mapreduce.job.counters.max property to 8192 in the CDH YARN configuration (cdh/yarn/config).

I have also set the same property in the following:

  • YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml
  • YARN Service MapReduce Advanced Configuration Snippet (Safety Valve)
  • MapReduce Client Advanced Configuration Snippet (Safety Valve) for mapred-site.xml

If I run the map-reduce job as a stand-alone YARN job (using the yarn jar command on the command line), the job completes successfully.

When I run the job as part of the workflow:

  • On the YARN All Applications page, the Oozie launcher job completes successfully.
  • On the same page, the map-reduce job also completes successfully.
  • However, Oozie fails the action, reporting: LimitExceededException: Too many counters: 121 max=120

The configuration of both the map-reduce job and the Oozie launcher, as reported by YARN, contains the setting:

<property>
     <name>mapreduce.job.counters.max</name>
     <value>8192</value>
     <final>true</final>
     <source>yarn-site.xml</source>
</property>

The Oozie web interface (System Info / OS Env) reports the following HADOOP_CONF_DIR: /var/run/cloudera-scm-agent/process/459-oozie-OOZIE_SERVER/yarn-conf/

In that folder I can see that the mapred-site.xml also has:

    <!--'mapreduce.job.counters.max', originally set to '8192' (final), is overridden below by a safety valve-->
  <property>
    <name>mapreduce.job.counters.max</name>
    <value>8192</value>
    <final>true</final>
  </property>

However, I cannot find that property in the yarn-site.xml in the same folder.
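
To make the manual check above repeatable, here is a minimal sketch that scans every *-site.xml file in a configuration directory and reports where a property is defined. It assumes Python 3 is available on the Oozie host and that the files use the standard Hadoop `<configuration>/<property>` layout; `find_property` is a hypothetical helper name, not part of Hadoop or Oozie.

```python
import os
import xml.etree.ElementTree as ET


def find_property(conf_dir, name):
    """Return {filename: [(value, is_final), ...]} for each *-site.xml
    file in conf_dir that defines the property `name`.

    Hypothetical helper for illustration only. It reads the files as
    plain XML and does not apply Hadoop's override or <final> rules.
    """
    found = {}
    for filename in sorted(os.listdir(conf_dir)):
        if not filename.endswith("-site.xml"):
            continue
        path = os.path.join(conf_dir, filename)
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        for prop in root.iter("property"):
            if (prop.findtext("name") or "").strip() != name:
                continue
            value = (prop.findtext("value") or "").strip()
            is_final = (prop.findtext("final") or "").strip().lower() == "true"
            # Record every definition found in this file, in document order.
            found.setdefault(filename, []).append((value, is_final))
    return found
```

Calling `find_property("/var/run/cloudera-scm-agent/process/459-oozie-OOZIE_SERVER/yarn-conf/", "mapreduce.job.counters.max")` would return one entry per file that defines the property, so a file missing from the result (such as yarn-site.xml here) does not set it.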

I am not sure what else I can do at this point...

There is 1 best solution below

This is an Oozie issue that has been resolved. However, the fix is not available in the current version of Cloudera.

I am posting this here, in case anyone else has the same issue.