I am running a Hive script that does some manipulation on a table of 5,452,689 rows (about 7 GB of data). However, the MapReduce job gets stuck at around 70% map progress and fails with a "no space left" error.
The error is as follows:
Hadoop job information for Stage-4: number of mappers: 27; number of reducers: 1
2015-06-23 01:47:43,748 Stage-4 map = 0%, reduce = 0%
2015-06-23 01:48:04,550 Stage-4 map = 1%, reduce = 0%, Cumulative CPU 258.06 sec
2015-06-23 01:48:06,661 Stage-4 map = 2%, reduce = 0%, Cumulative CPU 331.98 sec
2015-06-23 01:48:07,796 Stage-4 map = 3%, reduce = 0%, Cumulative CPU 370.35 sec
2015-06-23 01:48:09,931 Stage-4 map = 4%, reduce = 0%, Cumulative CPU 406.65 sec
2015-06-23 01:48:28,778 Stage-4 map = 7%, reduce = 0%, Cumulative CPU 973.33 sec
2015-06-23 01:48:30,987 Stage-4 map = 8%, reduce = 0%, Cumulative CPU 1034.17 sec
2015-06-23 01:48:34,251 Stage-4 map = 11%, reduce = 0%, Cumulative CPU 1121.22 sec
2015-06-23 01:48:35,419 Stage-4 map = 13%, reduce = 0%, Cumulative CPU 1173.4 sec
2015-06-23 01:48:36,458 Stage-4 map = 18%, reduce = 0%, Cumulative CPU 1191.29 sec
2015-06-23 01:48:37,499 Stage-4 map = 22%, reduce = 0%, Cumulative CPU 1215.44 sec
2015-06-23 01:48:38,607 Stage-4 map = 29%, reduce = 0%, Cumulative CPU 1267.07 sec
2015-06-23 01:48:39,671 Stage-4 map = 32%, reduce = 0%, Cumulative CPU 1289.57 sec
2015-06-23 01:48:40,883 Stage-4 map = 34%, reduce = 0%, Cumulative CPU 1309.96 sec
2015-06-23 01:48:41,922 Stage-4 map = 36%, reduce = 0%, Cumulative CPU 1366.31 sec
2015-06-23 01:48:48,693 Stage-4 map = 39%, reduce = 0%, Cumulative CPU 1554.9 sec
2015-06-23 01:48:54,121 Stage-4 map = 40%, reduce = 0%, Cumulative CPU 1709.04 sec
2015-06-23 01:49:00,973 Stage-4 map = 43%, reduce = 0%, Cumulative CPU 1895.86 sec
2015-06-23 01:49:03,099 Stage-4 map = 46%, reduce = 0%, Cumulative CPU 1976.89 sec
2015-06-23 01:49:05,180 Stage-4 map = 49%, reduce = 0%, Cumulative CPU 2003.08 sec
2015-06-23 01:49:06,225 Stage-4 map = 58%, reduce = 0%, Cumulative CPU 2062.33 sec
2015-06-23 01:49:07,353 Stage-4 map = 60%, reduce = 0%, Cumulative CPU 2067.9 sec
2015-06-23 01:49:08,388 Stage-4 map = 66%, reduce = 0%, Cumulative CPU 2087.55 sec
2015-06-23 01:49:09,551 Stage-4 map = 74%, reduce = 2%, Cumulative CPU 2112.96 sec
2015-06-23 01:49:10,607 Stage-4 map = 75%, reduce = 2%, Cumulative CPU 2118.14 sec
2015-06-23 01:49:11,669 Stage-4 map = 19%, reduce = 0%, Cumulative CPU 433.75 sec
2015-06-23 01:49:12,699 Stage-4 map = 16%, reduce = 0%, Cumulative CPU 350.93 sec
2015-06-23 01:49:14,760 Stage-4 map = 15%, reduce = 0%, Cumulative CPU 263.95 sec
2015-06-23 01:49:26,177 Stage-4 map = 16%, reduce = 0%, Cumulative CPU 341.29 sec
2015-06-23 01:49:31,365 Stage-4 map = 15%, reduce = 0%, Cumulative CPU 334.86 sec
2015-06-23 01:49:39,713 Stage-4 map = 23%, reduce = 0%, Cumulative CPU 300.53 sec
2015-06-23 01:49:40,758 Stage-4 map = 15%, reduce = 0%, Cumulative CPU 300.53 sec
2015-06-23 01:49:43,868 Stage-4 map = 100%, reduce = 100%, Cumulative CPU 263.95 sec
MapReduce Total cumulative CPU time: 4 minutes 23 seconds 950 msec
Ended Job = job_1434953415026_0374 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1434953415026_0374_m_000025 (and more) from job job_1434953415026_0374
Examining task ID: task_1434953415026_0374_m_000004 (and more) from job job_1434953415026_0374
Examining task ID: task_1434953415026_0374_m_000019 (and more) from job job_1434953415026_0374
Examining task ID: task_1434953415026_0374_m_000003 (and more) from job job_1434953415026_0374
Examining task ID: task_1434953415026_0374_m_000022 (and more) from job job_1434953415026_0374
Examining task ID: task_1434953415026_0374_m_000002 (and more) from job job_1434953415026_0374
Examining task ID: task_1434953415026_0374_m_000005 (and more) from job job_1434953415026_0374
Task with the most failures(4):
-----
Task ID:
task_1434953415026_0374_m_000000
URL:
http://pfaquaap1u:8088/taskdetails.jsp?jobid=job_1434953415026_0374&tipid=task_1434953415026_0374_m_000000
-----
Diagnostic Messages for this Task:
FSError: java.io.IOException: No space left on device
However, running `df` shows that I have over 80 GB free on the local filesystem, and `df -i` shows that none of the inodes are exhausted. The script runs fine if I use a smaller data input. Can anybody tell me what I should do here? Thanks in advance.
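One thing worth checking (an assumption on my part, not something the log confirms): mappers spill their intermediate data to the NodeManager's local directories, which may sit on a different, smaller mount than the one your `df` reported. A rough sketch of how to check them, using the Hadoop 2.x property name `yarn.nodemanager.local-dirs` (on Hadoop 1.x the equivalent is `mapred.local.dir`):

```shell
# Look up where MapReduce intermediate data is spilled on this node.
# If the config lookup fails (no Hadoop client on PATH), fall back to
# /tmp, which is the parent of the Hadoop defaults.
LOCAL_DIRS=$(hdfs getconf -confKey yarn.nodemanager.local-dirs 2>/dev/null)
LOCAL_DIRS=${LOCAL_DIRS:-/tmp}

# The property is a comma-separated list; report free space per mount.
df -h $(echo "$LOCAL_DIRS" | tr ',' ' ')
```

If those mounts fill up while the job runs, you get exactly this failure pattern: the job dies mid-map even though the root filesystem looks fine afterwards, because the spill files are cleaned up when the task is killed.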
I don't think `df` will show you the HDFS disk usage. Instead, try `hadoop fs -df -h` (see this question: HDFS free space available command). I think you might just have to add more nodes to your cluster!
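To illustrate the distinction: the two commands below report different things, and only the second reflects what the cluster can actually write. A minimal sketch (the `hadoop` invocation assumes a configured Hadoop client on the PATH, so it is guarded):

```shell
# Free space on this one machine's local filesystem -- what the asker
# already checked, and not what HDFS or the job's spill dirs see:
df -h

# Capacity and usage of HDFS itself, aggregated across datanodes;
# only runs if a hadoop client is installed and configured:
command -v hadoop >/dev/null && hadoop fs -df -h
```

An individual node having 80 GB free does not guarantee that the DataNode directories or the HDFS cluster as a whole have room.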