Why can't we calculate job execution time in Hadoop?

2.4k Views Asked by Flowra At 11 November 2014 at 23:04

My question is related to the straggler problem. In sort, for example, we have an algorithm whose complexity we know, so we can calculate its running time when it is executed on a fixed set of data.

Why can't we acquire the job execution time in Hadoop?

If we could acquire the job execution time or the task execution time, we could identify straggler tasks quickly, without needing a separate algorithm to work out which task is a straggler.
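To make the idea concrete, here is a minimal sketch of what "knowing the stragglers from execution times" could look like, assuming per-task durations are already available. The function name `find_stragglers`, the 1.5x median threshold, and the sample durations are all hypothetical; this is not the heuristic Hadoop's speculative execution uses.

```python
from statistics import median


def find_stragglers(durations, threshold=1.5):
    """Return the IDs of tasks that ran longer than threshold * median duration.

    durations: dict mapping a task ID to its run time in seconds.
    threshold: assumed cut-off factor, chosen only for illustration.
    """
    if not durations:
        return []
    typical = median(durations.values())
    return sorted(t for t, d in durations.items() if d > threshold * typical)


# Hypothetical task durations in seconds; task_2 is clearly slower than the rest.
durations = {"task_0": 41.0, "task_1": 39.5, "task_2": 118.0, "task_3": 40.2}
print(find_stragglers(durations))
```

Note that this only works once tasks have finished (or at least reported elapsed time), which is the core of the question: the durations are observed, not predicted.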


There are 2 best solutions below

Chandrasekhar On 16 October 2015 at 10:06 BEST ANSWER

You should not try to estimate how long a job will take before running it. After running your MapReduce job, you can measure the time it took. MapReduce performance always depends on your cluster capacity (RAM size, CPU cores and network bandwidth) and on how many reducers you set for the job.

You can only make rough assumptions based on your RAM size divided by the input split size.
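As a rough illustration of that kind of back-of-the-envelope reasoning, the sketch below estimates how many map tasks an input produces and how many "waves" of tasks the cluster needs to run them. The function name, the per-container memory figure, and all sample sizes are assumptions made up for the example; it ignores CPU, network, data skew and reducers, so it gives a shape, not a running time.

```python
import math


def estimate_map_waves(input_bytes, split_bytes, cluster_ram_bytes, container_ram_bytes):
    """Rough capacity estimate, not a time prediction.

    Returns (map_tasks, concurrent_containers, waves).
    """
    # One map task per input split.
    map_tasks = math.ceil(input_bytes / split_bytes)
    # Assume memory is the only limit on how many tasks run at once.
    concurrent = max(1, cluster_ram_bytes // container_ram_bytes)
    # Tasks beyond the concurrent capacity have to wait for a later wave.
    waves = math.ceil(map_tasks / concurrent)
    return map_tasks, concurrent, waves


GiB = 1024 ** 3
MiB = 1024 ** 2

# Hypothetical cluster: 10 GiB input, 128 MiB splits, 64 GiB RAM, 2 GiB per container.
tasks, slots, waves = estimate_map_waves(10 * GiB, 128 * MiB, 64 * GiB, 2 * GiB)
print(tasks, slots, waves)
```

Even with these numbers in hand, the time per wave still has to be measured on the actual cluster.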

kiran Sreekumar On 12 November 2014 at 05:00

The job execution time and the task execution time are available in the JobTracker web UI. I hope that is what you are looking for. The web UI is available on port 50030 of your JobTracker. If it is a YARN-based setup, the URL would be http://:8088
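On a YARN setup, the same timings can also be read programmatically: the ResourceManager serves a REST endpoint at `/ws/v1/cluster/apps` on that port, and each application entry carries an `elapsedTime` field in milliseconds. The sketch below only parses such a response; the sample JSON is made up for illustration, and the function name `elapsed_seconds` is hypothetical.

```python
import json


def elapsed_seconds(apps_json):
    """Map each application ID to its elapsed run time in seconds.

    apps_json: response body from the ResourceManager's /ws/v1/cluster/apps.
    """
    payload = json.loads(apps_json)
    # With no applications the "apps" value is null rather than an object.
    apps = (payload.get("apps") or {}).get("app", [])
    return {a["id"]: a["elapsedTime"] / 1000.0 for a in apps}


# Made-up sample response, trimmed to the fields used here.
sample = json.dumps({
    "apps": {
        "app": [
            {"id": "application_1415750000000_0001", "state": "FINISHED", "elapsedTime": 25000},
            {"id": "application_1415750000000_0002", "state": "FINISHED", "elapsedTime": 91500},
        ]
    }
})

print(elapsed_seconds(sample))
```

In practice the body would come from an HTTP GET against the ResourceManager host, for example with `urllib.request.urlopen`, with the host name filled in for your cluster.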
