I have a simple query that I'm trying to run in Hive 0.14:
select sum(tb.field1), sum(tb.field2), tb.month from dbwork.mytable tb
group by tb.month;
The table is partitioned by month.
It gets stuck on the map phase:
INFO : Map 1: -/- Reducer 2: 0/486
INFO : Map 1: -/- Reducer 2: 0/486
INFO : Map 1: -/- Reducer 2: 0/486
INFO : Map 1: -/- Reducer 2: 0/486
The logs have not been generated yet, so I'm not sure how to debug this. What's going on? Why does the task never start?
This behavior usually appears when the cluster does not have enough resources to allocate to the job. How much data are you trying to process? Check the Hadoop service statuses in Ambari if you are using Hortonworks, or in the equivalent admin dashboard if you are using another distribution.
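If no dashboard is handy, the same check can be roughly scripted against the YARN ResourceManager REST API, which exposes cluster-wide metrics at `/ws/v1/cluster/metrics`. The sketch below is only an illustration: the ResourceManager URL is a placeholder, the helper names `fetch_cluster_metrics` and `diagnose` are made up for this example, and the conditions it flags are simple guesses at what "not enough resources" might look like, not a definitive diagnosis.

```python
import json
import urllib.request


def fetch_cluster_metrics(rm_url):
    """Fetch the clusterMetrics object from the YARN ResourceManager REST API.

    rm_url is the base URL of the ResourceManager web UI
    (assumption: something like http://<rm-host>:8088).
    """
    url = rm_url.rstrip("/") + "/ws/v1/cluster/metrics"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["clusterMetrics"]


def diagnose(metrics):
    """Return a list of hints derived from a clusterMetrics dict.

    The checks are illustrative heuristics, not an official rule set.
    """
    hints = []
    if metrics.get("activeNodes", 0) == 0:
        hints.append("no active NodeManagers")
    if metrics.get("availableMB", 0) <= 0:
        hints.append("no free memory for new containers")
    if metrics.get("appsPending", 0) > 0:
        hints.append("%d application(s) waiting to be scheduled"
                     % metrics["appsPending"])
    if metrics.get("containersPending", 0) > 0:
        hints.append("%d container request(s) pending"
                     % metrics["containersPending"])
    return hints


# Example usage (placeholder host, replace with your ResourceManager):
# for hint in diagnose(fetch_cluster_metrics("http://resourcemanager.example.com:8088")):
#     print(hint)
```

An empty list does not prove the cluster is healthy; a queue capacity limit, for example, can hold a job back even when the cluster as a whole has free memory.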