I am using a Python user-defined function in Apache Hive to convert characters from lower case to upper case, and I am getting the error "Hive Runtime Error while closing operators".
Below are the queries I tried:
describe table1;
OK
item string
count int
city string
select * from table1;
aaa 1 tokyo
aaa 2 london
bbb 3 washington
ccc 4 moscow
ddd 5 bejing
From the above table, the item and city fields should be changed from lower case to upper case, and count should be incremented by 10.
Python script used:
cat caseconvert.py
import sys
import string

for line in sys.stdin:
    line = line.strip()
    item, count, city = line.split('\t')
    ITEM1 = item.upper()
    COUNT1 = count + 10
    CITY1 = city.upper()
    print '\t'.join([ITEM1, str(COUNT1), FRUIT1])
Inserting table1 data into table2:
create table table2(ITEM1 string, COUNT1 int, CITY1 string) row format delimited fields terminated by ',';
add FILE caseconvert.py
insert overwrite table table2 select TRANSFORM(item,count,city) using 'python caseconvert.py' as (ITEM1,COUNT1,CITY1) from table1;
When I execute this, I get the following error. I couldn't trace the issue. Can anyone tell me where it is going wrong?
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201508151858_0014, Tracking URL = http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201508151858_0014
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201508151858_0014
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-08-15 22:24:06,212 Stage-1 map = 0%, reduce = 0%
2015-08-15 22:25:01,559 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201508151858_0014 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201508151858_0014
Examining task ID: task_201508151858_0014_m_000002 (and more) from job job_201508151858_0014
Task with the most failures(4):
-----
Task ID:
task_201508151858_0014_m_000000
URL:
http://localhost.localdomain:50030/taskdetails.jsp?jobid=job_201508151858_0014&tipid=task_201508151858_0014_m_000000
-----
Diagnostic Messages for this Task:
java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:224)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 20003]: An error occurred when trying to close the Operator running your custom script.
at org.apache.hadoop.hive.ql.exec.ScriptOperator.close(ScriptOperator.java:488)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:570)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:5
FAILED: Execution Error, return code 20003 from org.apache.hadoop.hive.ql.exec.MapRedTask. An error occurred when trying to close the Operator running your custom script.
MapReduce Jobs Launched:
Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
In the last line of your Python script, where you print the output to STDOUT, you call FRUIT1 without having defined it. This should be CITY1. You have also imported string but not used it. I'd write the script a bit differently:
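Something along these lines (a minimal sketch; note that count arrives on stdin as a string, so it must be cast with int() before adding 10):

import sys

for line in sys.stdin:
    # Hive streams each row to the script as tab-separated fields
    item, count, city = line.strip().split('\t')
    # upper-case the text fields and cast count before the arithmetic
    print '\t'.join([item.upper(), str(int(count) + 10), city.upper()])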
Then, I'd use a CREATE TABLE AS SELECT query (assuming both TABLE1 and your python script live in the HDFS):
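For example (a sketch; the table and column names mirror your example, and the script is assumed to have been shipped to the cluster with ADD FILE first):

add FILE caseconvert.py;

create table table2 as
select transform(item, count, city)
using 'python caseconvert.py'
as (item1 string, count1 int, city1 string)
from table1;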
This has worked for me. However, there is a much easier way to make the transformations you want using Hive built-in functions:
upper(string A) >>> returns the string resulting from converting all characters of A to upper case. For example, upper('fOoBaR') results in 'FOOBAR'.
And of course for count you could make it: (count + 10) AS count1.
So, TABLE2 could be created as follows:
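(A sketch using your column names; in some Hive versions the count column may need backtick quoting, since it is also a built-in function name.)

create table table2 as
select upper(item) as item1,
       count + 10 as count1,
       upper(city) as city1
from table1;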
Much less trouble than writing your custom UDF.