Hive: Difference between CREATE FUNCTION and CREATE TEMPORARY FUNCTION in Hive UDF


I am new to Hive and am working on a project where I need to create a few UDFs for data wrangling. During my research, I came across two syntaxes for creating a UDF from added jars:

CREATE FUNCTION country AS 'com.hiveudf.employeereview.Country';

CREATE TEMPORARY FUNCTION country AS 'com.hiveudf.employeereview.Country';

I cannot find any difference between these two statements. Can someone explain it to me or point me to the right material?


There are 2 best solutions below

BEST ANSWER

The main difference between CREATE FUNCTION and CREATE TEMPORARY FUNCTION is this: in Hive 0.13 or later, functions can be registered permanently in the metastore, so they can be referenced in a query without having to create a temporary function in each session.

If we use CREATE TEMPORARY FUNCTION, we have to recreate the function every time we start a new session.
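As a rough illustration of the two lifetimes, here is a minimal HiveQL sketch. The jar paths, the table employee_review, and the column country_code are hypothetical, and the UDF is assumed to take a single argument; only the function name and class come from the question. Note that a jar loaded with ADD JAR is itself session-scoped, so a permanent function usually names its jar with a USING JAR clause (or relies on the class already being on Hive's classpath).

```sql
-- Session-scoped: both the added jar and the function disappear
-- when this session ends and must be set up again next time.
ADD JAR /path/to/employeereview-udf.jar;   -- hypothetical local path
CREATE TEMPORARY FUNCTION country AS 'com.hiveudf.employeereview.Country';
SELECT country(country_code) FROM employee_review LIMIT 10;  -- hypothetical table and column

-- Remove the temporary function so the name is free in this session.
DROP TEMPORARY FUNCTION IF EXISTS country;

-- Permanent (Hive 0.13+): registered in the metastore under the current
-- database, so later sessions can call it without re-creating it.
CREATE FUNCTION country AS 'com.hiveudf.employeereview.Country'
  USING JAR 'hdfs:///path/to/employeereview-udf.jar';  -- hypothetical HDFS path
```

In a later session, country(...) should then resolve without any ADD JAR or CREATE statement, provided the jar is still available at the registered location.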

Reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/ReloadFunction

SECOND ANSWER

CREATE TEMPORARY FUNCTION creates a new function that you can use in Hive queries for as long as the session lasts. It is lighter to create because nothing has to be registered in the metastore. CREATE FUNCTION, by contrast, is permanent: the function is registered in the metastore, so it can be referenced in a query without having to create a temporary function in each session.

When to use: TEMPORARY suits intermediate helper functions that only need to compute something within the current session; their results can later be consumed by queries that use permanent functions.
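Because a permanent function outlives the session, it also has to be inspected and removed explicitly. The statements below are a small sketch of that housekeeping, reusing the function name from the question. RELOAD FUNCTION exists only in newer Hive releases, so treat that line as an assumption about your version.

```sql
-- Show what Hive knows about the function (works for both kinds).
DESCRIBE FUNCTION country;

-- In a session that was started before the permanent function was created,
-- refresh the function registry (assumes a Hive version that supports RELOAD).
RELOAD FUNCTION;

-- A temporary function goes away on its own at session end,
-- but a permanent one stays in the metastore until it is dropped.
DROP FUNCTION IF EXISTS country;
```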