I have Spark 1.5.0 running on a cluster. I want to use Hive UDFs from ESRI's API. I can use these APIs in a Spark application, but due to some issues in my cluster I am not able to use HiveContext. I want to use the existing Hive UDFs in a Spark SQL application.
// val sqlContext = new SQLContext(sc)
// import sqlContext.implicits._
// val hc = new HiveContext(sc)
// hc.sql("create temporary function ST_Point as 'com.esri.hadoop.hive.ST_Point'")
// hc.sql("create temporary function ST_Within as 'com.esri.hadoop.hive.ST_Within'")
// hc.sql("create temporary function ST_Polygon as 'com.esri.hadoop.hive.ST_Polygon'")
// val resultDF = hc.sql("select ST_Within(ST_Point(2, 3), ST_Polygon(1,1, 1,4, 4,4, 4,1))")
The above code is for HiveContext, but I want to do something similar using only SQLContext, so I wrote the following:
sqlContext.sql("""create function ST_Point as 'com.esri.hadoopcom.esri.hadoop.hive.ST_Point'""")
But it seems I am getting the same kind of error (see below):
Exception in thread "main" java.lang.RuntimeException: [1.1] failure: ``with'' expected but identifier create found
create function ST_Point as 'com.esri.hadoop.hive.ST_Point'
^
at scala.sys.package$.error(package.scala:27)
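As far as I understand, the plain SQLContext in Spark 1.5 uses a simple SQL parser that does not support the "create function" DDL at all; that statement only works through HiveContext. The only registration path SQLContext offers is sqlContext.udf.register with a Scala function. A minimal sketch with a hypothetical squared function, just to show the mechanism that does parse:

// Registering a plain Scala function works with SQLContext;
// "create [temporary] function" is Hive-only syntax.
sqlContext.udf.register("squared", (x: Int) => x * x)
sqlContext.sql("select squared(4)").show() // 16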
I also tried to register the existing UDFs directly, but it seems a Scala wrapper is needed to call the Java classes, so I tried the following:
import com.esri.hadoop.hive.{ST_Point, ST_Polygon, ST_Within}

// Wrap each Java UDF class in a Scala def and register it with SQLContext
def ST_Point_Spark = new ST_Point()
sqlContext.udf.register("ST_Point_Spark", ST_Point_Spark _)

def ST_Within_Spark = new ST_Within()
sqlContext.udf.register("ST_Within_Spark", ST_Within_Spark _)

def ST_Polygon_Spark = new ST_Polygon()
sqlContext.udf.register("ST_Polygon_Spark", ST_Polygon_Spark _)

sqlContext.sql("select ST_Within_Spark(ST_Point_Spark(2, 3), ST_Polygon_Spark(1,1, 1,4, 4,4, 4,1))")
but in this case I get this error:
Exception in thread "main" scala.reflect.internal.Symbols$CyclicReference: illegal cyclic reference involving object InterfaceAudience
at scala.reflect.internal.Symbols$Symbol$$anonfun$info$3.apply(Symbols.scala:1220)
at scala.reflect.internal.Symbols$Symbol$$anonfun$info$3.apply(Symbols.scala:1218)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
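My suspicion is that the CyclicReference comes from Scala reflection trying to derive type information for the Hadoop classes (such as the InterfaceAudience annotation) referenced by the UDF signatures. A sketch of what I plan to try next: wrap each Hive UDF in a Scala function that only exposes primitive types, so that reflection never touches the Hadoop types. The evaluate() signatures below are my assumption from reading the ESRI source (ST_Point taking two DoubleWritables and returning a BytesWritable; ST_Within taking two BytesWritables and returning a BooleanWritable) and would need to be verified against the jar:

import org.apache.hadoop.hive.serde2.io.DoubleWritable
import org.apache.hadoop.io.{BooleanWritable, BytesWritable}
import com.esri.hadoop.hive.{ST_Point, ST_Within}

// Only Double / Array[Byte] / Boolean cross the UDF boundary, so
// Scala reflection never sees the Hadoop writable types.
sqlContext.udf.register("ST_Point_Spark", (x: Double, y: Double) => {
  val udf = new ST_Point() // instantiated inside the closure to avoid serialization issues
  udf.evaluate(new DoubleWritable(x), new DoubleWritable(y)).copyBytes()
})

sqlContext.udf.register("ST_Within_Spark", (inner: Array[Byte], outer: Array[Byte]) => {
  val udf = new ST_Within()
  udf.evaluate(new BytesWritable(inner), new BytesWritable(outer)).get()
})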
I am just wondering: is there any way to call a Hive/Java UDF without using HiveContext, directly with SQLContext? Note: This was a helpful post, but it does not match my requirement.