I have an employee hive table with column Name , Department , City and i want to retrieve the data based on names of the employee using IN operation in HiveContext.sql() function but it is throwing pyspark.Analysis Exception.please look at example below.
employee table:
Name Department City
Ram FDE Mumbai
Ramesh CTZ Pune
Suraj FDE Chennai
Varun CTZ Delhi
Query :
SELECT * from employee WHERE Name in ('Ramesh' , 'Varun')
code snippet from spark program:
namesList= ['Ramesh' , 'Varun']
data = HiveContext.sql('SELECT * from employee WHERE Name in ({namesList})'.format(namesList = namesList))
I tried to modify and pass the strings instead of list but the error remains same
Error:pyspark.AnalysisException : structType field
please help me on this and suggest me if i'm doing something wrong here.
You should get rid of square brackets from the python list when creating the query -
str(namesList)[1:-1]