PySpark query against Elasticsearch gets stuck on union of two DataFrames


We are using PySpark to query Elasticsearch.

We have 2 indexes:

index1 - with 20 docs

index2 - with 100000 docs

dataframe1 is a join between 2 dataframes:

dataframe3 - queries index1 (returns 1 row on dataframe3.collect())

dataframe4 - queries index2 (returns 1 row on dataframe4.collect())

dataframe1 = dataframe3.join(dataframe4). When I call dataframe1.collect(), it returns 1 row immediately.

dataframe2 queries index1 with a different query (returns 1 row on dataframe2.collect()).

When I run

dataframe1.union(dataframe2).collect()

it gets stuck.
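
For reference, the setup described above could be sketched roughly as follows. This is only an illustration under assumptions: the actual queries are not shown in the question, so QUERY_2, QUERY_3 and QUERY_4 are match_all placeholders, and build_frames is a hypothetical helper name. It assumes a SparkSession that already has the elasticsearch-spark connector on its classpath and the Elasticsearch connection settings configured.

```python
# Placeholder queries (assumption): the real queries are not given in the question.
QUERY_2 = '{"query": {"match_all": {}}}'
QUERY_3 = '{"query": {"match_all": {}}}'
QUERY_4 = '{"query": {"match_all": {}}}'

# Data source name of the elasticsearch-spark connector for Spark SQL.
ES_FORMAT = "org.elasticsearch.spark.sql"


def build_frames(spark):
    """Hypothetical helper: build dataframe1 and dataframe2 as described in the question.

    `spark` is expected to be an existing SparkSession.
    """

    def read_index(index, query):
        # Reading is lazy: nothing is fetched from Elasticsearch until an action
        # such as collect() is called on the resulting DataFrame.
        return (
            spark.read.format(ES_FORMAT)
            .option("es.query", query)
            .load(index)
        )

    dataframe3 = read_index("index1", QUERY_3)
    dataframe4 = read_index("index2", QUERY_4)
    dataframe2 = read_index("index1", QUERY_2)

    # join() without a condition, as written in the question
    # (a Cartesian product of the two sides).
    dataframe1 = dataframe3.join(dataframe4)
    return dataframe1, dataframe2
```

With this, dataframe1, dataframe2 = build_frames(spark) followed by dataframe1.union(dataframe2).explain() would print the plan of the union without executing it, which may help when comparing it against the case without dataframe4.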

What is very strange is that when I don't use dataframe4 in the join, everything works fine.

I am using elasticsearch-spark-30_2.12-8.9.0 with Elasticsearch 8.9.2.

Please help.
