Spark XML API - Text between tags


Using Spark XML, I'm trying to extract the text that appears between two child elements inside a root element. For example:

<a>
  <b>x</b> cannot be seen <b>y</b> neither
</a>

I would like to get the text that sits between and after the b elements (here, "cannot be seen" and "neither").

Below is the code that I have tried:

val dfTagA = spark.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "tagA")
  .load("in/test.xml")
dfTagA.printSchema()
dfTagA.show(false)

Any ideas on how to get this text, ideally through a custom schema with the Spark XML library?

Thanks
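
For reference, here is a minimal sketch of the extraction itself, done outside of Spark XML. It uses the JDK's built-in DOM parser to collect the text nodes that are direct children of the root element. The object name `MixedContentText` and the method `directText` are made up for illustration, and whether Spark XML can expose this interleaved text through a schema may depend on the library version, so this is a possible fallback rather than the library's own method.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Node
import org.xml.sax.InputSource

// Hypothetical helper (not part of Spark XML): pulls out the text that sits
// directly under the root element, skipping text nested inside child elements.
object MixedContentText {
  def directText(xml: String): Seq[String] = {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    val root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement
    val children = root.getChildNodes
    (0 until children.getLength)
      .map(i => children.item(i))
      .filter(_.getNodeType == Node.TEXT_NODE) // keep only text directly under the root
      .map(_.getTextContent.trim)              // drop surrounding whitespace and newlines
      .filter(_.nonEmpty)                      // skip whitespace-only nodes
  }
}

object Demo extends App {
  val sample = "<a>\n  <b>x</b> cannot be seen <b>y</b> neither\n</a>"
  MixedContentText.directText(sample).foreach(println)
}
```

If the raw XML of each row is available as a string column (an assumption; for example, after reading whole files as text instead of through Spark XML), a function like this could be wrapped in a Spark UDF and applied per row.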
