Extracting HTML tags with specific attributes in GREL

I can easily extract a tag the first time it appears:

<skos:prefLabel> Espitaleta, Lina </skos:prefLabel>

And every time it appears:

<skos:prefLabel> Espitaleta, Lina </skos:prefLabel>
<skos:prefLabel xml:lang="en-US"> Espitaleta, Lina </skos:prefLabel>
<skos:prefLabel xml:lang="fr-FR"> Lina Espitaleta </skos:prefLabel>

But how do I extract only those tags with a specific attribute?

<skos:prefLabel xml:lang="fr-FR"> Lina Espitaleta </skos:prefLabel>

Thanks

Best answer

I'm guessing, based on your example, that you are looking for a specific attribute value, not just an attribute.

GREL's HTML functions use jsoup internally, so jsoup's selector syntax is the place to look to work out how to do this. Something along the lines of:

value.parseHtml().select(your selector here)

should get you what you want. One nuance is that namespaces are written differently in tag names and attribute names: the selector uses a pipe in the tag name (skos|prefLabel) but keeps the colon in the attribute name (xml:lang). So you'll want something like:

value.parseHtml().select('skos|prefLabel[xml:lang="fr-FR"]')

If you really did mean any tag that has the attribute, regardless of its value, you can simplify it to just:

value.parseHtml().select('skos|prefLabel[xml:lang]')
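
Note that select() returns an array of elements rather than text, so you will usually want one more step to get a cell value. A minimal sketch, assuming you want the text of every matching label joined into one string (the "; " separator and the variable name e are arbitrary choices, not anything your data requires):

forEach(value.parseHtml().select('skos|prefLabel[xml:lang="fr-FR"]'), e, e.htmlText()).join("; ")

If you only expect one match, indexing the first element should also work, though it will error on cells where nothing matches:

value.parseHtml().select('skos|prefLabel[xml:lang="fr-FR"]')[0].htmlText()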