Extracting text with XPath from an element that contains different nodes


I'm currently trying to extract some text from a website with XPath and RapidMiner. I want to extract the "270 €" from the following code:

<dd class="grid-item three-fifths"> 
<span class="is1-operator">+</span> 
270 € 
</dd>

I tried the following, which didn't work:

//h:dd[@class='grid-item three-fifths']//text()

Thanks for your help :)


There are 2 best solutions below


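Your XPath returns 3 text nodes:

  1. whitespace only (the line break between the opening dd tag and the span)
  2. "+"
  3. "270 €" (with surrounding whitespace)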
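
The three nodes can be reproduced outside RapidMiner. Below is a minimal Python sketch, assuming the snippet from the question is the whole input and parses as well-formed XML. It uses the standard library's ElementTree, whose itertext() walks the text in document order, as a stand-in for //text(); it does not evaluate XPath itself, and the names SNIPPET and text_nodes are only illustrative.

```python
import xml.etree.ElementTree as ET

# Snippet from the question; the exact whitespace on the real page may differ.
SNIPPET = (
    '<dd class="grid-item three-fifths">\n'
    '<span class="is1-operator">+</span>\n'
    '270 \u20ac\n'
    '</dd>'
)

dd = ET.fromstring(SNIPPET)

# itertext() yields every piece of text below <dd> in document order,
# which corresponds to what //text() selects under that element.
text_nodes = list(dd.itertext())

for index, node in enumerate(text_nodes, start=1):
    # repr() makes whitespace-only nodes visible.
    print(index, repr(node))
```

The first node is only a line break, which is why it looks empty, and the price node carries line breaks on both sides as well.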

Try the XPath below to fetch only "270 €":

//h:dd[@class='grid-item three-fifths']/text()[string-length() > 0]

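As mentioned in the previous answer, a string-length filter can be used, but [string-length() > 0] still matches every text node, because the line break ('enter') and the '+' each contain a character.

[string-length() > 1] should work as long as the whitespace-only node is a single character. If it may also contain spaces or indentation, [string-length(normalize-space()) > 1] is more robust, since normalize-space() trims the surrounding whitespace before the length is measured.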
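
To see how the length threshold behaves, here is a small Python sketch under the same assumption that the question's snippet is the whole input. It emulates the predicates with len() and a hand-written normalize_space() helper (a hypothetical name that only approximates XPath's normalize-space()); it does not run XPath. With different whitespace on the real page, the raw lengths would change, which is the case the normalized variant is meant to cover.

```python
import xml.etree.ElementTree as ET

# Snippet from the question; the exact whitespace on the real page may differ.
SNIPPET = (
    '<dd class="grid-item three-fifths">\n'
    '<span class="is1-operator">+</span>\n'
    '270 \u20ac\n'
    '</dd>'
)

dd = ET.fromstring(SNIPPET)
text_nodes = list(dd.itertext())  # stand-in for //text() under <dd>


def normalize_space(value):
    # Approximates XPath normalize-space(): trim and collapse whitespace runs.
    return " ".join(value.split())


# Emulates [string-length() > 0]: nothing is filtered out.
longer_than_0 = [t for t in text_nodes if len(t) > 0]

# Emulates [string-length() > 1]: relies on the other nodes being 1 character.
longer_than_1 = [t for t in text_nodes if len(t) > 1]

# Emulates [string-length(normalize-space()) > 1]: ignores padding whitespace.
normalized_longer_than_1 = [
    t for t in text_nodes if len(normalize_space(t)) > 1
]

print(len(longer_than_0), len(longer_than_1), len(normalized_longer_than_1))
print([normalize_space(t) for t in normalized_longer_than_1])
```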

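If you are sure about the item's position (in this case it is the 3rd text node), wrap the path in parentheses so that [3] applies to the whole result; without them, //text()[3] looks for the third text child of each element and finds nothing here:

(//dd[@class='grid-item three-fifths']//text())[3]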

If you are sure it is always the last item:

//dd[@class='grid-item three-fifths']/text()[last()]

You can get the text node that follows the span inside the dd:

//dd[@class='grid-item three-fifths']//span/following-sibling::text()

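Look for the euro sign:

//dd/text()[contains(.,'€')]

In RapidMiner, the element names in these expressions may need the h: prefix used in the question (h:dd, h:span).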
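
As a rough cross-check that these alternatives point at the same node, here is a Python sketch using only the standard library. ElementTree cannot evaluate these XPath expressions, so each one is emulated by hand, and the variable names are hypothetical; a library with full XPath 1.0 support could run the expressions directly instead. The snippet from the question is again assumed to be the whole input.

```python
import xml.etree.ElementTree as ET

# Snippet from the question; the exact whitespace on the real page may differ.
SNIPPET = (
    '<dd class="grid-item three-fifths">\n'
    '<span class="is1-operator">+</span>\n'
    '270 \u20ac\n'
    '</dd>'
)

dd = ET.fromstring(SNIPPET)
span = dd.find("span")

# Direct text children of <dd>: its leading text plus the tail of each child.
candidates = [dd.text] + [child.tail for child in dd]
direct_text = [t for t in candidates if t]

# Emulates (//dd//text())[3]: third text node in document order.
by_position = list(dd.itertext())[2]

# Emulates //dd/text()[last()]: last direct text child of <dd>.
last_direct = direct_text[-1]

# Emulates //dd//span/following-sibling::text(): text right after </span>.
after_span = span.tail

# Emulates //dd/text()[contains(., '€')]: direct text containing the sign.
with_euro = [t for t in direct_text if "\u20ac" in t]

# The node still carries line breaks, so trim it before further use.
price = after_span.strip()
print(price)
```

All four emulations yield the same node here, and strip() removes the surrounding line breaks, leaving "270 €".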