I'm writing a Google App Engine project in Python. I need to scrape banks' sites and get the exchange rates from them.
An example of the HTML:
<tr>
<td width="2"><img src="./images/zero.gif" width="2" height="2" border="0" /></td>
<td width="41" class="curvalsh" align="left" valign="middle"><font color="#DC241F">USD</font></td>
<td width="41" class="curvalsh" align="right" valign="middle"><b> 15.20 </b></td>
<td width="4" align="left" valign="middle"><img src="./images/zero.gif" width="2" height="20" border="0" hspace="1"></td>
<td width="41" class="curvalsh" align="right" valign="middle"><b> 16.00 </b></td>
<td width="4" align="left" valign="middle"><img src="./images/zero.gif" width="2" height="20" border="0" hspace="1"></td>
<td width="41" class="curvalsh" align="right" valign="middle"> - </td>
<td width="2" align="left" valign="middle"><img src="./images/zero.gif" width="2" height="20" border="0" hspace="1"></td>
</tr>
I need to get the next two tags that contain text after the tag containing the "USD" text (the tags with 15.20 and 16.00).
What I've already done is:
xpath = "//tr/td[text()='USD']/following-sibling::td/text()"
But this doesn't return anything, and it is not exactly what I need anyway: I have to specify that I want the 2 tags containing text after the "USD" tag, because there are also tags which don't contain any text.
EDIT:
I've also tried it like this; it still returns nothing:
xpath = "//tr/td[text()='USD']/following-sibling::td[matches(text(),'(^|\W)[0-9]+.[0-9]+($|\W)','i')]/text()"
Notice that there is another tag inside the `td` before you get to the searched text, so `td[text()='USD']` matches nothing and you have to search for that inner tag instead. In any case you will then go one level up using `..`, much like when browsing the file system.
Well, and there is another tag hiding in the value cells too. You can refer to it directly using `b/text()`, or take all text under the next siblings with `//text()`. This is how it might look:
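A minimal sketch of both variants, assuming the page is parsed with `lxml` (the question does not say which library is used). The wrapping `<table>` and the names `SNIPPET`, `rates` and `all_text` are made up for illustration; the row is the one from the question.

```python
from lxml import html

# The row from the question, wrapped in a <table> so it parses as a fragment.
SNIPPET = """
<table>
<tr>
<td width="2"><img src="./images/zero.gif" width="2" height="2" border="0" /></td>
<td width="41" class="curvalsh" align="left" valign="middle"><font color="#DC241F">USD</font></td>
<td width="41" class="curvalsh" align="right" valign="middle"><b> 15.20 </b></td>
<td width="4" align="left" valign="middle"><img src="./images/zero.gif" width="2" height="20" border="0" hspace="1"></td>
<td width="41" class="curvalsh" align="right" valign="middle"><b> 16.00 </b></td>
<td width="4" align="left" valign="middle"><img src="./images/zero.gif" width="2" height="20" border="0" hspace="1"></td>
<td width="41" class="curvalsh" align="right" valign="middle"> - </td>
<td width="2" align="left" valign="middle"><img src="./images/zero.gif" width="2" height="20" border="0" hspace="1"></td>
</tr>
</table>
"""

tree = html.fromstring(SNIPPET)

# Variant 1: match the inner <font>, step up to its <td> with "..",
# then keep only the following cells that wrap their value in <b>.
xpath_b = "//td/font[text()='USD']/../following-sibling::td/b/text()"
rates = [t.strip() for t in tree.xpath(xpath_b)]
buy, sell = rates[:2]

# Variant 2: take every text node under the following cells and
# drop whitespace-only nodes in Python instead of in XPath.
xpath_all = "//td/font[text()='USD']/../following-sibling::td//text()"
all_text = [t.strip() for t in tree.xpath(xpath_all) if t.strip()]

print(buy, sell)
print(all_text)
```

The first variant skips the `-` cell because it has no `<b>` child. The second keeps it, so you would still need to slice off the first two entries or filter for numeric values.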