An external application adds <br /> tags to my HTML table data when I copy and paste the data into a cell from Notepad:
<tr>
<td>3.7.4</td>
<td>12133<br />43434<br />65465<br />66656</td>
<td>test</td>
</tr>
I am unable to parse this using lxml. The entire HTML is below:
<table class="j-table jiveBorder" style="border: 1px solid #c6c6c6;" width="100%">
<thead>
<tr style="background-color: #efefef;">
<th>Header 1</th>
<th>Header 2</th>
<th>Header 3</th>
</tr>
</thead>
<tbody>
<tr>
<td>3.7.4</td>
<td>12133<br />43434<br />65465<br />66656</td>
<td>test</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
This is the code I used to parse the data:
for tr in table.findall('.//tbody/tr'):
    data = []
    data.append([x.text for x in tr[1]])
    print(data)
The code works perfectly when there are no break tags and every value is wrapped in its own tag, e.g. <p>12133</p>.
You can use itertext()
return
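A minimal sketch of the itertext() approach, not a definitive implementation. Iterating over tr[1] visits the <br /> child elements, whose .text is None; the numbers live in the cell's own .text and in the .tail of each <br />. itertext() walks both. The sketch assumes the markup is well-formed enough for etree.fromstring, as the sample above is, and the names html and rows are illustrative only. It falls back to the standard library's ElementTree, which offers the same itertext() method, if lxml is not installed.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: the stdlib ElementTree is an acceptable stand-in here,
    # since it exposes the same fromstring/findall/itertext calls.
    import xml.etree.ElementTree as etree

# Trimmed copy of the table from the question (illustrative input).
html = """<table>
<tbody>
<tr>
<td>3.7.4</td>
<td>12133<br />43434<br />65465<br />66656</td>
<td>test</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>"""

table = etree.fromstring(html)

rows = []
for tr in table.findall('.//tbody/tr'):
    cell = tr[1]  # second <td> of the row
    # itertext() yields the cell's own text plus the tail text that
    # follows each <br />, so values separated by breaks are all found.
    rows.append([text.strip() for text in cell.itertext() if text.strip()])

print(rows)
```

An empty cell produces an empty list, so the blank second row does not raise an error.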