I am building automation to handle the email alerts we receive. The final step is to extract the username involved in the alert, and from my research it looked like this should be fairly easy to pull out of the original email. Below is a snippet of the HTML I am trying to extract the value from.
HTML Snippet:
<tr>
<td style="border:solid #DBDCDC 1.0pt;padding:3.75pt 3.75pt 3.75pt 3.75pt">
<p class="MsoNormal" align="right" style="text-align:right">
<span style="font-size:9.0pt;color:black">source_username
</span>
<span style="font-size:9.0pt">
<o:p></o:p>
</span>
</p>
</td>
<td width="100%" style="width:100.0%;border:solid #DBDCDC 1.0pt;border-left:none;background:#FAFAFA;padding:3.75pt 3.75pt 3.75pt 3.75pt;max-width:100%">
<p class="MsoNormal">
<span style="font-size:9.0pt;color:black">ServicePrincipal_64e90aaf-abe7-4fa8-b0f7-a56db5a780bc
</span>
<span style="font-size:9.0pt">
<o:p></o:p>
</span>
</p>
</td>
</tr>
I have made two attempts, one with Parsel and one with BeautifulSoup, and neither of them worked.
Parsel attempt:
from parsel import Selector

sel = Selector(text=html)
# Find the td tag that contains the 'source_username' string
source_tag = sel.xpath('//td[contains(.//text(), "source_username")]')[0]
print(source_tag)
# Extract the value from the tag
source_username = source_tag.xpath('./following-sibling::td[1]//text()').get().strip()
print(source_username)
BeautifulSoup attempt:
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
# Find the tag that contains the source_username
source_tag = soup.find('td', string='source_username')
print(source_tag)
# Extract the value from the tag
source_username = source_tag.find_next_sibling('td').text.strip()
print(source_username)
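In the BeautifulSoup attempt, `find('td', string='source_username')` most likely returns `None`. The `string=` filter is matched against the tag's `.string`, which is `None` for a tag with several children, as this `td` has. The label text also carries a trailing newline, so an exact match would miss it anyway. Below is a minimal sketch that matches on the cell's stripped text instead, again assuming the label cell holds only `source_username` and using a trimmed copy of the snippet.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the snippet above (styles removed), wrapped in <table>.
html = """
<table><tr>
<td>
<p class="MsoNormal" align="right">
<span>source_username
</span>
<span><o:p></o:p></span>
</p>
</td>
<td width="100%">
<p class="MsoNormal">
<span>ServicePrincipal_64e90aaf-abe7-4fa8-b0f7-a56db5a780bc
</span>
<span><o:p></o:p></span>
</p>
</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

def is_label_cell(tag):
    # Compare the cell's combined, whitespace-stripped text with the label.
    return tag.name == "td" and tag.get_text(strip=True) == "source_username"

label_cell = soup.find(is_label_cell)

# find_next_sibling skips the whitespace strings between the two cells.
value_cell = label_cell.find_next_sibling("td")
source_username = value_cell.get_text(strip=True)
print(source_username)
```

If the label might be missing from some alerts, check `label_cell` for `None` before calling `find_next_sibling`.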
If you have access to the HTML, you can add a data attribute holding the desired value, something like this:
and then read the value using JS like this:
If you don't have access to the HTML, try something like this:
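When the markup cannot be changed, one option is to read the table as it arrives and turn every label/value row into a dictionary entry. The sketch below uses only the standard library. It assumes each alert field is a `<tr>` with exactly two `<td>` cells and no nested tables inside those cells. `RowCollector` and `fields_from_html` are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser

class RowCollector(HTMLParser):
    """Collect the text of each <td>, grouped by the <tr> it belongs to."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell texts per <tr>
        self._cell = None   # text chunks of the <td> being read, else None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the chunks and collapse all runs of whitespace.
            self.rows[-1].append(" ".join("".join(self._cell).split()))
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

def fields_from_html(html):
    """Return {label: value} for every row that has exactly two cells."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return {row[0]: row[1] for row in parser.rows if len(row) == 2}

# Trimmed copy of the snippet from the question, for illustration only.
html = """
<table><tr>
<td><p class="MsoNormal"><span>source_username
</span><span><o:p></o:p></span></p></td>
<td><p class="MsoNormal"><span>ServicePrincipal_64e90aaf-abe7-4fa8-b0f7-a56db5a780bc
</span><span><o:p></o:p></span></p></td>
</tr></table>
"""

fields = fields_from_html(html)
print(fields.get("source_username"))
```

Collecting every row at once means other alert fields can be looked up from the same dictionary without writing a new query for each one.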