Below is the sample content:
<WKEXT-META-ATTRS>
<WKEXT-META-ATTR NAME="uri" VALUE="http://sample.com/ceres/wk-us/Concept/i8148" DATA-TYPE="OTHER"></WKEXT-META-ATTR></WKEXT-META-ATTRS></WKEXT-META-OBJECT>
<WKEXT-META-OBJECT NAME="UNIONREPINFO" ID="ext-met-0005" PUBLISHER-URI="http://wk-us.com/meta/publishers/#CCH">
<WKEXT-META-ATTRS>
<WKEXT-META-ATTR NAME="UnionRep" VALUE="Jim Gookins" DATA-TYPE="OTHER"></WKEXT-META-ATTR></WKEXT-META-ATTRS></WKEXT-META-OBJECT>
<WKEXT-META-OBJECT NAME="TOPICALSUBJECTINFO" ID="ext-met-0006" PUBLISHER-URI="http://sample.com/meta/publishers/#CCH">
<WKEXT-META-ATTRS>
<WKEXT-META-ATTR NAME="uri" VALUE="http://sample.com/ceres/sample/Concept/i8173" DATA-TYPE="OTHER"></WKEXT-META-ATTR></WKEXT-META-ATTRS></WKEXT-META-OBJECT>
<WKEXT-META-OBJECT NAME="TOPICALSUBJECTINFO" ID="ext-met-0007" PUBLISHER-URI="http:/sample/meta/publishers/#CCH">
I want to extract the VALUE of the uri attribute, for example "http://sample.com/ceres/wk-us/Concept/i8141".
I am currently trying the code below:
from bs4 import BeautifulSoup

with open("sample.sgm", "r") as f:
    contents = f.read()
soup = BeautifulSoup(contents, 'lxml')
s = soup.find('wkext-meta-attr').attrs
#for a in s:
# t = a.attrs
# for key,value in t.items():
# alias_text.append(t['normval'])
#print(alias_text)
#df = DataFrame(alias_text, columns=['arbitratorname'])
#s_topic=soup.find('WKEXT-META-ATTRS'=
print(s)
I am not able to figure out how to obtain the exact value. Any help would be much appreciated!
If you want to retrieve the value of each `wkext-meta-attr`, you can use the `.findAll()` method and then loop through each element. Check whether the code below fulfills your task:
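A minimal sketch of that approach, not necessarily the exact code originally posted. It assumes an inline fragment in place of `sample.sgm`, a hypothetical helper named `extract_uri_values`, and the built-in `html.parser` (the `'lxml'` parser from the question should behave the same way for this fragment). It uses `find_all()`, the current name for `findAll()`. Both parsers lowercase tag and attribute names, so the lookups use `name` and `value`:

```python
from bs4 import BeautifulSoup

# Inline stand-in for the contents of sample.sgm (assumption, for a self-contained demo).
SAMPLE = """
<WKEXT-META-ATTRS>
<WKEXT-META-ATTR NAME="uri" VALUE="http://sample.com/ceres/wk-us/Concept/i8148" DATA-TYPE="OTHER"></WKEXT-META-ATTR></WKEXT-META-ATTRS>
<WKEXT-META-OBJECT NAME="UNIONREPINFO" ID="ext-met-0005">
<WKEXT-META-ATTRS>
<WKEXT-META-ATTR NAME="UnionRep" VALUE="Jim Gookins" DATA-TYPE="OTHER"></WKEXT-META-ATTR></WKEXT-META-ATTRS></WKEXT-META-OBJECT>
"""


def extract_uri_values(markup):
    """Return the VALUE of every wkext-meta-attr whose NAME is "uri" (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    values = []
    # find_all() returns every matching tag; find() would return only the first one.
    for tag in soup.find_all("wkext-meta-attr"):
        # Attribute names are lowercased by the parser; attribute values are kept as-is.
        if tag.get("name") == "uri" and tag.get("value") is not None:
            values.append(tag.get("value"))
    return values


uris = extract_uri_values(SAMPLE)
print(uris)  # ['http://sample.com/ceres/wk-us/Concept/i8148']
```

To run this against the real file, pass the `contents` string read from `sample.sgm` to the helper instead of `SAMPLE`.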