I need to extract the uid from a .sgm file. I tried the code below, but it doesn't work. Can anybody help?
Sample .sgm file content:
<miscdoc n='1863099' uid='0001863099_20220120' type='seccomlett' t='frm' mdy='01/20/2022'><rname>Kimbell Tiger Acquisition Corp, 01/20/2022</rname>
<table col='2' type='txt'>
<colspec col='1' colwidth='*'>
<colspec col='2' colwidth='2*'>
<tname>Meta-data</tname>
<tbody>
<row><entry>SEC-HEADER</entry><entry>0001104659-22-005920.hdr.sgml : 20220304</entry></row>
<row><entry>ACCEPTANCE-DATETIME</entry><entry>20220120160231</entry></row>
<row><entry>PRIVATE-TO-PUBLIC</entry></row>
<row><entry>ACCESSION-NUMBER</entry><entry>0001104659-22-005920</entry></row>
<row><entry>TYPE</entry><entry>CORRESP</entry></row>
<row><entry>PUBLIC-DOCUMENT-COUNT</entry><entry>1</entry></row>
<row><entry>FILING-DATE</entry><entry>20220120</entry></row>
<row><entry>FILER</entry></row>
Code I tried:
import os
# Folder Path
path = "Enter Folder Path"
# Change the directory
os.chdir(path)
# Read text File
def read_file(file_path):
    with open(file_path, 'r') as f:
        print(f.read())
# iterate through all file
for file in os.listdir():
    # Check whether file is in text format or not
    if file.endswith(".sgm"):
        if 'uid' in file:
            print("true")
        file_path = f"{path}\{file}"
        # call read text file function
        read_file(file_path)
I need to extract the uid value from the above .sgm file. Is there any other way I could do this? What should I change in my code?
SGM format may just be an XML superset. If it isn't, then for this particular case (and assuming one can rely on the format being as shown in the question):
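A minimal sketch of one way to do it, assuming the uid always appears as an attribute of the opening <miscdoc> tag as in the sample. Note that the code in the question tests 'uid' against the file name rather than the file's contents, which is why it never matches. The names extract_uid, extract_uids and UID_PATTERN are made up for this illustration, and the regular expression is a stand-in for a real SGML parser.

```python
import re
from pathlib import Path

# Assumption: the uid is an attribute of the opening <miscdoc ...> tag,
# quoted with either single or double quotes, as in the sample file.
UID_PATTERN = re.compile(r"<miscdoc\b[^>]*?\buid=(['\"])(.*?)\1")


def extract_uid(text):
    """Return the uid of the first <miscdoc> tag in text, or None if absent."""
    match = UID_PATTERN.search(text)
    return match.group(2) if match else None


def extract_uids(folder):
    """Map each .sgm file name in folder to its uid (None when not found)."""
    return {
        p.name: extract_uid(p.read_text(errors="replace"))
        for p in sorted(Path(folder).glob("*.sgm"))
    }


# Opening tag copied from the sample in the question.
sample = ("<miscdoc n='1863099' uid='0001863099_20220120' "
          "type='seccomlett' t='frm' mdy='01/20/2022'>")
print(extract_uid(sample))
```

Calling extract_uids("Enter Folder Path") would then give a dict of file names to uid values for every .sgm file in that folder. Using pathlib also avoids building the path by hand with a backslash, as the question's f-string does.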