SgmlReader infinite loop on large document?

30 Views Asked by user1664043 At 18 August 2025 at 06:37

I've got a project to scrape data from the SEC EDGAR site. Part of the task is to get the meat of the whole filing, and I was testing some of that today.

I ran into this somewhat large filing (https://www.sec.gov/Archives/edgar/data/355437/000119312520189547/0001193125-20-189547.txt) that's about 110 MB.

I was breaking the package up into its constituent <DOCUMENT> nodes and processing each one differently, based on its FILENAME node value. For the types that were HTML/XML based, I just used

SgmlReader.ReadInnerXml();

to grab the innards, but on this large filing it appears to go into an infinite loop. It ran for 15 minutes before I broke in with the debugger, and it was hung on that call.
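
To make the setup concrete, here is a minimal sketch of the splitting step. It assumes the usual EDGAR full-submission layout, where each <DOCUMENT> block carries a <FILENAME> line and wraps its payload in <TEXT> ... </TEXT>. `EdgarSplitter` and `SplitDocuments` are illustrative names rather than the actual project code, and the SgmlReader call itself is left out. It only uses plain line scanning, so it would not cope with a payload that contains a literal </TEXT> line of its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: names are illustrative, not from the real project.
public static class EdgarSplitter
{
    // Yields one (FileName, Body) pair per <DOCUMENT> block in the submission.
    public static IEnumerable<(string FileName, string Body)> SplitDocuments(TextReader input)
    {
        string fileName = "";
        StringBuilder body = null; // non-null only while inside <TEXT>...</TEXT>

        while (input.ReadLine() is string line)
        {
            string trimmed = line.Trim();

            if (body != null)
            {
                if (trimmed.Equals("</TEXT>", StringComparison.OrdinalIgnoreCase))
                {
                    // End of payload: hand the document to the caller.
                    yield return (fileName, body.ToString());
                    body = null;
                }
                else
                {
                    body.AppendLine(line);
                }
            }
            else if (trimmed.Equals("<DOCUMENT>", StringComparison.OrdinalIgnoreCase))
            {
                fileName = ""; // reset per-document state
            }
            else if (trimmed.StartsWith("<FILENAME>", StringComparison.OrdinalIgnoreCase))
            {
                // Header tags such as <FILENAME> have no closing tag in this format.
                fileName = trimmed.Substring("<FILENAME>".Length).Trim();
            }
            else if (trimmed.Equals("<TEXT>", StringComparison.OrdinalIgnoreCase))
            {
                body = new StringBuilder();
            }
        }
    }
}
```

Each yielded body can then be dispatched on the extension of `FileName`, with only the HTML/XML ones going to the SGML parser. Because the method streams line by line, only one document at a time is held in memory, which also makes it easier to see which document the parser gets stuck on.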

Has anyone ever run into that before?

I'm using SgmlReader 1.8.16.

I saw a very old comment on a changelog page saying that there was such a bug with improperly terminated html comments but that was listed as fixed a good number of releases ago.

Thanks
