I would like to do some analysis on some properties listed in an upcoming auction. Unfortunately, the city running the auction does not publish the information in a structured format but instead provides a 700+ page PDF of the properties going up for auction.
I'm wondering if the community has any thoughts as to how I can approach parsing said PDF into a structured format for insertion into a db or to create a spreadsheet of the properties.
Here's an image of what each page represents:

And here's a page that lists some properties:

I'm comfortable with Python and Ruby, so I don't have any issues scripting up a solution, but because the "columns" and the data in those columns aren't necessarily tied together, it seems like this would be a dubious proposition.
Any ideas would be greatly appreciated.
Convert to text with Xpdf using the pdftotext command. I converted your file with it, passing the -layout, -f, and -l options.
This conversion leaves text exactly in its original layout (due to the -layout option). Options -f and -l indicate the first and last page numbers of the range of pages to extract.
From there, parsing should be simple: a number in column 8 indicates the first line of a record, and a blank line ends the record. Follow the guide for the exact positioning of elements within a record.
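As a rough illustration of the approach described here, the following is a minimal Python sketch, not a definitive implementation. It assumes pdftotext is on your PATH and that "a number in column 8" means a digit at the eighth character (1-based) of the line. The function names, the file name, and the page numbers are hypothetical, and the field positions inside each record still have to be taken from the guide page.

```python
import subprocess


def pdf_to_layout_text(pdf_path, first_page, last_page):
    """Run pdftotext with -layout on a page range and return the text.

    Passing "-" as the output file asks pdftotext to write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-layout",
         "-f", str(first_page), "-l", str(last_page),
         pdf_path, "-"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def split_records(text, number_col=8):
    """Group lines into records.

    Assumption: a digit in column `number_col` (1-based) starts a record,
    and a blank line ends it. Lines outside any record (page headers and
    the like) are ignored. Lines are kept unstripped so that column
    positions survive for later field slicing.
    """
    idx = number_col - 1
    records = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            # Blank line: close the open record, if any.
            if current:
                records.append(current)
            current = None
        elif len(line) > idx and line[idx].isdigit():
            # A new record starts; close the previous one if still open.
            if current:
                records.append(current)
            current = [line]
        elif current is not None:
            current.append(line)
    if current:
        records.append(current)
    return records


if __name__ == "__main__":
    # Hypothetical file name and page range, for illustration only.
    text = pdf_to_layout_text("auction.pdf", 15, 15)
    for record in split_records(text):
        print(record[0].strip())
```

Each record comes back as a list of raw lines, so individual fields can then be sliced out by character position once you have checked them against the layout in the converted text.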