Cleaning Unstructured PDF data


Raw data: I have a PDF containing the student placement details of a university. The data is completely unstructured and needs to be cleaned up before it can be processed.

The Expected CSV file output:

I tried importing the PDF from inside an Excel spreadsheet. I also tried converting it to .xlsx and then cleaning it. Both attempts still produced unstructured data.

I have no prior experience with Power Query, web queries, or data scraping.

Please suggest possible methods to clean the data and write it to a CSV file. A step-by-step procedure would be ideal, including the tools and frameworks needed to obtain the desired result.
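
As a rough illustration of the kind of cleanup being asked about, here is a minimal Python sketch. It assumes the text has already been pulled out of the PDF as plain lines (for instance with a PDF text extraction library such as pdfplumber, which is not used here) and that columns are separated by runs of two or more spaces. The function names and the sample lines are hypothetical and are not taken from the actual placement PDF.

```python
import csv
import io
import re


def rows_from_text(lines, min_columns=2):
    """Split raw text lines into cells on tabs or runs of 2+ whitespace characters.

    Lines that yield fewer than `min_columns` cells (blank lines, page
    footers, stray titles) are dropped as noise. This is an assumption
    about the layout, not a general rule for every PDF.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        cells = [cell.strip() for cell in re.split(r"\s{2,}|\t", line)]
        if len(cells) >= min_columns:
            rows.append(cells)
    return rows


def to_csv(rows):
    """Serialize rows to CSV text; the csv module handles quoting."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample lines standing in for text extracted from the PDF.
sample = [
    "Name        Branch     Company",
    "",
    "Page 1",
    "A. Student  CSE        ExampleCorp",
]

csv_text = to_csv(rows_from_text(sample))
print(csv_text)
```

Splitting on wide whitespace only works if single spaces never separate columns. A PDF with ruled tables may be better served by a table-aware extractor, and the result can be written to disk by opening a file with `newline=""` and passing it to `csv.writer`.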
