I am working on a script where a pdf gets uploaded to s3 and then runs a lambda that uses textract to perform some tasks. The script seems to work well with some pdf's, however, there are other pdfs that cause this exception.

I noticed that most times when a pdf with spaces in the name is created, it will fail. However, I added a function that changes the name to replace dashes with spaces. For some reason, this fix worked with one file, but has since failed with the rest. I'm using textract with pdfs that have boxes for entry (so not just reading stuff), and they get filled in digitally (not by hand).

I am still trying to understand why I'm getting this error and am not sure the spaces are the only cause. Does anyone have any advice as to what could be going on or how to better figure out how to fix this?

0

There are 0 best solutions below