Flattened filled PDF form is 'of invalid format' on Android, and shows blank fields in Chrome extension


I'm using pypdf (3.17.4) to fill a fillable PDF and then flatten the fields. The resulting PDF displays correctly in Acrobat Reader, but not on my Samsung S9 and not in the Chrome extension on Windows:

  • Samsung S9, tapping the link to the PDF from the Gmail app: a toast shows up saying 'Cannot display PDF ( is of invalid format)'

  • Windows 10 running Chrome: the PDF initially shows everything correctly, including the filled-out fields, but within half a second all the fields become blank. Hovering over or clicking in those fields doesn't make the values show up again, although the cursor does change from an arrow to a bar when the mouse is over the fields. 'Open in desktop app' opens the desktop Acrobat Reader app, which shows the PDF with all of the values. When I close that PDF in the desktop Acrobat Reader app, a prompt asks whether I'd like to save the changes to that PDF, even though I didn't click anywhere at all.

So there must be something I'm not understanding about the 'flattening' process. Maybe it's not accomplishing what I had hoped? Maybe 'flattening' is just one part of the necessary process, or maybe I'm misusing that term? In effect, I'd like to turn all those filled-out fields into simple background text or labels rather than fields, or do whatever else it takes to resolve the two display issues above.

Here's the code I use to generate the filled-out PDFs. The values are all filled in correctly, so the code that builds solutionDict and fieldsDict is omitted here:

from pypdf import PdfReader,PdfWriter
from pypdf.generic import NameObject,NumberObject
...
def makePDFs():

    # from https://pypdf.readthedocs.io/en/stable/user/forms.html

    reader = PdfReader(fillable_pdf)
    fields = reader.get_fields()

    for mapID in solutionDict.keys():
        # remove spaces from key names to get corresponding pdf field names
        fieldsDict={k.replace(' ',''):v for k,v in solutionDict[mapID].items()}
        fieldsDict['MAPID']=mapID
        print('building PDF for '+mapID+'...')
        writer = PdfWriter()
        writer.append(reader)
        writer.update_page_form_field_values(
            writer.pages[0],
            fieldsDict,
            auto_regenerate=False,
        )

        # flatten i.e. make the final pdf non-editable
        #  taken from https://stackoverflow.com/a/55302753/3577105
        for page in writer.pages:
            # skip pages without annotations to avoid a KeyError
            if '/Annots' not in page:
                continue
            for annot_ref in page['/Annots']:
                writer_annot = annot_ref.get_object()
                # intended to flatten all the fields by setting /Ff to 1
                # (note: bit position 1 of /Ff is the ReadOnly flag)
                writer_annot.update({
                    NameObject("/Ff"): NumberObject(1)
                })

        with open('RepeaterTest_'+str(mapID)+'.pdf', 'wb') as output_stream:
            writer.write(output_stream)
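
For reference, /Ff is the field-flags entry of a form field, and in the PDF specification its lowest bit marks the field as read-only; setting it does not by itself remove the field or turn its value into page content. The sketch below decodes the three flags that are common to all field types. The helper name decode_field_flags is made up for illustration, and this is plain standard-library Python, not part of pypdf's API.

```python
from typing import List

# Flags common to every field type, per the PDF specification.
# Bit positions are 1-based, so bit position 1 has the integer value 1.
COMMON_FIELD_FLAGS = {
    1: "ReadOnly",
    2: "Required",
    3: "NoExport",
}


def decode_field_flags(ff: int) -> List[str]:
    """Return the names of the common flags set in an /Ff value (hypothetical helper)."""
    return [
        name
        for position, name in COMMON_FIELD_FLAGS.items()
        if ff & (1 << (position - 1))
    ]


# The value written by the loop above is 1:
print(decode_field_flags(1))  # ['ReadOnly']
```

So a field with /Ff set to 1 is still an interactive form field; a viewer still has to render its value from the field's appearance data.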

Best answer:

This solution seems to work everywhere except in the Acrobat Viewer phone app's Reading mode (View settings --> Reading mode).

https://stackoverflow.com/a/73655665/3577105

Thanks to @JeremyM4n

The solution doesn't rely upon NeedAppearances, and instead adds stream objects directly. See the solution text and comments for details.
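
To give a rough idea of what "adding stream objects directly" involves: each field widget gets an appearance stream whose content draws the value with ordinary text operators, so a viewer does not have to generate it. The sketch below builds only the content bytes of such a stream. The function names, the font resource name /Helv, the font size, the text offset, and the Latin-1 encoding are all assumptions for illustration; the linked answer shows how the stream is actually created and attached with pypdf.

```python
def escape_pdf_literal(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def text_field_appearance(value: str, font: str = "/Helv", size: int = 12) -> bytes:
    """Build content for a text-field appearance stream (hypothetical helper)."""
    content = (
        "/Tx BMC\n"              # begin marked content for a text field
        "q\n"                    # save the graphics state
        "BT\n"                   # begin a text object
        f"{font} {size} Tf\n"    # select the (assumed) font resource and size
        "0 g\n"                  # black fill colour
        "2 2 Td\n"               # assumed offset from the widget's lower-left corner
        f"({escape_pdf_literal(value)}) Tj\n"  # show the field value
        "ET\n"                   # end the text object
        "Q\n"                    # restore the graphics state
        "EMC\n"                  # end marked content
    )
    # Assumption: the value fits in Latin-1; other text needs a different encoding.
    return content.encode("latin-1")


print(text_field_appearance("Repeater (test)").decode("latin-1"))
```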

Mainly, note that these imports are needed:

from pypdf import PdfReader,PdfWriter
from pypdf.generic import NameObject,NumberObject,TextStringObject,encode_pdfdocencoding
from pypdf.constants import AnnotationDictionaryAttributes,InteractiveFormDictEntries,PageAttributes,StreamAttributes,FilterTypes,FieldDictionaryAttributes,FieldFlag
from pypdf.filters import FlateDecode

Also note, as mentioned in the final comments there, that this line didn't work in my case and had to be commented out; I'm not sure whether that has any ill effects:

# writer.clone_reader_document_root(template)