Extract signature image from PDF signed with Adobe Sign

I'm trying to extract the signature image from a PDF signed with Adobe Sign. Not sure how Adobe adds this image.

I have tested this in Java with iText and PDFBox. Normally, when you traverse the PDF structure with a tool such as iText RUPS, you can find images like the visible signature and even download them directly in the tool. I didn't find one in this case. I did, however, find the XObject that I think holds the signature image.

The stream looks like this:

q
    q
        0.07843 0.45098 0.90196 RG
        0.07843 0.45098 0.90196 rg
        1 w
        q
            BT
                1 0 0 1 0 1.78 Tm
                /F1 6 Tf
                0.07843 0.45098 0.90196 rg
                ({0009002000270027002a00010018002a002d0027001f00010497000200300022000104400443047600010441043f044104420001044004410477043f044300010008000e001506d404410498}) Tj
                0 g
            ET
        Q
        0 8.28 m
            103.55 8.28 l
        S
    Q
    0 0 0 RG
    0 0 0 rg
    q
        1 0 0 1 0 0 cm
        /Xf2 Do
    Q
Q

And has a reference to Xf2 which looks like this:

1 J
1 j
1.000 w
30.315 10.580 m
    30.315 10.580 30.315 10.580 30.315 10.587 c
S
1.000 w
30.315 10.587 m
    30.315 10.594 30.315 10.608 30.315 10.627 c
S
1.000 w
30.315 10.627 m
    30.315 10.646 30.315 10.670 30.315 10.683 c
S
...

Can this be extracted as an image?


Proof of concept with PDFBox:

try (PDDocument document = PDDocument.load(pdf)) {

            PDAnnotation annotation = document.getPage(0).getAnnotations().get(0);
            PDResources resources = annotation.getPage().getResources();
            Iterator<COSName> names = resources.getXObjectNames().iterator();
            PDXObject xi3Object = resources.getXObject(names.next());
            PDStream xi3Stream = xi3Object.getStream();

// OR (alternative to the four lines above using the low-level COS API; use one variant, not both)

            COSDictionary page = (COSDictionary) document.getDocument().getObjectsByType(COSName.ANNOT).get(0)
                    .getDictionaryObject(COSName.P);
            COSDictionary resources = (COSDictionary) page.getItem(COSName.RESOURCES);
            COSDictionary xObject = (COSDictionary) resources.getItem(COSName.XOBJECT);
            COSObject xi3Object = (COSObject) xObject.getItem("Xi3");
            COSStream xi3Stream = (COSStream) xi3Object.getObject();

            PDFormXObject pdFormXObject = new PDFormXObject(xi3Stream);

            PDPage page = new PDPage(new PDRectangle(pdFormXObject.getBBox().getWidth(),
                    pdFormXObject.getBBox().getHeight()));

            try (PDDocument tempDocument = new PDDocument()) {

                PDPageContentStream contents = new PDPageContentStream(tempDocument, page);
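                // Scale the form to 75 %. setToScale() resets the whole transform, so a
                // preceding setToTranslation(0, 0) has no effect and is left out.
                AffineTransform affineTransform = AffineTransform.getScaleInstance(0.75d, 0.75d);
                pdFormXObject.setMatrix(affineTransform);
                contents.drawForm(pdFormXObject);
                // No restoreGraphicsState() here: there is no matching saveGraphicsState().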
                contents.close();
                tempDocument.addPage(page);
                PDFRenderer renderer = new PDFRenderer(tempDocument);
                BufferedImage image = renderer.renderImageWithDPI(0, 600);
                try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
                    ImageIO.write(image, "png", baos);
                    return baos.toByteArray();
                }
            }
        }

There are 2 best solutions below

Your question was: can a signature squiggle (a vector graphic) be extracted as an image?

Such a question leaves more unknowns than answers, for example: where do you need the extracted graphic to end up, and which format should it be converted into?

Here, on the left, we can see the multiple components of a signature annotation. Once we know which nested object holds the artwork (here object 14 0 obj), we can copy its stream data into another PDF, as seen on the right.

enter image description here

Since it is a screen graphic, we can copy and paste it into any graphics application and save it in any image format you wish. However, to maintain the simplicity of the source, it is best copied into an SVG editor like Inkscape.
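
If a bitmap is wanted programmatically rather than via copy and paste, the stroke data can also be rasterized with plain Java2D. The following is only a minimal sketch, assuming the stream contains nothing but the m, l, c and S operators shown in the question (plus w, J and j, which it ignores). StrokeRasterizer and its methods are hypothetical names, not part of iText or PDFBox, and the round caps and joins are hard-coded to mirror the "1 J" and "1 j" settings in the Xf2 stream.

```java
import java.awt.BasicStroke;
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.Path2D;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of iText or PDFBox. */
final class StrokeRasterizer {

    /**
     * Collects the m (move to), l (line to) and c (cubic Bezier) segments of a
     * content stream fragment in one path. All other operators (w, J, j, S, ...)
     * are ignored. Assumes every subpath starts with an m operator.
     */
    static Path2D parse(String content) {
        Path2D.Double path = new Path2D.Double();
        List<Double> operands = new ArrayList<>();
        for (String token : content.trim().split("\\s+")) {
            try {
                operands.add(Double.parseDouble(token));
                continue; // a numeric operand, keep collecting
            } catch (NumberFormatException e) {
                // not a number, so treat the token as an operator
            }
            if (token.equals("m") && operands.size() == 2) {
                path.moveTo(operands.get(0), operands.get(1));
            } else if (token.equals("l") && operands.size() == 2) {
                path.lineTo(operands.get(0), operands.get(1));
            } else if (token.equals("c") && operands.size() == 6) {
                path.curveTo(operands.get(0), operands.get(1), operands.get(2),
                        operands.get(3), operands.get(4), operands.get(5));
            }
            operands.clear();
        }
        return path;
    }

    /** Strokes the path into a transparent image; scale is pixels per PDF unit. */
    static BufferedImage render(Path2D path, double widthPt, double heightPt, double scale) {
        int w = Math.max((int) Math.floor(widthPt * scale), 1);
        int h = Math.max((int) Math.floor(heightPt * scale), 1);
        BufferedImage image = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = image.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
        // PDF user space has y pointing up, Java2D has y pointing down: flip.
        g.translate(0, h);
        g.scale(scale, -scale);
        g.setColor(Color.BLACK);
        // Line width 1 with round caps and joins, as set by "1.000 w", "1 J", "1 j".
        g.setStroke(new BasicStroke(1f, BasicStroke.CAP_ROUND, BasicStroke.JOIN_ROUND));
        g.draw(path);
        g.dispose();
        return image;
    }

    public static void main(String[] args) {
        Path2D path = parse("1 J 1 j 1.000 w 2 2 m 8 8 l S");
        BufferedImage image = render(path, 10, 10, 4);
        System.out.println(image.getWidth() + " x " + image.getHeight());
    }
}
```

The resulting BufferedImage can be written with ImageIO.write(image, "png", ...). A real signature stream may also contain operators this sketch does not handle (for example cm or colour settings), which is where a full renderer becomes the safer choice.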

Or, simpler yet, since it is now an isolated PDF object, simply convert it to SVG for forensic use in a web page, or on the boss's debit account, etc.:

mutool convert -o output.svg input.pdf
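
To give a rough idea of what such a conversion involves for this kind of artwork, here is a hand-rolled sketch (not what mutool does internally) that maps the path operators from the question's stream to SVG path data. ContentStreamToSvg is a hypothetical name, only m, l and c are translated, and the height parameter is an assumed box height used to flip the y axis, since PDF coordinates grow upwards and SVG coordinates grow downwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for illustration; handles only the m, l and c operators. */
final class ContentStreamToSvg {

    /** Translates m/l/c operators to SVG M/L/C commands, flipping y against height. */
    static String toPathData(String content, double height) {
        StringBuilder d = new StringBuilder();
        List<Double> operands = new ArrayList<>();
        for (String token : content.trim().split("\\s+")) {
            try {
                operands.add(Double.parseDouble(token));
                continue; // a numeric operand, keep collecting
            } catch (NumberFormatException e) {
                // not a number, so treat the token as an operator
            }
            if (token.equals("m") && operands.size() == 2) {
                append(d, 'M', operands, height);
            } else if (token.equals("l") && operands.size() == 2) {
                append(d, 'L', operands, height);
            } else if (token.equals("c") && operands.size() == 6) {
                append(d, 'C', operands, height);
            }
            operands.clear(); // other operators (w, J, j, S, ...) are skipped
        }
        return d.toString();
    }

    private static void append(StringBuilder d, char command, List<Double> xy, double height) {
        if (d.length() > 0) {
            d.append(' ');
        }
        d.append(command);
        for (int i = 0; i < xy.size(); i += 2) {
            d.append(String.format(Locale.ROOT, " %.3f %.3f", xy.get(i), height - xy.get(i + 1)));
        }
    }

    /** Wraps the path data in a minimal SVG document with round caps and joins. */
    static String toSvg(String content, double width, double height) {
        return "<svg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 " + width + " " + height + "\">"
                + "<path d=\"" + toPathData(content, height) + "\" fill=\"none\" stroke=\"black\""
                + " stroke-width=\"1\" stroke-linecap=\"round\" stroke-linejoin=\"round\"/></svg>";
    }

    public static void main(String[] args) {
        System.out.println(toSvg("1 J 1 j 1.000 w 2 2 m 8 8 l S", 10, 10));
    }
}
```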

enter image description here

As discussed in the comments, the AnnotationDrawer from this old answer, generalized to work for arbitrary form XObjects, could be used to render the pen signatures from Adobe Acrobat Sign signed documents to separate bitmap images.

That annotation drawer can be generalized (and ported to PDFBox 3) as follows:

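import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;

import org.apache.pdfbox.contentstream.PDContentStream;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.rendering.PageDrawer;
import org.apache.pdfbox.rendering.PageDrawerParameters;
import org.apache.pdfbox.rendering.RenderDestination;

public class ContentStreamRenderer extends PDFRenderer {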
    public ContentStreamRenderer(PDDocument document) throws NoSuchFieldException, SecurityException {
        super(document);
        pageImageField = PDFRenderer.class.getDeclaredField("pageImage");
        pageImageField.setAccessible(true);
        pageDrawerParametersConstructor = (Constructor<PageDrawerParameters>) PageDrawerParameters.class.getDeclaredConstructors()[0];
        pageDrawerParametersConstructor.setAccessible(true);
    }

    @Override
    protected PageDrawer createPageDrawer(PageDrawerParameters parameters) throws IOException {
        PageDrawer pageDrawer = new ContentStreamPageDrawer(parameters);
        pageDrawer.setAnnotationFilter(getAnnotationsFilter());
        return pageDrawer;
    }

    public BufferedImage renderImage(PDPage page, PDContentStream contentStream, float scale, ImageType imageType, RenderDestination destination) throws IOException, IllegalArgumentException, IllegalAccessException, InstantiationException, InvocationTargetException {
        try {
            PDRectangle bBox = contentStream.getBBox();
            Rectangle2D bounds = bBox.transform(contentStream.getMatrix()).getBounds2D();
            bBox = new PDRectangle((float)bounds.getMinX(), (float)bounds.getMinY(), (float)bounds.getWidth(), (float)bounds.getHeight());
            
            float widthPt = bBox.getWidth();
            float heightPt = bBox.getHeight();

            // PDFBOX-4306 avoid single blank pixel line on the right or on the bottom
            int widthPx = (int) Math.max(Math.floor(widthPt * scale), 1);
            int heightPx = (int) Math.max(Math.floor(heightPt * scale), 1);

            BufferedImage image = new BufferedImage(widthPx, heightPx, BufferedImage.TYPE_INT_ARGB);
            pageImageField.set(this, image);

            // use a transparent background if the image type supports alpha
            Graphics2D g = image.createGraphics();
            g.setBackground(new Color(0, 0, 0, 0));
            g.clearRect(0, 0, image.getWidth(), image.getHeight());
            g.scale(scale, scale);

            RenderingHints actualRenderingHints =
                    getRenderingHints() == null ? createDefaultRenderingHints(g) : getRenderingHints();
            PageDrawerParameters parameters =
                    pageDrawerParametersConstructor.newInstance(this, page, isSubsamplingAllowed(), destination,
                            actualRenderingHints, getImageDownscalingOptimizationThreshold());
            PageDrawer drawer = createPageDrawer(parameters);
            if (drawer instanceof ContentStreamPageDrawer) {
                ((ContentStreamPageDrawer)drawer).setContentStream(contentStream);
            }
            drawer.drawPage(g, bBox);

            g.dispose();

            if (imageType != ImageType.ARGB)
            {
                int biType = -1;
                switch (imageType) {
                case ARGB:   biType = BufferedImage.TYPE_INT_ARGB; break;
                case BGR:    biType = BufferedImage.TYPE_3BYTE_BGR; break;
                case BINARY: biType = BufferedImage.TYPE_BYTE_BINARY; break;
                case GRAY:   biType = BufferedImage.TYPE_BYTE_GRAY; break;
                case RGB:    biType = BufferedImage.TYPE_INT_RGB; break;
                }
                BufferedImage newImage = 
                        new BufferedImage(image.getWidth(), image.getHeight(), biType);
                Graphics2D dstGraphics = newImage.createGraphics();
                dstGraphics.setBackground(Color.WHITE);
                dstGraphics.clearRect(0, 0, image.getWidth(), image.getHeight());
                dstGraphics.drawImage(image, 0, 0, null);
                dstGraphics.dispose();
                image = newImage;
            }

            return image;
        } finally {
            pageImageField.set(this, null);
        }
    }

    private RenderingHints createDefaultRenderingHints(Graphics2D graphics)
    {
        RenderingHints r = new RenderingHints(null);
        r.put(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        r.put(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY);
        r.put(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON);
        return r;
    }

    final Field pageImageField;
    final Constructor<PageDrawerParameters> pageDrawerParametersConstructor;
}

(ContentStreamRenderer class derived from PDFRenderer)

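It relies on this helper class:

import java.io.IOException;

import org.apache.pdfbox.contentstream.PDContentStream;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.rendering.PageDrawer;
import org.apache.pdfbox.rendering.PageDrawerParameters;

public class ContentStreamPageDrawer extends PageDrawer {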
    public ContentStreamPageDrawer(PageDrawerParameters parameters) throws IOException {
        super(parameters);
    }

    public PDContentStream getContentStream() {
        return contentStream;
    }

    public void setContentStream(PDContentStream contentStream) {
        this.contentStream = contentStream;
    }

    //
    // Overrides that make drawPage only draw the set content stream
    // if one is set. Otherwise, these overrides delegate to PageDrawer.
    //
    @Override
    public void processPage(PDPage page) throws IOException {
        if (contentStream == null) {
            super.processPage(page);
        } else {
            processChildStream(contentStream, page);
        }
    }

    @Override
    public void showAnnotation(PDAnnotation annotation) throws IOException {
        if (contentStream == null) {
            super.showAnnotation(annotation);
        }
    }

    PDContentStream contentStream = null;
}

(ContentStreamPageDrawer helper of ContentStreamRenderer)

These classes constitute a customization of the standard PDFBox PDF renderer and page drawer, a generalization to render individual content streams of a PDF.
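
The size of the bitmap produced by renderImage follows from the bounding box, the form matrix and the scale factor. As a minimal sketch of that arithmetic in isolation, using only java.awt.geom (RenderSize is a hypothetical name, and the 103.55 x 20 box in main is an assumed example value, not the bounding box of the actual signature):

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

/** Hypothetical helper mirroring the size computation in renderImage. */
final class RenderSize {

    /** Axis-parallel bounds of the bounding box after applying the form matrix. */
    static Rectangle2D transformedBounds(Rectangle2D bBox, AffineTransform matrix) {
        return matrix.createTransformedShape(bBox).getBounds2D();
    }

    /** Pixel size for a given scale, rounded down but never smaller than 1 x 1. */
    static int[] pixelSize(Rectangle2D bounds, double scale) {
        int widthPx = (int) Math.max(Math.floor(bounds.getWidth() * scale), 1);
        int heightPx = (int) Math.max(Math.floor(bounds.getHeight() * scale), 1);
        return new int[] { widthPx, heightPx };
    }

    /** Default PDF user space units are 1/72 inch, so scale = dpi / 72. */
    static double scaleForDpi(double dpi) {
        return dpi / 72.0;
    }

    public static void main(String[] args) {
        // Assumed example box; a real one comes from the form XObject's BBox entry.
        Rectangle2D bBox = new Rectangle2D.Double(0, 0, 103.55, 20);
        Rectangle2D bounds = transformedBounds(bBox, new AffineTransform());
        int[] px = pixelSize(bounds, 4);
        System.out.println(px[0] + " x " + px[1] + " px at scale 4");
        System.out.println("600 DPI corresponds to scale " + scaleForDpi(600));
    }
}
```

With default user space units, the scale of 4 used further down corresponds to 288 DPI, whereas the 600 DPI from the question's proof of concept corresponds to a scale of about 8.33.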

Unfortunately, PDFBox has made many member variables and methods private or at least package-private. As I didn't want to copy the whole classes, I had to use reflection, which will not work in contexts where reflective access to non-public members is restricted. (Keeping members hidden away like that really is a PITA...)
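
For readers unfamiliar with the technique, the reflection pattern used above boils down to the following sketch. HiddenState is a hypothetical stand-in for a library class with non-public members; it has nothing to do with the actual PDFBox classes.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

/** Hypothetical stand-in for a library class that hides its members. */
final class HiddenState {
    private String state;

    private HiddenState(String state) {
        this.state = state;
    }
}

/** Illustrates reaching a private constructor and a private field via reflection. */
final class ReflectionSketch {

    static Object newHiddenState(String initial) {
        try {
            Constructor<HiddenState> constructor =
                    HiddenState.class.getDeclaredConstructor(String.class);
            constructor.setAccessible(true); // may be refused in restricted contexts
            return constructor.newInstance(initial);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective construction failed", e);
        }
    }

    static Object readField(Object target, String name) {
        try {
            Field field = target.getClass().getDeclaredField(name);
            field.setAccessible(true);
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective read failed", e);
        }
    }

    static void writeField(Object target, String name, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(name);
            field.setAccessible(true);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective write failed", e);
        }
    }

    public static void main(String[] args) {
        Object hidden = newHiddenState("initial");
        writeField(hidden, "state", "changed");
        System.out.println(readField(hidden, "state"));
    }
}
```

The setAccessible calls are the fragile part: whether they succeed depends on the runtime environment, which is why the approach cannot be relied upon everywhere.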

To render the form XObjects in the example PDF signed using Adobe Acrobat Sign, you can use those classes like this:

PDDocument pdDocument = Loader.loadPDF(PDF_SOURCE);
ContentStreamRenderer renderer = new ContentStreamRenderer(pdDocument);
PDPage pdPage = pdDocument.getPage(0);
PDResources pdResources = pdPage.getResources();

for (COSName xObjectName : pdResources.getXObjectNames()) {
    PDXObject pdXObject = pdResources.getXObject(xObjectName);
    if (pdXObject instanceof PDFormXObject) {
        PDFormXObject pdFormXObject = (PDFormXObject) pdXObject;
        BufferedImage image = renderer.renderImage(pdPage, pdFormXObject, 4, ImageType.RGB, RenderDestination.VIEW);
        ImageIO.write(image, "png", new File("test_document_signed-1-" + xObjectName.getName() + ".png"));

        PDResources innerResources = pdFormXObject.getResources();
        if (innerResources == null) {
            continue; // this form XObject has no resources of its own
        }
        for (COSName innerXObjectName : innerResources.getXObjectNames()) {
            PDXObject innerXObject = innerResources.getXObject(innerXObjectName);
            if (innerXObject instanceof PDFormXObject) {
                PDFormXObject innerFormXObject = (PDFormXObject) innerXObject;
                image = renderer.renderImage(pdPage, innerFormXObject, 4, ImageType.RGB, RenderDestination.VIEW);
                ImageIO.write(image, "png", new File("test_document_signed-1-" + xObjectName.getName() + "-" + innerXObjectName.getName() + ".png"));
            }
        }
    }
}

(RenderAcrobatSignSignatures test testRenderTestDocumentSigned)

The results:

test_document_signed-1-Xi3.png

test_document_signed-1-Xi3-Xf2.png

test_document_signed-1-Xi4.png
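
Because the calls above pass ImageType.RGB, renderImage paints the transparent rendering onto a white background before returning it; passing ImageType.ARGB instead skips that step and keeps the transparency. The flattening step on its own looks roughly like this sketch (Flatten is a hypothetical name):

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical helper showing the ARGB-to-RGB flattening step in isolation. */
final class Flatten {

    /** Paints an image with alpha onto an opaque white RGB image of the same size. */
    static BufferedImage onWhite(BufferedImage argb) {
        BufferedImage rgb = new BufferedImage(argb.getWidth(), argb.getHeight(),
                BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        g.setBackground(Color.WHITE);
        g.clearRect(0, 0, rgb.getWidth(), rgb.getHeight());
        g.drawImage(argb, 0, 0, null); // transparent pixels leave the white visible
        g.dispose();
        return rgb;
    }

    public static void main(String[] args) {
        BufferedImage argb = new BufferedImage(2, 1, BufferedImage.TYPE_INT_ARGB);
        argb.setRGB(0, 0, 0xFF000000); // opaque black, second pixel stays transparent
        BufferedImage rgb = onWhite(argb);
        System.out.println(Integer.toHexString(rgb.getRGB(0, 0)) + " "
                + Integer.toHexString(rgb.getRGB(1, 0)));
    }
}
```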


In your proof of concept you extract the form XObjects and draw them on pages of temporary PDFs. That approach has the advantage of not needing reflection (or copying the whole PDFBox rendering classes). It may have the disadvantage, though, of requiring more resources.