Retrieve PDF signature - preg_match_all() returns an array whose keys are all empty


Based on this resource (PHP - how to get the signer(s) of a digitally signed PDF?), I am trying to build a system in PHP that retrieves the digital signature of a file, so that I can later verify whether it is valid. The signature is retrieved correctly from the file, and this is a truncated example (the input):

/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630...>

This is passed to preg_match_all(), where the content comes from the PDF signature at the end of the file, and the pattern used is

/\/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] \/Contents\<\s*(\w+)>/is

The problem is that when I do the preg_match_all()...

preg_match_all($regexp, $signature_content, $result);

... what I get from var_dump() is an array with empty(?) values

array(6) { [0]=> array(0) { } [1]=> array(0) { } [2]=> array(0) { } [3]=> array(0) { } [4]=> array(0) { } [5]=> array(0) { } } array(0) { }

But with the same code, if I replace $signature_content with the copied and pasted string, it works and I get the array

array(6) { [0]=> array(1) { [0]=> string(11784) "/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010..." } } array(1) { [0]=> string(23) "username_test" }

Does anyone have an idea how to solve this? Thanks in advance!

edit: here is the commented code - https://pastebin.com/jw1uw7Gb

Answer by The fourth bird

You use (\w+)> at the end of your pattern, but the sample string ends with ba630...> and \w does not match a dot.
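As a rough sketch of that difference, the snippet below runs the question's pattern against the truncated sample and against a made-up variant without the trailing dots. The variable names and the $full string are assumptions for illustration only, not data from a real PDF.

```php
<?php
// The pattern from the question, unchanged.
$regexp = '/\/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] \/Contents\<\s*(\w+)>/is';

// Truncated sample from the question: the hex run is followed by "...".
$truncated = '/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630...>';

// Hypothetical variant with no dots before ">" (illustration only).
$full = '/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630>';

// preg_match_all() returns the number of full matches, or false on error.
$truncatedCount = preg_match_all($regexp, $truncated, $m1);
$fullCount      = preg_match_all($regexp, $full, $m2);

var_dump($truncatedCount); // no match: \w cannot consume "." before ">"
var_dump($fullCount);      // one match: the hex run is directly followed by ">"
var_dump($m2[5][0]);       // the captured hex string
```

When nothing matches, $m1 still contains one empty array per group, which is the same shape as the dump in the question.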

Note that if you change the pattern delimiter to for example ~ then you don't have to escape the /

You also don't have to escape the <, so \< can be written as <
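A small sketch of those two points, assuming a hypothetical dot-free input ($input and the variable names are made up for illustration): the /-delimited pattern with escapes and the ~-delimited pattern without them should give the same captures.

```php
<?php
// Hypothetical dot-free input, for illustration only.
$input = '/ByteRange[0 49443 61187 6424] /Contents<30820bb5>';

// "/" as delimiter: every literal "/" inside the pattern has to be escaped.
$slashDelimited = '/\/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] \/Contents\<\s*(\w+)>/i';

// "~" as delimiter: "/" and "<" can be written as-is.
$tildeDelimited = '~/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] /Contents<\s*(\w+)>~i';

preg_match_all($slashDelimited, $input, $a);
preg_match_all($tildeDelimited, $input, $b);

var_dump($a === $b); // the two result arrays are identical
var_dump($b[5][0]);  // the captured hex string
```

Single-quoted strings are used here so that PHP passes the backslashes through to the regex engine untouched.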

To match both variants, you can match optional dots at the end of the pattern, outside the last capture group:

/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] /Contents<\s*(\w+)\.*>

See a regex demo.

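// Truncated sample input from the question
$signature_content = <<<DATA
/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630...>
DATA;
// ~ as the delimiter, so / needs no escaping; \.* consumes optional trailing dots outside group 5
$regexp = '~/ByteRange\[\s*(\d+) (\d+) (\d+) (\d+)] /Contents<\s*(\w+)\.*>~i';
preg_match_all($regexp, $signature_content, $result);

var_dump($result);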

Output

array(6) {
  [0]=>
  array(1) {
    [0]=>
    string(85) "/ByteRange[0 49443 61187 6424] /Contents<30820bb506092a864886f70d010702a0820ba630...>"
  }
  [1]=>
  array(1) {
    [0]=>
    string(1) "0"
  }
  [2]=>
  array(1) {
    [0]=>
    string(5) "49443"
  }
  [3]=>
  array(1) {
    [0]=>
    string(5) "61187"
  }
  [4]=>
  array(1) {
    [0]=>
    string(4) "6424"
  }
  [5]=>
  array(1) {
    [0]=>
    string(40) "30820bb506092a864886f70d010702a0820ba630"
  }
}