CGPDFScanner - \x15 character while scanning

547 Views Asked by Swaroop At 07 July 2015 at 10:37

I am trying to extract the text of page 5 of a PDF.
The PDF has a font, YLJAAA+CMSY10, which has no mapping (CMap) and no encoding (neither a default encoding nor /Differences).
While extracting text, after the string "tetex package", CGPDFScanner returns a "\x15" character, which is encountered many times.
When this character is encountered, the current font is the one mentioned above, which provides nothing to map the PDF string to text. What is this \x15 character?

Thanks.

Jongware On 07 July 2015 at 11:41 BEST ANSWER

I found 2 (not "many") occurrences of this:

[ (\025) ] TJ

Here \025 is an octal escape inside a PDF string: octal 25 is decimal 21, which is \x15 in hexadecimal.
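To make the octal-to-hex step concrete, here is a minimal Python sketch that resolves `\ddd` octal escapes in the body of a PDF literal string. The function name `decode_octal_escapes` is made up for this illustration, and the sketch deliberately ignores the other PDF escapes (`\n`, `\\`, `\(` and so on), so it is not a complete string parser.

```python
import re

# A backslash followed by one to three octal digits, e.g. \025
_OCTAL_ESCAPE = re.compile(rb"\\([0-7]{1,3})")


def decode_octal_escapes(raw: bytes) -> bytes:
    """Replace \\ddd octal escapes with the single byte they denote.

    Simplification: other escapes (\\n, \\\\, \\( ...) are left untouched.
    """
    def _to_byte(match):
        # Interpret the digits as base 8; keep only the low 8 bits.
        return bytes([int(match.group(1), 8) & 0xFF])

    return _OCTAL_ESCAPE.sub(_to_byte, raw)


code = decode_octal_escapes(rb"\025")[0]
print(code, hex(code), oct(code))  # 21 0x15 0o25
```

So the byte that CGPDFScanner hands back as "\x15" is just the character code written as `\025` in the content stream.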

The font definition for "YLJAAA+CMSY10" in the PDF carries no special encoding, so it has the default encoding for "CMSY" ("Computer Modern Symbol"):

114 0 obj
<<
  /Type         /Font
  /Subtype      /Type1
  /BaseFont     210 0 R % -> "/YLJAAA+CMSY10"
  /FirstChar    0
  /FontDescriptor 211 0 R
  /LastChar     127
  /Widths       204 0 R
>>
endobj

211 0 obj
<<
  /Ascent       750
  /CapHeight    683
  /CharSet      (/bullet/greaterequal/arrowright/arrowdblright/element/negationslash/backslash/radical)
  /Descent      0
  /Flags        4
  /FontBBox     [ -29 -960 1116 775 ]
  /FontFile     205 0 R
  /FontName     210 0 R   % -> '/YLJAAA+CMSY10'
  /ItalicAngle  -14
  /StemV        85
  /XHeight      430
>>
endobj

In itself, this still says nothing definitive: a PDF producer may reorder glyphs and encodings at will (as long as it does the same with the embedded font). Assuming the font set is not reordered, checking a random list of CMxx encodings shows that the character code 0x15 could well be GREATER-THAN OR EQUAL TO (Unicode U+2265).

Acrobat agrees; inspecting the font in the PDF shows that character code 21 (decimal) is named 'GREATER-THAN OR EQUAL' and looks like it as well.
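If the font really does use the unmodified CMSY layout, a text extractor could fall back on a small lookup table for it. The Python sketch below is only an illustration under that assumption: the code positions are taken from the standard TeX math-symbol layout (as used in plain.tex), restricted to the glyph names in the /CharSet above, and the Unicode values follow the Adobe Glyph List where I believe the name is listed there. The names `CMSY_SUBSET` and `glyph_for_code` are invented for this example.

```python
# Assumed subset of the CMSY10 built-in encoding (standard TeX math-symbol
# layout), limited to the glyph names in the /CharSet of this PDF.
# This only holds if the PDF producer did not reorder the font.
CMSY_SUBSET = {
    0x0F: ("bullet", "\u2022"),
    0x15: ("greaterequal", "\u2265"),
    0x21: ("arrowright", "\u2192"),
    0x29: ("arrowdblright", "\u21D2"),
    0x32: ("element", "\u2208"),
    # No Unicode value is assigned here for negationslash; it is an overlay
    # stroke, and how to map it is left open in this sketch.
    0x36: ("negationslash", None),
    0x6E: ("backslash", "\u005C"),
    0x70: ("radical", "\u221A"),
}


def glyph_for_code(code: int):
    """Return (glyph name, Unicode string or None), or None if unknown."""
    return CMSY_SUBSET.get(code)


name, text = glyph_for_code(0o25)  # the \025 from the content stream
print(name, text)
```

A table like this is a last resort; when a font carries a ToUnicode CMap or an explicit /Encoding, those should take precedence.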
