How can I debug why a PDF is broken in weasyprint?


It is getting quite frustrating. I thought it would be easy to create reports in Python using HTML and CSS, but it is taking far too long.

At the beginning I thought it was a conflict with Bootstrap, so I started writing all the styling from scratch.

I used the code from the official repository as a base:

https://github.com/Kozea/WeasyPrint/tree/gh-pages/samples/report

From there, I just created a new section in the HTML by duplicating some lines and changing the ids, adding something like this:

<article id="execution-results">
    <h2 id="execution-results-title">Execution Results</h2>
</article>

With that change, Firefox already reports "This document might not be displayed correctly" and nothing is readable. Adobe Reader says "There was an error processing a page. There was a problem reading this document (110)." I can still open the file in Foxit Reader, where it displays properly, but that is useless to me if the file is corrupted everywhere else.

I enabled debug mode with `python -m weasyprint .\report.html .\report.pdf --debug`, but I do not get any useful information. Example of a successful run (without the added code):

INFO: Step 1 - Fetching and parsing HTML - .\report.html
INFO: Step 2 - Fetching and parsing CSS - file:///.../static/styles/report.css
INFO: Step 3 - Applying CSS
INFO: Step 4 - Creating formatting structure
ERROR: Content discarded: target points to undefined anchor "('string', '#execution-results')"
ERROR: Content discarded: target points to undefined anchor "('string', '#execution-results')"
INFO: Step 5 - Creating layout - Page 1
INFO: Step 5 - Creating layout - Page 2
INFO: Step 5 - Creating layout - Page 3
INFO: Step 5 - Creating layout - Page 4
INFO: Step 5 - Creating layout - Page 5
INFO: Step 5 - Creating layout - Page 6
INFO: Step 5 - Creating layout - Page 7
INFO: Step 5 - Creating layout - Repagination #1
INFO: Step 5 - Creating layout - Page 1 (up-to-date)
INFO: Step 5 - Creating layout - Page 2 (up-to-date)
INFO: Step 5 - Creating layout - Page 3
INFO: Step 5 - Creating layout - Page 4 (up-to-date)
INFO: Step 5 - Creating layout - Page 5 (up-to-date)
INFO: Step 5 - Creating layout - Page 6 (up-to-date)
INFO: Step 5 - Creating layout - Page 7 (up-to-date)
INFO: Step 5 - Creating layout - Page 8 (up-to-date)
INFO: Step 6 - Drawing
ERROR: No anchor #execution-results for internal URI reference
ERROR: No anchor #execution-results for internal URI reference
ERROR: No anchor #execution-results for internal URI reference
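
For anyone rendering from a script rather than the command line, the same messages can be collected through Python's standard `logging` module, since WeasyPrint logs to a logger named `weasyprint`. The sketch below is only one way to do it; `CollectingHandler` and `attach_weasyprint_collector` are hypothetical names, not part of WeasyPrint.

```python
import logging


class CollectingHandler(logging.Handler):
    """Keep every record at or above `level` in a list for later inspection."""

    def __init__(self, level=logging.WARNING):
        super().__init__(level)
        self.records = []

    def emit(self, record):
        self.records.append(record)


def attach_weasyprint_collector(level=logging.WARNING):
    """Attach a collecting handler to the 'weasyprint' logger and return it.

    The function name is an illustration only; it is not a WeasyPrint API.
    """
    logger = logging.getLogger("weasyprint")
    handler = CollectingHandler(level)
    logger.addHandler(handler)
    return handler
```

After rendering, `handler.records` would hold the warnings and errors, so a script can fail loudly when the list is not empty. An empty list does not prove that the resulting PDF is valid.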

The same command, producing a corrupted file once I add the lines mentioned above:

INFO: Step 1 - Fetching and parsing HTML - .\report.html
INFO: Step 2 - Fetching and parsing CSS - file:///C:/.../static/styles/report.css
INFO: Step 3 - Applying CSS
INFO: Step 4 - Creating formatting structure
INFO: Step 5 - Creating layout - Page 1
INFO: Step 5 - Creating layout - Page 2
INFO: Step 5 - Creating layout - Page 3
INFO: Step 5 - Creating layout - Page 4
INFO: Step 5 - Creating layout - Page 5
INFO: Step 5 - Creating layout - Page 6
INFO: Step 5 - Creating layout - Page 7
INFO: Step 5 - Creating layout - Page 8
INFO: Step 5 - Creating layout - Page 9
INFO: Step 5 - Creating layout - Repagination #1
INFO: Step 5 - Creating layout - Page 1 (up-to-date)
INFO: Step 5 - Creating layout - Page 2 (up-to-date)
INFO: Step 5 - Creating layout - Page 3
INFO: Step 5 - Creating layout - Page 4 (up-to-date)
INFO: Step 5 - Creating layout - Page 5 (up-to-date)
INFO: Step 5 - Creating layout - Page 6 (up-to-date)
INFO: Step 5 - Creating layout - Page 7 (up-to-date)
INFO: Step 5 - Creating layout - Page 8 (up-to-date)
INFO: Step 5 - Creating layout - Page 9 (up-to-date)
INFO: Step 6 - Drawing
INFO: Step 7 - Adding PDF metadata

How can I debug what is going on, and why do many PDF readers complain when the HTML seems fine? I can open the HTML in the browser inspector, everything looks OK, and there are no errors in the console.

Update: While trying to create a reproducible example, I found that the example in the repository already shows the problem. These are the steps I followed to reproduce it:
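
Since the logs stay silent, one thing that can be checked independently of any viewer is whether the file is at least structurally complete. The following is a minimal sketch using only the standard library; `check_pdf_structure` is a hypothetical helper, and it only looks at the header, the `%%EOF` marker and the `startxref` offset. It will not catch errors inside page content or fonts, which may well be where the problem lies.

```python
import re


def check_pdf_structure(data: bytes) -> list:
    """Return a list of structural problems found in `data` (empty if none)."""
    problems = []

    # A PDF file is expected to start with a "%PDF-x.y" header.
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")

    # The end-of-file marker should appear near the end of the file.
    tail = data[-1024:]
    if b"%%EOF" not in tail:
        problems.append("missing %%EOF marker near end of file")

    # "startxref" is followed by the byte offset of the cross-reference data.
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        problems.append("missing startxref entry")
    else:
        offset = int(matches[-1])
        if offset >= len(data):
            problems.append("startxref offset points past end of file")
        else:
            target = data[offset:offset + 20]
            # Accept a classic "xref" table or an xref stream object ("N G obj").
            is_table = target.startswith(b"xref")
            is_stream = re.match(rb"\d+\s+\d+\s+obj", target) is not None
            if not (is_table or is_stream):
                problems.append("startxref offset does not point at xref data")

    return problems
```

It could be run as `check_pdf_structure(open("report.pdf", "rb").read())`. An empty result only rules out truncation and a wrong cross-reference offset, nothing more.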

git clone https://github.com/Kozea/WeasyPrint.git
cd WeasyPrint
git checkout gh-pages
cd samples\report
python -m weasyprint .\report.html .\report.pdf --debug

Then I open it with Adobe Acrobat Reader and get this:

There was an error processing a page. There was a problem reading this document (110).

Debug mode just shows this:

INFO: Step 1 - Fetching and parsing HTML - .\report.html
INFO: Step 2 - Fetching and parsing CSS - file:///...WeasyPrint/samples/report/report.css
WARNING: Ignored `text-decoration: inherit` at 153:11, invalid value.
INFO: Step 3 - Applying CSS
INFO: Step 4 - Creating formatting structure
INFO: Step 5 - Creating layout - Page 1
INFO: Step 5 - Creating layout - Page 2
INFO: Step 5 - Creating layout - Page 3
INFO: Step 5 - Creating layout - Page 4
INFO: Step 5 - Creating layout - Page 5
INFO: Step 5 - Creating layout - Page 6
INFO: Step 5 - Creating layout - Page 7
INFO: Step 5 - Creating layout - Page 8
INFO: Step 5 - Creating layout - Repagination #1
INFO: Step 5 - Creating layout - Page 1 (up-to-date)
INFO: Step 5 - Creating layout - Page 2 (up-to-date)
INFO: Step 5 - Creating layout - Page 3
INFO: Step 5 - Creating layout - Page 4 (up-to-date)
INFO: Step 5 - Creating layout - Page 5 (up-to-date)
INFO: Step 5 - Creating layout - Page 6 (up-to-date)
INFO: Step 5 - Creating layout - Page 7 (up-to-date)
INFO: Step 5 - Creating layout - Page 8 (up-to-date)
INFO: Step 6 - Drawing
INFO: Step 7 - Adding PDF metadata

The warning is harmless. After another hour of commenting and uncommenting parts of the CSS file, I found that the issue came from this:

      html body article#columns section p:first-of-type {
        font-weight: 700; }
      html body article#columns section p:first-of-type {
        font-weight: 200; }
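
The manual commenting and uncommenting is essentially a bisection, and it can be automated. Below is a minimal sketch, assuming the stylesheet has already been split into a list of rules, that an empty stylesheet renders fine, and that a single rule triggers the breakage. `first_breaking_rule` and the `is_broken` callback are hypothetical; how the callback renders and validates the PDF is left to the caller.

```python
from typing import Callable, Optional, Sequence


def first_breaking_rule(
    rules: Sequence[str],
    is_broken: Callable[[Sequence[str]], bool],
) -> Optional[int]:
    """Return the index of the rule whose addition first breaks the output.

    `is_broken(subset)` is supplied by the caller and should render the
    document with only `subset` applied, returning True if the PDF is bad.
    Returns None if the full rule list does not reproduce the problem.
    """
    if not rules or not is_broken(rules):
        return None

    # Invariant: rules[:lo] renders fine, rules[:hi] is broken.
    lo, hi = 0, len(rules)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_broken(rules[:mid]):
            hi = mid
        else:
            lo = mid
    return hi - 1
```

Prefixes are used instead of arbitrary subsets so that the order of the rules, and with it the cascade, stays the same in every trial. With n rules this needs roughly log2(n) renders instead of an hour of manual edits.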

After changing that, I have a working PDF. Now I want to duplicate a page, so I do the following. In the HTML:

    <article id="chapter">
      <h2 id="chapter-title">This is a chapter of a new section</h2>
    </article>

    <!-- I add this -->
    <article id="chapter-2">
      <h2 id="chapter-2-title">This is a chapter of a new section</h2>
    </article>

In the style:

@page chapter {
  background: #fbc847;
  margin: 0;
  @top-left {
    content: none; }
  @top-center {
    content: none; }
  @top-right {
    content: none; } }

/*I add this*/
@page chapter-2 {
  background: #fbc847;
  margin: 0;
  @top-left {
    content: none; }
  @top-center {
    content: none; }
  @top-right {
    content: none; } }

  html body article#chapter {
    align-items: center;
    display: flex;
    height: 297mm;
    justify-content: center;
    page: chapter; }  
/*I add this*/
  html body article#chapter-2 {
    align-items: center;
    display: flex;
    height: 297mm;
    justify-content: center;
    page: chapter-2; }

And the PDF is broken again in Acrobat and Firefox, with no error at all in the logs to start debugging from.

Is there a way to make this more developer friendly and see what is going wrong?
