I'm trying to convert an XML report into a PDF using XSLT and CSS. The PDF should include page numbering in the footer and, for example, a page break after a particular table.
From what I found, this can be achieved using CSS3 at-rules for paged media (e.g. @page). However, if I understand correctly, I might have trouble finding a tool that interprets these rules to create a PDF (not to mention that it needs to convert from XML first).
I found that I could use the paged.js script to get this working in a browser, but it works only if I run a local server (e.g. live-server) because of the local file restrictions in all web browsers. I can (kind of) overcome this using command-line switches like --allow-file-access-from-files, but then the document is printed before rendering is complete (it looks like the browser does not wait for the paged.js script to finish). I tried different switches: chrome.exe --headless --disable-gpu --allow-file-access-from-files --run-all-compositor-stages-before-draw --virtual-time-budget=100000 --print-to-pdf="<destination>" "<source>". Perhaps some switches for the node engine could help?
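For the local-server workaround, a minimal sketch using only the Python standard library could stand in for live-server. The function name start_report_server is hypothetical, and the sketch only serves the files over HTTP; it does not address the early-printing problem.

```python
import functools
import http.server
import threading


def start_report_server(directory, port=0):
    """Serve `directory` over HTTP on localhost and return the running server.

    port=0 asks the OS for any free port; read the chosen port back from
    server.server_address[1].
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Run the server in a daemon thread so it does not block the caller.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The report would then be opened as http://127.0.0.1:<port>/<file> instead of through a file path, and server.shutdown() stops the server when the work is done.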
My question is: how can I programmatically convert the XML file to PDF, using XSLT to extract the data of interest from the XML and CSS to format the PDF as a proper paged document, using software that is free for commercial use? Do I need paged.js to accomplish it?
About my files:
In my XML file, I reference a local XSL file that extracts particular data from the XML; it removes duplicates and sorts the results by date. This XSL file references a local CSS file that provides the formatting and the paged-media rules. The XSL file also references the paged.js script and its associated CSS, as described in the script's documentation.
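To make the "remove duplicates and sort by date" step concrete, here is a minimal Python sketch of the same idea. It is not my actual XSL: the function name is hypothetical, it assumes that a repeated SerialNumber marks a duplicate, and it assumes dates look like 05-Mar-21 with times like 08:56:23.

```python
from datetime import datetime
import xml.etree.ElementTree as ET


def unique_results_by_date(xml_text):
    """Return (timestamp, serial, status) tuples, de-duplicated and sorted."""
    root = ET.fromstring(xml_text)  # root is the <Results> element
    seen = set()
    rows = []
    for result in root.findall("Result"):
        serial = result.findtext("SerialNumber")
        if serial in seen:  # assumption: a repeated serial number is a duplicate
            continue
        seen.add(serial)
        # %b parses the month abbreviation and depends on the current locale.
        stamp = datetime.strptime(
            f'{result.findtext("Date")} {result.findtext("Time")}',
            "%d-%b-%y %H:%M:%S",
        )
        rows.append((stamp, serial, result.findtext("Status")))
    return sorted(rows)  # the timestamp is the first tuple item, so it sorts first
```

Keeping the first occurrence of each serial number is an arbitrary choice here; keeping the latest one would need a different rule.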
Among others, I tried weasyprint, htmldoc, html5-to-pdf and wkhtmltopdf, without success.
I'm open to any suggestions.
EDIT: I experimented with XSL-FO (as suggested in a comment) and I have to admit that it works pretty well. It seems to me that page control is even more readable than in CSS. The only problem I see now is that it requires additional installations (Apache FOP and a Java Runtime Environment). In my scenario it would be better to have FOP in .NET.
Anyway, I decided to describe my CSS-based solution with Chrome as the renderer in more detail, because it does not require additional installers (or at least that is what I think). I have already spent some time on it and it seems to be almost working. Perhaps someone will spot where the problem is, and then it would become a pretty nice solution for paged media with CSS.
The complete transform.xsl file is:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head>
        <title>Summary</title>
        <link href="css/style.css" rel="stylesheet" type="text/css"/>
        <link href="css/interface.css" rel="stylesheet" type="text/css"/>
        <script src="js/paged.polyfill.js"/>
      </head>
      <body>
        <h1>Summary</h1>
        <table class="summary">
          <tr><td>Total Quantity:</td><td><xsl:value-of select="count(Results/Result)"/></td></tr>
          <tr><td>Passed:</td><td><xsl:value-of select="count(Results/Result[Status='Pass'])"/></td></tr>
          <tr><td>Failed:</td><td><xsl:value-of select="count(Results/Result[Status='Fail'])"/></td></tr>
        </table>
        <br />
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="Results">
    <table class="results">
      <tr>
        <th>Serial Number</th>
        <th>Test Result</th>
        <th>Date</th>
        <th>Time</th>
      </tr>
      <xsl:for-each select="Result">
        <tr>
          <xsl:attribute name="class"><xsl:value-of select="./Status"/></xsl:attribute>
          <td><xsl:value-of select="./SerialNumber"/></td>
          <td><xsl:value-of select="./Status"/></td>
          <td><xsl:value-of select="./Date"/></td>
          <td><xsl:value-of select="./Time"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
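As a cross-check on the three count(...) expressions in the summary table, the same numbers can be computed outside the stylesheet. This is only a sketch using Python's standard library; the function name summarize is hypothetical and is not part of my setup.

```python
import xml.etree.ElementTree as ET


def summarize(xml_text):
    """Count all results, passes and failures, like the summary table does."""
    root = ET.fromstring(xml_text)  # root is the <Results> element
    statuses = [r.findtext("Status") for r in root.findall("Result")]
    return {
        "total": len(statuses),            # count(Results/Result)
        "passed": statuses.count("Pass"),  # count(Results/Result[Status='Pass'])
        "failed": statuses.count("Fail"),  # count(Results/Result[Status='Fail'])
    }
```

Comparing these values with the numbers shown in the rendered page is a quick way to confirm that the transformation itself ran correctly, independently of any pagination problem.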
In my style.css I have the @page rules shown below. My original file contains more, but the rest is not relevant here.
@page {
  size: A4;
  margin: 2cm 1cm;
  @top-left {
    content: "Summary Continued...";
    font-size: 15px;
  }
  @bottom-center {
    content: "Page " counter(page) "/" counter(pages);
    font-size: 15px;
  }
}
@page :first {
  @top-left {
    content: "";
  }
}
The data is stored in Results.xml (see below). The file can contain as many Result elements as you want; mine has 200 results, but I truncated it here for clarity.
<?xml version="1.0" encoding="iso-8859-1" ?>
<?xml-stylesheet type="text/xsl" href="xsl/transform.xsl"?>
<Results>
  <Result ID="0">
    <SerialNumber>8652280431</SerialNumber>
    <Status>Fail</Status>
    <Date>05-Mar-21</Date>
    <Time>08:56:23</Time>
  </Result>
  <Result ID="1">
    <SerialNumber>11124002643</SerialNumber>
    <Status>Fail</Status>
    <Date>05-Mar-21</Date>
    <Time>08:56:23</Time>
  </Result>
  .
  .
  .
  <Result ID="200">
    <SerialNumber>6616001379</SerialNumber>
    <Status>Fail</Status>
    <Date>05-Mar-21</Date>
    <Time>08:56:23</Time>
  </Result>
</Results>
My files also include interface.css and paged.polyfill.js, as described in the documentation (see transform.xsl).
When I open my Results.xml in Chrome with the command chrome.exe --allow-file-access-from-files <file path>, it works as expected (see image below).
When I try to print my Results.xml in Chrome with the command chrome.exe --headless --disable-gpu --allow-file-access-from-files --run-all-compositor-stages-before-draw --virtual-time-budget=100000 --print-to-pdf=<destination> <source>, it generates unexpected results. Only a couple of pages are generated, and the total page count (counter(pages)) shows 0 in the footer (see image below).
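Since the goal is to run this programmatically, the command above can be wrapped in a small script. The sketch below only repeats the switches already quoted in this post and does not fix the early-printing problem; the function names and the path to the Chrome executable are placeholders of mine.

```python
import subprocess
from pathlib import Path


def build_print_command(chrome, source, destination, time_budget_ms=100000):
    """Return the argument list for the headless print command quoted above."""
    return [
        str(chrome),
        "--headless",
        "--disable-gpu",
        "--allow-file-access-from-files",
        "--run-all-compositor-stages-before-draw",
        f"--virtual-time-budget={time_budget_ms}",
        f"--print-to-pdf={destination}",
        str(source),
    ]


def print_to_pdf(chrome, source, destination):
    """Run Chrome and report whether a non-empty PDF file was written."""
    # Passing a list avoids shell quoting issues with paths that contain spaces.
    subprocess.run(build_print_command(chrome, source, destination), check=True)
    out = Path(destination)
    return out.is_file() and out.stat().st_size > 0
```

A call such as print_to_pdf("chrome.exe", "Results.xml", "out.pdf") would reproduce the run described above, so different switch combinations can be tried from a single place.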
So maybe someone will figure out how to make it work.
Perhaps --js-flags will do the trick? Or maybe I should add or change something in paged.polyfill.js?