How to get <fo:block> and/or <fo:table> dimensions before they are rendered in XSL-FO?


More specifically, I want the height of block elements that would be rendered in a PDF document using RenderX. Say, for example, I have this simple XSL-FO output with some text and a table:

<?xml version="1.0" encoding="ISO-8859-1"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

<fo:layout-master-set>
  <fo:simple-page-master master-name="A4">
    <fo:region-body />
  </fo:simple-page-master>
</fo:layout-master-set>

<fo:page-sequence master-reference="A4">
  <fo:flow flow-name="xsl-region-body">
    <fo:block>This is some simple text. How much space will I take?</fo:block>

    <fo:table-and-caption>
      <fo:table>

        <fo:table-header>
          <fo:table-row>
            <fo:table-cell>
              <fo:block font-weight="bold">Car</fo:block>
            </fo:table-cell>
            <fo:table-cell>
              <fo:block font-weight="bold">Price</fo:block>
            </fo:table-cell>
          </fo:table-row>
        </fo:table-header>

        <fo:table-body>
          <fo:table-row>
            <fo:table-cell>
              <fo:block>Volvo</fo:block>
            </fo:table-cell>
            <fo:table-cell>
              <fo:block>$50000</fo:block>
            </fo:table-cell>
          </fo:table-row>
          <fo:table-row>
            <fo:table-cell>
              <fo:block>SAAB</fo:block>
            </fo:table-cell>
            <fo:table-cell>
              <fo:block>$48000</fo:block>
            </fo:table-cell>
          </fo:table-row>
        </fo:table-body>

      </fo:table>
    </fo:table-and-caption>
  </fo:flow>
</fo:page-sequence>

</fo:root>

Is there any way I can determine how much height the content of the region body (i.e., the text and the table) is going to take in a PDF rendered using RenderX?

1 Answer

You can't always determine the height of the content before it is formatted, but you can work it out after it has been formatted.

In theory, if you had the font metrics for the fonts used (and a lot of time on your hands), you could write a program to work out how many characters would fit on each line and calculate how many lines you'd have on the page. However, if you're dealing with even moderately complex real-world documents, you'd have to deal with things like kerning, ligatures, intrusions from side floats, and the uncertain number of characters in a page-number cross-reference to something on another page. You'd end up writing a second formatter to work out what the first formatter would do.
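To make the "in theory" approach concrete, here is a rough sketch of such an estimator in Python. It is not anything RenderX provides: the function name and all parameters are hypothetical, and it assumes a single average glyph width, greedy word wrapping, and a fixed line height. It ignores kerning, ligatures, hyphenation, and floats, which is exactly why its result can drift from what the formatter really does.

```python
def estimate_block_height(text, avg_char_width_pt, line_width_pt, line_height_pt):
    """Crude height estimate for one block of text (hypothetical helper).

    Assumes every character is avg_char_width_pt wide and wraps greedily
    at word boundaries. A word longer than the line still occupies one line.
    """
    max_chars = max(1, int(line_width_pt // avg_char_width_pt))
    lines = 0
    current = 0  # characters already placed on the current line
    for word in text.split():
        # A space is needed before every word except the first on a line.
        needed = len(word) if current == 0 else current + 1 + len(word)
        if current == 0 or needed <= max_chars:
            current = needed
        else:
            lines += 1           # close the current line
            current = len(word)  # start a new line with this word
    if current > 0:
        lines += 1               # count the last, partially filled line
    return lines * line_height_pt


if __name__ == "__main__":
    sample = "This is some simple text. How much space will I take?"
    # Assumed numbers: 6pt average glyph width, 120pt line width, 12pt line height.
    print(estimate_block_height(sample, 6, 120, 12))
```

With real font metrics you would replace the single average width with per-glyph advances, and at that point you are already well on the way to the "second formatter" mentioned above.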

In practice, formatters make available their own representation (or representations) of the areas in the formatted document. The XSL 1.1 Recommendation includes the concept (but not the specification) of an 'area tree' of the formatted areas. (See https://www.w3.org/TR/xsl11/#clear)

RenderX has its intermediate output format, documented at http://www.renderx.com/reference.html#IntermediateFormatSpecification

FOP has an intermediate output format, documented at https://xmlgraphics.apache.org/fop/2.4/intermediate.html. FOP also had a second area tree representation for use with tests, but right now I can't tell if that's still in use or not.

Antenna House has its Area Tree XML format, with schema at https://github.com/AntennaHouse/AreaTree and documentation at https://antennahouse.github.io/AreaTree/en/.
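Whichever formatter you use, the general idea is the same: format once, read the positions and sizes back from the area tree or intermediate output, and compute the heights from those. The Python sketch below shows that idea only. The XML in it is a made-up, simplified area tree: the element names (area-tree, page, block) and attributes (top, height) are assumptions for illustration and do not match the RenderX, FOP, or Antenna House formats, so check the documentation linked above for the real names and units.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified area tree. Real formats use their own
# element names, namespaces, and units.
SAMPLE_AREA_TREE = """
<area-tree>
  <page number="1" height="842">
    <block id="intro" top="72" height="14"/>
    <block id="cars" top="86" height="42"/>
  </page>
</area-tree>
"""


def content_extent(area_tree_xml):
    """Return {page number: vertical extent covered by its blocks}."""
    root = ET.fromstring(area_tree_xml)
    extents = {}
    for page in root.iter("page"):
        blocks = page.findall("block")
        if not blocks:
            continue  # an empty page contributes nothing
        top = min(float(b.get("top")) for b in blocks)
        bottom = max(float(b.get("top")) + float(b.get("height")) for b in blocks)
        extents[page.get("number")] = bottom - top
    return extents


if __name__ == "__main__":
    print(content_extent(SAMPLE_AREA_TREE))
```

If the measured height then drives a layout decision, the usual pattern is two passes: format once to measure, adjust the FO, and format again.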

The Print and Page Layout Community Group produced a set of XSLT extension functions for running an XSL formatter and getting an area tree within your XSLT transformation. Code is at https://github.com/pplcg/XSLTExtensions and examples at https://www.w3.org/community/ppl/wiki/XSLTExtensions. Unfortunately for you, the extension functions only work with FOP and Antenna House.