Handling italic and span tags separately inside a list element in XSLT


I am new to XSLT; thank you in advance for your understanding. I need to prepare an XML file that will be sent to Adobe InDesign Server. My input is a set of HTML files, which I transform to XML with XSLT before sending the result to InDesign. These files contain "li" elements that have "span" tags and "i" (italic) tags inside. I would like the "i" tags to come out as italics in the final XML for InDesign. I tried to match "i" tags with the following XSLT:

<xsl:template match="i" mode="process-text">
    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
        <Content>
            <xsl:copy-of select="text()"/>
        </Content>
    </CharacterStyleRange>
</xsl:template>

but without results.

For example, I have the following input:

<li class="MsoNormal" style="mso-list:l0 level2 lfo1;tab-stops:list 1.0in">Systolic dysfunction: an&#xa0;<i>inotropic</i>&#xa0;abnormality, due to myocardial infarction (MI) or dilated or ischemic cardiomyopathy (CM), resulting in diminished systolic emptying (ejection fraction &lt;45%).</li>

I would like to transform it to the following one:

<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BL2">
         <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
            <Content>Systolic dysfunction: an </Content>
            <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
                <Content>inotropic</Content>
            </CharacterStyleRange>
        <Content> abnormality, due to myocardial infarction (MI) or dilated or ischemic cardiomyopathy (CM), resulting in diminished systolic emptying (ejection fraction &lt;45%).</Content>
            <Br/>
         </CharacterStyleRange>
   </ParagraphStyleRange>

My initial problem is this: how can I split an "li" element with XSLT so that its own text and the "span" and "i" tags nested inside it are each processed separately? Thank you in advance for any help.

UPDATE: My main template, for "li" elements is:

<xsl:template match="li[not(descendant::p) and not(ancestor::section[@class='references' or @class='References'])]" mode="li-pass1">
    <xsl:variable name="depth" select="count(ancestor::li) + 1"/>

    <xsl:variable name="listType">
      <xsl:choose>
        <xsl:when test="parent::ol">
          <xsl:value-of select="'NL'"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="'BL'"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/{$listType}{if ($depth eq 1) then '' else $depth}">
      <xsl:choose>
        <xsl:when test="descendant::i/text()">
          <Content>
            <xsl:copy-of select="./text() | descendant::span/text()"/>
          </Content>
          <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
            <Content>
              <xsl:copy-of select="descendant::i/text()"/>
            </Content>
          </CharacterStyleRange>
        </xsl:when>
        <xsl:otherwise>
          <Content>
            <xsl:copy-of select="./text() | descendant::span/text()"/>
          </Content>
        </xsl:otherwise>
      </xsl:choose>
    </ParagraphStyleRange>
</xsl:template>

This template produces the wrong XML. I get the following result:

<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BL">
         <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
            <Content>Two potential pathophysiologic conditions lead to the clinical findings of HF, namely systolic and/or diastolic heart dysfunction. 
          </Content>
         </CharacterStyleRange>
         <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
            <Content>inotropiccompliance</Content>
         </CharacterStyleRange>
         <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]"/>
      </ParagraphStyleRange>
      <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BL2">
         <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
            <Content>Systolic dysfunction: an  abnormality, due to myocardial infarction (MI) or dilated or ischemic cardiomyopathy (CM), resulting in diminished systolic emptying (ejection fraction &lt;45%).</Content>
         </CharacterStyleRange>
         <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
            <Content>inotropic</Content>
         </CharacterStyleRange>
         <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]"/>
      </ParagraphStyleRange>

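As you can see, the italic text ends up in a separate tag after the paragraph text instead of at its original position, and in the first paragraph the two italic words are merged into "inotropiccompliance". Could you please suggest what I need to do?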


I would write one template per element type, mapping it to the corresponding result structure, and use <xsl:apply-templates/> inside each template to keep processing the child nodes. The basic approach for that sample would look like this:

<xsl:template match="li">
    <xsl:variable name="depth" select="count(ancestor::li) + 1"/>
    
    <xsl:variable name="listType">
      <xsl:choose>
        <xsl:when test="parent::ol">
          <xsl:value-of select="'NL'"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="'BL'"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    
    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/{$listType}{if ($depth eq 1) then '' else $depth}">
        <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
            <xsl:apply-templates/>
        </CharacterStyleRange>
    </ParagraphStyleRange>
</xsl:template>

<xsl:template match="i">
    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
        <Content>
            <xsl:apply-templates/>
        </Content>
    </CharacterStyleRange>
</xsl:template>

<xsl:template match="text()[normalize-space()]">
    <Content>
        <xsl:value-of select="."/>
    </Content>
</xsl:template>

https://xsltfiddle.liberty-development.net/93dFK9Q

That gives

<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BL">
   <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
      <Content>Systolic dysfunction: an </Content>
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
         <Content>
            <Content>inotropic</Content>
         </Content>
      </CharacterStyleRange>
      <Content> abnormality, due to myocardial infarction (MI) or dilated or ischemic cardiomyopathy (CM), resulting in diminished systolic emptying (ejection fraction &lt;45%).</Content>
   </CharacterStyleRange>
</ParagraphStyleRange>
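
The nested Content inside the italic range appears because the "i" template wraps its children in Content while the text() template adds a Content of its own. A minimal sketch of one way to avoid that, so the result is closer to the desired output in the question, is to let the text() template be the only place that creates Content. It assumes an XSLT 2.0 processor (for example Saxon) and list items without nested lists; the inline "if" for the list type is just a shorthand for the xsl:choose above, the trailing Br is copied from the desired output, and "span" needs no template of its own here because the built-in rule for elements simply processes its children.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: no modes, element and style names taken from the question -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No indentation, so no extra whitespace is added between the ranges -->
  <xsl:output method="xml" indent="no"/>

  <xsl:template match="li">
    <xsl:variable name="depth" select="count(ancestor::li) + 1"/>
    <xsl:variable name="listType" select="if (parent::ol) then 'NL' else 'BL'"/>
    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/{$listType}{if ($depth eq 1) then '' else $depth}">
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
        <!-- Children (text, span, i) are handled in document order -->
        <xsl:apply-templates/>
        <Br/>
      </CharacterStyleRange>
    </ParagraphStyleRange>
  </xsl:template>

  <!-- No Content wrapper here: the text() template below creates it -->
  <xsl:template match="i">
    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
      <xsl:apply-templates/>
    </CharacterStyleRange>
  </xsl:template>

  <!-- Every non-blank text node becomes exactly one Content element -->
  <xsl:template match="text()[normalize-space()]">
    <Content>
      <xsl:value-of select="."/>
    </Content>
  </xsl:template>

</xsl:stylesheet>
```

With this variant each text run, whether directly in the "li", inside a "span", or inside an "i", produces a single Content element at its original position.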

I might not have captured every detail of the output format you need, but I hope the sample shows that the key is to use apply-templates so that child nodes are processed by matching templates.
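
Since the templates in the question use modes, it may also help to note that a template declared with mode="process-text" is only considered when some xsl:apply-templates selects the node in that same mode. That could explain why the original "i" template had no effect: the "li" template uses xsl:copy-of and never applies templates to its children. The following is a rough sketch of how the two modes from the question could be connected. The entry template for "/" and the fixed BL paragraph style are simplifications I am assuming for illustration, not part of your stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: shows how li-pass1 can hand its children to process-text -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed entry point: start processing in the li-pass1 mode -->
  <xsl:template match="/">
    <xsl:apply-templates mode="li-pass1"/>
  </xsl:template>

  <!-- Drop text that is reached in li-pass1 mode, i.e. text outside any li -->
  <xsl:template match="text()" mode="li-pass1"/>

  <xsl:template match="li" mode="li-pass1">
    <!-- Paragraph style fixed to BL here to keep the sketch short -->
    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BL">
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]">
        <!-- Switch modes: children are now matched by process-text templates -->
        <xsl:apply-templates mode="process-text"/>
      </CharacterStyleRange>
    </ParagraphStyleRange>
  </xsl:template>

  <xsl:template match="i" mode="process-text">
    <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/Italic">
      <!-- Stay in the same mode for the content of the italic element -->
      <xsl:apply-templates mode="process-text"/>
    </CharacterStyleRange>
  </xsl:template>

  <xsl:template match="text()[normalize-space()]" mode="process-text">
    <Content>
      <xsl:value-of select="."/>
    </Content>
  </xsl:template>

</xsl:stylesheet>
```

Elements without a template in the current mode, such as "span", fall through to the built-in rule, which keeps processing their children in that same mode.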