Grouping a list of items by a space-separated list of tags, with a catch-all group for the rest


The problem has two facets:

  1. How to categorize items by specific values within the space-separated contents of an element
  2. How to categorize the items that contain none of those values.

As an example, take the following data:

<messages>
  <m> 
    <subject>message tagged with A B C</subject>
    <tags>A B C</tags>
  </m>

  <m> 
    <subject>message tagged with B C D</subject>
    <tags>B C D</tags>
  </m>

  <m> 
    <subject>message tagged with X Y A</subject>
    <tags>X Y A</tags>
  </m>

  <m> 
    <subject>message tagged with C X</subject>
    <tags>C X</tags>
  </m>

  <m>
    <subject>message tagged with Y</subject>
    <tags>Y</tags>
  </m>

</messages>

Given a known set of tags, say

<xsl:param name="pKnownTags">
  <t>A</t>
  <t>B</t>
</xsl:param>

I want to generate an output that would look like:

Messages tagged with A:
* message tagged with A B C
* message tagged with X Y A

Messages tagged with B:
* message tagged with A B C
* message tagged with B C D

Messages tagged with neither:
* message tagged with C X
* message tagged with Y 

Using EXSLT is fine, but otherwise I need an XSLT 1.0 solution. Is this possible?


There are 2 best solutions below

Best answer

This doesn't require anything too fancy. Please give the below a try:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common" exclude-result-prefixes="exsl"
>
  <xsl:output method="text"/>

  <xsl:param name="pKnownTags">
    <t>A</t>
    <t>B</t>
  </xsl:param>
  <xsl:variable name="pKnownTagsNodeSet" select="exsl:node-set($pKnownTags)/t" />

  <xsl:template match="/messages">
    <xsl:apply-templates select="$pKnownTagsNodeSet">
      <xsl:with-param name="docEl" select="." />
    </xsl:apply-templates>

    <xsl:text>Messages tagged with none of the above:&#xA;</xsl:text>
    <xsl:apply-templates select="m" mode="checkAbsence" />
  </xsl:template>

  <xsl:template match="t">
    <xsl:param name="docEl" select="/.." />

    <xsl:value-of select="concat('Messages tagged with ', ., ':&#xA;')"/>
    <xsl:apply-templates select="$docEl/m[contains(concat(' ', tags, ' '),
                                                   concat(' ', current(), ' '))]" />
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <xsl:template match="m" mode="checkAbsence">
    <xsl:variable name="currentTagsPadded" select="concat(' ', tags, ' ')" />
    <xsl:apply-templates
          select="(.)[not($pKnownTagsNodeSet[contains($currentTagsPadded,
                                                      concat(' ', ., ' '))]
                         )
                     ]" />
  </xsl:template>

  <xsl:template match="m">
    <xsl:value-of select="concat('* ', subject, '&#xA;')"/>
  </xsl:template>

</xsl:stylesheet>

When run on your sample input, this produces:

Messages tagged with A:
* message tagged with A B C
* message tagged with X Y A

Messages tagged with B:
* message tagged with A B C
* message tagged with B C D

Messages tagged with none of the above:
* message tagged with C X
* message tagged with Y
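
The matching above works because both the tag list and the tag being looked for are padded with a space on each side, so a tag only matches as a whole token (without the padding, a search for A would also hit a message tagged AB). One thing the sample input doesn't exercise is irregular whitespace inside <tags>, such as line breaks or doubled spaces. As a sketch only, and assuming you want to tolerate that, the t template could wrap the tag list in normalize-space(); the variable name needle is just my own choice for illustration:

```xml
<!-- Sketch: a drop-in variant of the "t" template from the stylesheet above.
     normalize-space() trims the tag list and collapses tabs, newlines and
     repeated spaces into single spaces before the padded comparison. -->
<xsl:template match="t">
  <xsl:param name="docEl" select="/.." />
  <!-- the known tag, padded so that it can only match a whole token -->
  <xsl:variable name="needle" select="concat(' ', ., ' ')" />

  <xsl:value-of select="concat('Messages tagged with ', ., ':&#xA;')"/>
  <xsl:apply-templates
      select="$docEl/m[contains(concat(' ', normalize-space(tags), ' '), $needle)]" />
  <xsl:text>&#xA;</xsl:text>
</xsl:template>
```

If you go this way, the currentTagsPadded variable in the checkAbsence template would need the same normalize-space() call so that both tests agree.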

Using EXSLT is fine

Well, if your processor supports the EXSLT str:tokenize() function, then this could be quite simple:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
xmlns:str="http://exslt.org/strings"
extension-element-prefixes="exsl str">
<xsl:output method="text" encoding="UTF-8"/>

<xsl:param name="pKnownTags">
    <t>A</t>
    <t>B</t>
</xsl:param>

<xsl:variable name="tags-set" select="exsl:node-set($pKnownTags)/t" />
<xsl:variable name="xml" select="/" />

<xsl:key name="message-by-tags" match="m" use="str:tokenize(tags)" />

<xsl:template match="/">
    <!-- matching messages -->
    <xsl:for-each select="$tags-set">
        <xsl:variable name="tag" select="." />
        <xsl:value-of select="concat('Messages tagged with ', $tag, ':&#10;')"/>
        <!-- switch context to source document in order to use key -->
        <xsl:for-each select="$xml">
            <xsl:for-each select="key('message-by-tags', $tag)">
                <xsl:value-of select="concat('* ', subject, '&#10;')"/>
            </xsl:for-each>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- non-matching messages -->
    <xsl:text>Messages tagged with none:&#10;</xsl:text>
    <xsl:for-each select="messages/m[not(str:tokenize(tags)=$tags-set)]">
        <xsl:value-of select="concat('* ', subject)"/>
        <xsl:if test="position()!=last()">
            <xsl:text>&#10;</xsl:text>
        </xsl:if>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
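
If your processor does not support str:tokenize(), a recursive named template can produce a similar list of tokens in plain XSLT 1.0. The following is only a rough sketch: the template name tokenize and its text parameter are my own, and the token element name is chosen to mirror what str:tokenize() returns.

```xml
<!-- Hypothetical helper: emits one <token> element per whitespace-separated
     word of $text. Not part of EXSLT; the names are illustrative only. -->
<xsl:template name="tokenize">
  <xsl:param name="text" />
  <!-- normalise first, so that words are separated by exactly one space -->
  <xsl:variable name="norm" select="normalize-space($text)" />
  <xsl:if test="$norm">
    <token>
      <!-- the appended space guarantees substring-before() finds a delimiter -->
      <xsl:value-of select="substring-before(concat($norm, ' '), ' ')" />
    </token>
    <!-- recurse on whatever follows the first space (empty for the last word) -->
    <xsl:call-template name="tokenize">
      <xsl:with-param name="text" select="substring-after($norm, ' ')" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>
```

Note that the result of calling this template is a result tree fragment, so it still has to go through exsl:node-set() before you can select the token elements, and a named template cannot be called from the use attribute of xsl:key. In other words, this does not replace the keyed approach above; without str:tokenize() you would have to fall back to a padded contains() test for the grouping itself.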