Re: [xsl] Sample of grouping and sorting a relation captured in XML

From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
Date: Fri, 5 Jan 2001 18:09:23 +0000

Hi Peter,

> (I'm afraid the actual XSL-rule got a bit unreadable because of the
> HTML-tags. I had some problems with that, but I did not want to go
> into that. However if you read past the HTML-tags, you'll see the
> way I did this. The HTML isn't great but it did display correct in
> Internet Explorer on Mac.)

One reason the XSL rule is so unreadable is the way that you're
creating the HTML.  You seem to be writing it all out as text rather
than building the HTML node tree and letting the XSLT processor
serialize it as HTML for you.

Looking at the code, I guess that the reason you've done it like this
is because you had problems building the table.  You're probably used
to procedural programming languages where, for example, you test
whether the current object is the first in a list in order to decide
whether to put a start tag, and test whether it's the last in the list
to decide where to put an end tag.

XSLT doesn't work like that: it's a declarative programming language
where you are building a node tree rather than outputting start and
end tags.  I've included below the stylesheet as I would have written
it.  You can see that it looks a lot tidier because it's being output
as HTML rather than as a text file.
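
As a minimal sketch of what I mean, assuming a made-up input of a
<list> element with <item> children (these names are invented for
illustration and aren't from your data):

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" />

<xsl:template match="list">
   <!-- the table element is created once, as a node; there is no
        need to test position() to decide where its tags go -->
   <table>
      <xsl:for-each select="item">
         <tr>
            <td><xsl:value-of select="." /></td>
         </tr>
      </xsl:for-each>
   </table>
</xsl:template>

</xsl:stylesheet>

The processor writes the start and end tags itself when it serializes
the tree, so they always balance.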

Probably the most interesting part for you is the grouping/sorting
bit.  Here's the loop that creates the rows in the formatted data
table:

  <xsl:for-each
       select="departure/duration/fee
                [count(. |
                       key('DepartureByMonthAndFee',
                           concat(substring(
                                    normalize-space(../../text()), 1, 6),
                                    '::', normalize-space()))[1]
                       ) = 1]">
     <!-- for the first fee in each distinct departure-month/fee group -->
     <xsl:sort select="../../text()" data-type="number" />
     <xsl:sort select="." data-type="number" />
     <tr>
        <td><xsl:apply-templates select="../.." mode="month" /></td>
        <td><xsl:apply-templates select="../.." mode="day" /></td>
        <td><xsl:apply-templates select="." /></td>
     </tr>
  </xsl:for-each>

The (fairly hideous) select expression identifies the unique month-fee
combinations using the Muenchian method.  If you want to learn more
about it, have a look at
http://www.jenitennison.com/xslt/grouping/muenchian.html.
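
In outline, and as a simplified sketch rather than your stylesheet
(the <items> and <item> elements and the 'category' attribute are
invented for illustration), the method looks like this:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" />

<!-- index every item by the value you want to group on -->
<xsl:key name="ItemsByCategory" match="item" use="@category" />

<xsl:template match="items">
   <!-- count(. | X[1]) = 1 is only true when the current node *is*
        the first node the key returns, so this picks one item per
        distinct category -->
   <xsl:for-each select="item[count(. | key('ItemsByCategory',
                                            @category)[1]) = 1]">
      <xsl:sort select="@category" />
      <xsl:value-of select="@category" />
      <xsl:text>: </xsl:text>
      <!-- the key then retrieves the whole group for that category -->
      <xsl:value-of select="count(key('ItemsByCategory', @category))" />
      <xsl:text>&#10;</xsl:text>
   </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

Your case is the same pattern; the only difference is that the
grouping value is built with concat() from the month and the fee.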

The days are output using the template:

<xsl:template match="departure" mode="day">
   <xsl:variable name="days"
                 select="key('DepartureByMonthAndFee',
                             concat(
                               substring(normalize-space(text()), 1, 6),
                               '::', normalize-space(duration/fee)))" />
   <xsl:choose>
      <xsl:when test="$days[2]">
         <xsl:for-each select="$days">
            <xsl:sort select="substring(normalize-space(../../text()), 7, 2)"
                      data-type="number" />
            <xsl:choose>
               <xsl:when test="position() = 1">
                  <xsl:apply-templates select="../.." mode="format-day" />
               </xsl:when>
               <xsl:when test="position() = last()">
                  <xsl:text>-</xsl:text>
                  <xsl:apply-templates select="../.." mode="format-day" />
               </xsl:when>
            </xsl:choose>
         </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
         <xsl:apply-templates select="." mode="format-day" />
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>

There might well be a nicer way to do it.

If you're in charge of the DTD (looks as if you are) then you might
consider changing it to make it easier for XSLT to handle.  It would
be a lot easier if the departures looked like:

<departure>
  <date>...</date>
  <duration>
    <days>...</days>
    <fee>...</fee>
  </duration>
</departure>

or:

<departure date="...">
  <duration fee="..." days="..." />
  <duration fee="..." days="..." />
  ...
</departure>

Or something similar that doesn't mix content. As you've seen with all
the select expressions that end with text() or node(), XSLT doesn't
handle getting data out of mixed content very well. If you separate
the data out into subelements, you can just use the value of those
nodes rather than going to their text() children. Also, with one of
the formats above, whitespace isn't as much of a problem, so you don't
have to normalize-space() everywhere.
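
For instance, with the first of those formats, and assuming the
<date> and <fee> elements hold just the value with no surrounding
whitespace, the key and the grouping loop could shrink to something
like this (a sketch only, untested against your data):

<xsl:key name="DepartureByMonthAndFee"
         match="fee"
         use="concat(substring(../../date, 1, 6), '::', .)" />

  <xsl:for-each
       select="departure/duration/fee
                [count(. |
                       key('DepartureByMonthAndFee',
                           concat(substring(../../date, 1, 6),
                                  '::', .))[1]
                       ) = 1]">
     <!-- date is an element now, so no text() or normalize-space() -->
     <xsl:sort select="../../date" data-type="number" />
     <xsl:sort select="." data-type="number" />
     <tr>
        <td><xsl:apply-templates select="../.." mode="month" /></td>
        <td><xsl:apply-templates select="../.." mode="day" /></td>
        <td><xsl:apply-templates select="." /></td>
     </tr>
  </xsl:for-each>

The 'month' and 'day' templates would need the same change, reading
from the date child rather than from text().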

I hope the stylesheet below at least gives you an insight into a
different way of solving the problem.  If you want to discuss any of
it, feel free to post any comments or questions to the list.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:m="http://www.jenitennison.com/months"
                exclude-result-prefixes="m">

<xsl:output method="html" indent="yes" />

<xsl:template match="journey">
   <html>
      <head>
         <title>Sample Table</title>
      </head>
      <body>
         <p>
            This example is from the travel business. It is the result of an XSL transformation that processes a property of a journey, i.e. a relation: 'Fee by departure'.
         </p>
         <p>
            The first data cell in the table shows the raw data. The second data cell shows the formatted data.
         </p>
         <p>
            The raw data is an (unsorted) relation with signature Departure x Duration -> Fee.
         </p>
         <p>
            In XML it looks like:
         </p>
         <pre>
&lt;journey&gt;
&lt;fees&gt;
&lt;departures&gt;
&lt;departure&gt;20010312
    &lt;duration&gt;14
      &lt;fee&gt;1598.00&lt;/fee&gt;

&lt;/duration&gt;
&lt;/departure&gt;
&lt;departure&gt;20001004

&lt;duration&gt;14
      &lt;fee&gt;2000.00&lt;/fee&gt;

&lt;/duration&gt;
&lt;/departure&gt;
&lt;departure&gt;20001018

&lt;duration&gt;14
      &lt;fee&gt;1000.00&lt;/fee&gt;

&lt;/duration&gt;
&lt;/departure&gt;
...
&lt;/departures&gt;
&lt;/fees&gt;
&lt;/journey&gt;
         </pre>
         <p>
            The formatted data groups departures in the same month with the same fee. The resulting groups are sorted by month (and year).
         </p>
         <p>
            N.b.: The duration of a journey does not play a role in this example, but it is in the relation for other purposes.
         </p>
         <table border="2">
            <tr>
               <td colspan='2' align='center'>Journey data: Fee by departure</td>
            </tr>
            <tr>
               <td valign="top">
                  <xsl:apply-templates select="fees/departures" mode="relation" />
               </td>
               <td valign='top'>
                  <xsl:apply-templates select="fees/departures" mode="table" />
               </td>
            </tr>
         </table>
      </body>
   </html>
</xsl:template>

<xsl:template match="departures" mode="relation">
   <table border='2' cellpadding='6'>
      <tr>
         <td colspan='3' align='center'>Raw data</td>
      </tr>
      <tr>
         <td align='center'>Departure</td>
         <td align='center'>Duration</td>
         <td align='center'>Fee</td>
      </tr>
      <xsl:for-each select="departure/duration/fee">
         <xsl:sort select="../../text()" data-type="number" />
         <!-- one row per departure/duration/fee triple -->
         <tr>
            <td><xsl:value-of select="../../text()" /></td>
            <td><xsl:value-of select="../text()" /></td>
            <td><xsl:value-of select="." /></td>
         </tr>
      </xsl:for-each>
   </table>
</xsl:template>

<xsl:key name='DepartureByMonthAndFee'
         match='fee'
         use='concat(substring(normalize-space(../../text()), 1, 6), "::", normalize-space())'/>

<xsl:template match="departures" mode="table">
   <table border='2' cellpadding='6'>
      <tr>
         <td colspan='3' align='center'>Formatted data</td>
      </tr>
      <tr>
         <td align='center'>Month</td>
         <td align='center'>Day</td>
         <td align='center'>Fee</td>
      </tr>
      <xsl:for-each select="departure/duration/fee[count(. | key('DepartureByMonthAndFee', concat(substring(normalize-space(../../text()), 1, 6), '::', normalize-space()))[1]) = 1]">
         <!-- for the first fee in each distinct departure-month/fee group -->
         <xsl:sort select="../../text()" data-type="number" />
         <xsl:sort select="." data-type="number" />
         <tr>
            <td><xsl:apply-templates select="../.." mode="month" /></td>
            <td><xsl:apply-templates select="../.." mode="day" /></td>
            <td><xsl:apply-templates select="." /></td>
         </tr>
      </xsl:for-each>
   </table>
</xsl:template>

<m:month>January</m:month>
<m:month>February</m:month>
<m:month>March</m:month>
<m:month>April</m:month>
<m:month>May</m:month>
<m:month>June</m:month>
<m:month>July</m:month>
<m:month>August</m:month>
<m:month>September</m:month>
<m:month>October</m:month>
<m:month>November</m:month>
<m:month>December</m:month>

<xsl:variable name="months" select="document('')/*/m:month" />

<xsl:template match="departure" mode="month">
   <xsl:value-of select="$months[number(substring(normalize-space(current()/text()), 5, 2))]" />
</xsl:template>

<xsl:template match="departure" mode="day">
   <xsl:variable name="days" select="key('DepartureByMonthAndFee', concat(substring(normalize-space(text()), 1, 6), '::', normalize-space(duration/fee)))" />
   <xsl:choose>
      <xsl:when test="$days[2]">
         <xsl:for-each select="$days">
            <xsl:sort select="substring(normalize-space(../../text()), 7, 2)" data-type="number" />
            <xsl:choose>
               <xsl:when test="position() = 1">
                  <xsl:apply-templates select="../.." mode="format-day" />
               </xsl:when>
               <xsl:when test="position() = last()">
                  <xsl:text />-<xsl:apply-templates select="../.." mode="format-day" />
               </xsl:when>
            </xsl:choose>
         </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
         <xsl:apply-templates select="." mode="format-day" />
      </xsl:otherwise>
   </xsl:choose>
</xsl:template>

<xsl:template match="departure" mode="format-day">
   <xsl:value-of select="format-number(substring(normalize-space(text()), 7, 2), '#0')" />
</xsl:template>

<xsl:template match="fee">
   <xsl:value-of select='round(normalize-space())'/>,--<xsl:text />
</xsl:template>

</xsl:stylesheet>



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

