Subject: RE: [xsl] need help on stylesheet efficiency
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Thu, 25 Jul 2002 19:21:10 +0100
Ouch. The title says it all.

    <xsl:for-each select="/document/record/Flags[not(preceding::Flags=.)]">
    <xsl:variable name="Flags_1" select="."/>
    <xsl:if test="/document[record[Flags=$Flags_1]]">
    <xsl:for-each select="/document/record/SPM_RegioId[not(preceding::SPM_RegioId=.)]">
    <xsl:variable name="SPM_RegioId_2" select="."/>
    <xsl:if test="/document[record[Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2]]">
    <xsl:for-each select="/document/record/SPM_DeviceId[not(preceding::SPM_DeviceId=.)]">

and lots more of the same.

Your first for-each is processing all the Flags elements that differ from every preceding Flags element, that is, the first occurrence of each distinct value. The first improvement you can make is to change it to:

    select="/document/record[not(Flags = preceding-sibling::record/Flags)]/Flags"

preceding-sibling involves a much shorter search than preceding. But you can do much better than this using keys. Look up Muenchian grouping to see how you can select the distinct Flags values with the help of a key.

Now look at the xsl:if. This is saying "if the document contains a record whose Flags value is equal to this one". Well of course it does, but the poor old XSLT processor is having to do a lot of work to prove it.

Now look at the second for-each. Remember that you are executing this once for every distinct Flags value. This is saying "for every distinct SPM_RegioId in the document..." As before, you can find these much more efficiently using a key. But more to the point, you don't need to find them afresh for each distinct Flags value, because you'll get the same answer each time.

Basically, you don't need nested for-each constructs at all, because the select expression in each one is absolute rather than relative.

And so it goes on, to 15 levels of nesting. Since each level of for-each costs O(n^2) with respect to the size of the document, and the xsl:if multiplies that by another O(n), I think the final complexity is O(n^45), which I think must be some kind of record.
It means that if you double the size of the source document, processing will take about 35 million million (2^45) times as long. If Xalan finished after 3 minutes, it was doing remarkably well.

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
> Malia Zaheer
> Sent: 25 July 2002 17:53
> To: XSL-List@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] need help on stylesheet efficiency
>
> Hi,
>
> I have a stylesheet that I use to process large xml files
> that are larger than 1MB. Using Xalan, it takes 3 minutes
> and 40 seconds to transform only 75KB xml. I was wondering
> if people on this list can help me with improving the
> efficiency of my stylesheet. Here it is:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" exclude-result-prefixes="java" version="1.0" xmlns:java="http://xml.apache.org/xslt/java">
> <xsl:output indent="yes" method="xml"/>
> <xsl:template match="/"><xsl:element name="document"><xsl:call-template name="template_1"/></xsl:element></xsl:template>
>
> <xsl:template name="template_1">
> <xsl:for-each select="/document/record/Flags[not(preceding::Flags=.)]">
> <xsl:variable name="Flags_1" select="."/>
> <xsl:if test="/document[record[Flags=$Flags_1]]">
> <xsl:for-each select="/document/record/SPM_RegioId[not(preceding::SPM_RegioId=.)]">
> <xsl:variable name="SPM_RegioId_2" select="."/>
> <xsl:if test="/document[record[Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each select="/document/record/SPM_DeviceId[not(preceding::SPM_DeviceId=.)]">
> <xsl:variable name="SPM_DeviceId_3" select="."/>
> <xsl:if test="/document[record[SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each select="/document/record/SUB_Instance[not(preceding::SUB_Instance=.)]">
> <xsl:variable name="SUB_Instance_4" select="."/>
> <xsl:if test="/document[record[SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1][SUB_Instance=$SUB_Instance_4][SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each select="/document/record/SPM_SubId[not(preceding::SPM_SubId=.)]">
> <xsl:variable name="SPM_SubId_5" select="."/>
> <xsl:if test="/document[record[SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1][SUB_Instance=$SUB_Instance_4][SPM_SubId=$SPM_SubId_5][SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each select="/document/record/SPM_IspId[not(preceding::SPM_IspId=.)]">
> <xsl:variable name="SPM_IspId_6" select="."/>
> <xsl:if test="/document[record[SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1][SUB_Instance=$SUB_Instance_4][SPM_SubId=$SPM_SubId_5][SPM_RegioId=$SPM_RegioId_2][SPM_IspId=$SPM_IspId_6]]">
> <xsl:for-each select="/document/record/TimeStamp[not(preceding::TimeStamp=.)]">
> <xsl:variable name="TimeStamp_7" select="."/>
> <xsl:if test="/document[record[SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1][SUB_Instance=$SUB_Instance_4][SPM_SubId=$SPM_SubId_5][SPM_RegioId=$SPM_RegioId_2][TimeStamp=$TimeStamp_7][SPM_IspId=$SPM_IspId_6]]">
> <xsl:for-each select="/document/record/SPM_TRUNKID[not(preceding::SPM_TRUNKID=.)]">
> <xsl:variable name="SPM_TRUNKID_8" select="."/>
> <xsl:if test="/document[record[SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_SubId=$SPM_SubId_5][SPM_RegioId=$SPM_RegioId_2][TimeStamp=$TimeStamp_7][SPM_IspId=$SPM_IspId_6]]">
> <xsl:for-each select="/document/record/IFI_IPACKETS[not(preceding::IFI_IPACKETS=.)]">
> <xsl:variable name="IFI_IPACKETS_9" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]]">
> <xsl:for-each select="/document/record/IFI_OPACKETS[not(preceding::IFI_OPACKETS=.)]">
> <xsl:variable name="IFI_OPACKETS_10" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]]">
> <xsl:for-each select="/document/record/IFI_IBYTES[not(preceding::IFI_IBYTES=.)]">
> <xsl:variable name="IFI_IBYTES_11" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]]">
> <xsl:for-each select="/document/record/IFI_OBYTES[not(preceding::IFI_OBYTES=.)]">
> <xsl:variable name="IFI_OBYTES_12" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]]">
> <xsl:for-each select="/document/record/IFI_IQDROPS[not(preceding::IFI_IQDROPS=.)]">
> <xsl:variable name="IFI_IQDROPS_13" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][IFI_IQDROPS=$IFI_IQDROPS_13][SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]]">
> <xsl:for-each select="/document/record/IFI_OQDROPS[not(preceding::IFI_OQDROPS=.)]">
> <xsl:variable name="IFI_OQDROPS_14" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][IFI_IQDROPS=$IFI_IQDROPS_13][SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7][IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:for-each select="/document/record/PKTS_DROP_ERR[not(preceding::PKTS_DROP_ERR=.)]">
> <xsl:variable name="PKTS_DROP_ERR_15" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][IFI_IQDROPS=$IFI_IQDROPS_13][SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11][PKTS_DROP_ERR=$PKTS_DROP_ERR_15][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7][IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:for-each select="/document/record/MULTICAST_IN_PKTS[not(preceding::MULTICAST_IN_PKTS=.)]">
> <xsl:variable name="MULTICAST_IN_PKTS_16" select="."/>
> <xsl:if test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][MULTICAST_IN_PKTS=$MULTICAST_IN_PKTS_16][IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][IFI_IQDROPS=$IFI_IQDROPS_13][SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11][PKTS_DROP_ERR=$PKTS_DROP_ERR_15][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7][IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:for-each select="/document/record/MULTICAST_OUT_PKTS[not(preceding::MULTICAST_OUT_PKTS=.)]">
> <xsl:variable name="MULTICAST_OUT_PKTS_17" select="."/>
> <xsl:choose>
> <xsl:when test="/document[record[SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9][MULTICAST_IN_PKTS=$MULTICAST_IN_PKTS_16][IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1][MULTICAST_OUT_PKTS=$MULTICAST_OUT_PKTS_17][IFI_IQDROPS=$IFI_IQDROPS_13][SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11][PKTS_DROP_ERR=$PKTS_DROP_ERR_15][SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7][IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:element name="record">
> <xsl:element name="Flags"><xsl:value-of select="$Flags_1"/></xsl:element>
> <xsl:element name="SPM_RegioId"><xsl:value-of select="$SPM_RegioId_2"/></xsl:element>
> <xsl:element name="SPM_DeviceId"><xsl:value-of select="$SPM_DeviceId_3"/></xsl:element>
> <xsl:element name="SUB_Instance"><xsl:value-of select="$SUB_Instance_4"/></xsl:element>
> <xsl:element name="SPM_SubId"><xsl:value-of select="$SPM_SubId_5"/></xsl:element>
> <xsl:element name="SPM_IspId"><xsl:value-of select="$SPM_IspId_6"/></xsl:element>
> <xsl:element name="TimeStamp"><xsl:value-of select="$TimeStamp_7"/></xsl:element>
> <xsl:element name="SPM_TRUNKID"><xsl:value-of select="$SPM_TRUNKID_8"/></xsl:element>
> <xsl:element name="IFI_IPACKETS"><xsl:value-of select="$IFI_IPACKETS_9"/></xsl:element>
> <xsl:element name="IFI_OPACKETS"><xsl:value-of select="$IFI_OPACKETS_10"/></xsl:element>
> <xsl:element name="IFI_IBYTES"><xsl:value-of select="$IFI_IBYTES_11"/></xsl:element>
> <xsl:element name="IFI_OBYTES"><xsl:value-of select="$IFI_OBYTES_12"/></xsl:element>
> <xsl:element name="IFI_IQDROPS"><xsl:value-of select="$IFI_IQDROPS_13"/></xsl:element>
> <xsl:element name="IFI_OQDROPS"><xsl:value-of select="$IFI_OQDROPS_14"/></xsl:element>
> <xsl:element name="PKTS_DROP_ERR"><xsl:value-of select="$PKTS_DROP_ERR_15"/></xsl:element>
> <xsl:element name="MULTICAST_IN_PKTS"><xsl:value-of select="$MULTICAST_IN_PKTS_16"/></xsl:element>
> <xsl:element name="MULTICAST_OUT_PKTS"><xsl:value-of select="$MULTICAST_OUT_PKTS_17"/></xsl:element>
> </xsl:element>
> </xsl:when>
> </xsl:choose>
> </xsl:for-each>
> </xsl:if>
> </xsl:for-each>
> </xsl:if></xsl:for-each></xsl:if></xsl:for-each></xsl:if>
> </xsl:for-each></xsl:if></xsl:for-each></xsl:if></xsl:for-each></xsl:if>
> </xsl:for-each></xsl:if></xsl:for-each></xsl:if></xsl:for-each></xsl:if>
> </xsl:for-each></xsl:if></xsl:for-each></xsl:if></xsl:for-each></xsl:if>
> </xsl:for-each>
> </xsl:if></xsl:for-each></xsl:if></xsl:for-each></xsl:if></xsl:for-each>
> </xsl:template></xsl:stylesheet>
>
> And sample data record is:
>
> <record><Flags>0</Flags>
> <SPM_RegioId>1</SPM_RegioId>
> <SPM_DeviceId>1</SPM_DeviceId>
> <SUB_Instance>-1</SUB_Instance>
> <SPM_SubId>-1</SPM_SubId>
> <SPM_IspId>6</SPM_IspId>
> <TimeStamp>Thu Jul 04 17:40:30 EDT 2002</TimeStamp>
> <SPM_TRUNKID>45</SPM_TRUNKID>
> <IFI_IPACKETS>113</IFI_IPACKETS>
> <IFI_OPACKETS>219</IFI_OPACKETS>
> <IFI_IBYTES>7002</IFI_IBYTES>
> <IFI_OBYTES>13038</IFI_OBYTES>
> <IFI_IQDROPS>0</IFI_IQDROPS>
> <IFI_OQDROPS>0</IFI_OQDROPS>
> <PKTS_DROP_ERR>0</PKTS_DROP_ERR>
> <MULTICAST_IN_PKTS>6760</MULTICAST_IN_PKTS>
> <MULTICAST_OUT_PKTS>0</MULTICAST_OUT_PKTS>
> </record>
>
> I know that the stylesheet is not efficient. That is because I
> am generating it programmatically, not by hand, so that I can
> customize it to each type of input. Any help on making it
> more efficient would be greatly appreciated. What can I use
> instead of the preceding:: axis?
>
> Thank you so much!
> Malia
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
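
As a rough illustration of the key-based approach recommended in the reply (Muenchian grouping), here is a minimal XSLT 1.0 sketch. It assumes the input has the /document/record shape of the sample record, with one value per field in each record. The key names (record-by-flags, record-by-fields), the wrapper elements (summary, distinct-flags, distinct-records) and the '|' separator are hypothetical choices made for this example; none of them come from the original stylesheet. The composite key lists only three of the 17 fields to keep the example short, on the assumption that the nested loops are ultimately meant to emit each distinct combination of field values once.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Index every record by its Flags value (key name is hypothetical). -->
  <xsl:key name="record-by-flags" match="record" use="Flags"/>

  <!-- Index every record by a composite of several fields. Only three
       fields are shown; a full version would concatenate all 17. The '|'
       separator is an assumption and must not occur in the data. -->
  <xsl:key name="record-by-fields" match="record"
      use="concat(Flags, '|', SPM_RegioId, '|', SPM_DeviceId)"/>

  <xsl:template match="/">
    <summary>
      <!-- Distinct Flags values: keep a record only if it is the first
           node returned by the key for its own Flags value. -->
      <distinct-flags>
        <xsl:for-each select="/document/record[generate-id() =
            generate-id(key('record-by-flags', Flags)[1])]">
          <Flags><xsl:value-of select="Flags"/></Flags>
        </xsl:for-each>
      </distinct-flags>

      <!-- Distinct records: the same test against the composite key. -->
      <distinct-records>
        <xsl:for-each select="/document/record[generate-id() =
            generate-id(key('record-by-fields',
              concat(Flags, '|', SPM_RegioId, '|', SPM_DeviceId))[1])]">
          <xsl:copy-of select="."/>
        </xsl:for-each>
      </distinct-records>
    </summary>
  </xsl:template>

</xsl:stylesheet>
```

Each for-each makes a single pass over the records and performs one key lookup per record, so nothing is nested and the preceding:: axis is not used at all. How cheaply a key lookup is evaluated depends on the processor, so the actual gain should be measured rather than assumed.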