Subject: RE: more on XSLT processor performance
From: "Paulo Gaspar" <paulo.gaspar@xxxxxxxxxxxx>
Date: Wed, 2 Aug 2000 13:21:29 +0200
> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxx]On Behalf Of Kay Michael
>
> Compiling won't solve the memory problem. If we're going to make XSLT
> processing of such large files practical, the only way we'll do it is by
> using persistent storage rather than memory for the tree.

I suggested the use of indexed persistent storage on Apache's Xalan-J-Dev mailing list. This is the relevant bit:

If you are talking about indexed XML, I believe so too. I have several ideas on indexing XML for XPath access, but the trouble is always knowing what to index.

For me, an interesting transform cycle concept is:

1. Analyze the XSLT source and figure out which (XPath) selections from a source document are necessary in order to get all the nodes required for the transformation;

2. Pre-parse the document, indexing only the parts found to be relevant in step 1. One should end up with index information much smaller than the full XML source, small enough to fit in memory;

3. Use an "XLocator" that knows how to use this index to perform the XSLT transformation.

Example of "parts found to be relevant": if you find that the XSLT only causes the selection of some elements from the XML source, then only the location of those elements should be indexed.

If you use this idea to transform an XML stream, you need to save that XML (or maybe only the relevant parts of it) to temporary disk storage and build the index information. Only then do you proceed to generate the output stream.

(This is for the most generic cases. I am not considering that some cases could be handled on the fly, as already mentioned on this list.)
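To make steps 2 and 3 a bit more concrete, here is a rough sketch in Python using the standard `xml.parsers.expat` module. It is only an illustration under several assumptions: the set `wanted_names` stands in for the result of step 1 (the XSLT analysis is not implemented here), the index holds nothing but the byte offset of each relevant start tag, and the names `build_index` and `read_text_at` are hypothetical. It also ignores DTD-declared entities and namespace context, which a real "XLocator" would have to handle.

```python
import xml.parsers.expat


def build_index(xml_bytes, wanted_names):
    """Step 2: one pass that records only the byte offset of each wanted start tag."""
    index = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        if name in wanted_names:
            # CurrentByteIndex is the offset of the '<' that opens this tag.
            index.append((name, parser.CurrentByteIndex))

    parser.StartElementHandler = on_start
    parser.Parse(xml_bytes, True)
    return index


class _ElementDone(Exception):
    """Raised from a handler to stop parsing once the element is closed."""


def read_text_at(xml_bytes, offset):
    """Step 3: re-parse only the element that starts at `offset`; return its text."""
    parser = xml.parsers.expat.ParserCreate()
    depth = 0
    chunks = []

    def on_start(name, attrs):
        nonlocal depth
        depth += 1

    def on_end(name):
        nonlocal depth
        depth -= 1
        if depth == 0:
            # The indexed element is complete; ignore the rest of the input.
            raise _ElementDone

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = chunks.append
    try:
        parser.Parse(xml_bytes[offset:], True)
    except _ElementDone:
        pass
    return "".join(chunks)


if __name__ == "__main__":
    doc = b"<root><skip>aaa</skip><item>one</item><skip>bbb</skip><item>two</item></root>"
    for name, offset in build_index(doc, {"item"}):
        print(name, offset, read_text_at(doc, offset))
```

The sketch keeps the whole document in a bytes object for brevity. With the temporary disk storage described above, one would presumably seek to the stored offset in the file and feed the parser in chunks instead, so that only the indexed parts are re-read.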
In cases where an XSLT gets a small amount of data from a very big XML file, this approach can be faster than trying to build a DOM:

- A full pass is always necessary, but after that you only re-read a small amount of data (thanks to the indexing);
- Even during the full pass, full parsing of the file can be avoided;
- Creating an index can require much less processing than creating a DOM;
- Since the index requires less memory, virtual memory use is avoided (less disk swapping).

I know my language is not formally correct, but...
...does this make sense?

Have fun,
Paulo Gaspar

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list