Jim,
At 10:36 AM 12/18/2002, you wrote:
> Yes the example I gave was simplified. There can be other elements present
> that would exclude a simple text() answer.
> But it got me thinking. Is there a way of saying
> the intersection of
> all nodes following verse with id="BCV-GEN-1.1"
> AND
> all nodes preceding verseEnd with id="BCV-GEN-1.1-END"
> If so that would give me all the nodes contained within verse.
This can be done, although it's painful in XPath 1.0.
The problem is that you need to know not only which nodes follow the start
verse and precede the end verse; you also have to account for splitting at
arbitrary places....
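For the intersection itself, the usual XPath 1.0 idiom is to keep a node from one set only if adding it to the other set does not make that set any bigger. What follows is a minimal sketch, not a tested solution for your data: the element names and ids are taken from your question (verse and verseEnd with those id values), and the variable names $after and $before are my own.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed milestone names and ids, copied from the question. -->
  <xsl:variable name="after"
      select="//verse[@id='BCV-GEN-1.1']/following::node()"/>
  <xsl:variable name="before"
      select="//verseEnd[@id='BCV-GEN-1.1-END']/preceding::node()"/>

  <xsl:template match="/">
    <!-- A node of $after is also in $before when the union of
         that node with $before is no larger than $before.
         Only text nodes are output, so that text inside a wholly
         contained element is not written twice. -->
    <xsl:for-each
        select="$after[count(. | $before) = count($before)][self::text()]">
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note what this does and does not give you. The following and preceding axes leave out descendants and ancestors respectively, so an element that merely straddles a milestone (one that opens after the verse start but closes after the verse end) is not in the intersection, although the text nodes inside it that fall between the two milestones are. The count() test is also evaluated once per candidate node, so expect it to be slow on large documents.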
Here, for example, the first <verse/> is followed by an <s> element which
contains <seg> elements; the corresponding <endVerse/> is inside one of
those segs (i.e. two levels down inside a sibling of its <verse/> element).
<quote>
<verse/><s><seg>Of Man's first disobedience,</seg> <seg>and the
fruit<endVerse/>
<verse/>Of that forbidden tree whose mortal taste <endVerse/>
<verse/>Brought death into the World,</seg> <seg>and all our
woe,</seg><endVerse/>
<verse/><seg>With loss of Eden,</seg> <seg>till one greater Man <endVerse/>
<verse/>Restore us,</seg> <seg>and regain the blissful seat,</seg><endVerse/>
<verse/><seg>Sing,</seg> <seg>Heavenly Muse,</seg> <seg>that,</seg> <seg>on
the secret top <endVerse/>
<verse/>Of Oreb,</seg> <seg>or of Sinai,</seg> <seg>didst inspire <endVerse/>
<verse/>That Shepherd who first taught the chosen seed <endVerse/>
<verse/>In the beginning how the heavens and earth <endVerse/>
<verse/>Rose out of Chaos:</seg> <seg>or,</seg> <seg>if Sion hill <endVerse/>
<verse/>Delight thee more,</seg> <seg>and Siloa's brook that flowed <endVerse/>
<verse/>Fast by the oracle of God,</seg> <seg>I thence <endVerse/>
<verse/>Invoke thy aid to my adventurous song,</seg><seg><endVerse/>
<verse/>That with no middle flight intends to soar <endVerse/>
<verse/>Above th' Aonian mount,</seg> <seg>while it pursues <endVerse/>
<verse/>Things unattempted yet in prose or rhyme.</seg></s><endVerse/>
<verse/><s><seg>And chiefly thou,</seg> <seg>O Spirit,</seg> <seg>that dost
prefer <endVerse/>
<verse/>Before all temples th' upright heart and pure,</seg><seg><endVerse/>
<verse/>Instruct me,</seg> <seg>for Thou know'st;</seg> <seg>Thou from the
first <endVerse/>
<verse/>Wast present,</seg> <seg>and,</seg> <seg>with mighty wings
outspread, </seg><seg><endVerse/>
<verse/>Dove-like sat'st brooding on the vast Abyss, </seg><seg><endVerse/>
<verse/>And mad'st it pregnant:</seg> <seg>what in me is dark <endVerse/>
<verse/>Illumine,</seg> <seg>what is low raise and
support;</seg><seg><endVerse/>
<verse/>That,</seg> <seg>to the height of this great
argument,</seg><seg><endVerse/>
<verse/>I may assert Eternal Providence,</seg><seg><endVerse/>
<verse/>And justify the ways of God to men.</seg></s><endVerse/>
</quote>
Etc.
One approach is to go "bottom up". This is what Patrick Durusau and Matthew
O'Donnell, who (AFAIK) have done the most work in public with this problem,
call a "bottom-up virtual hierarchy" (BUVH). One pass flattens *everything*
into milestones; the second interpolates the hierarchy you want. (Actually
this is a simplification of what they did, though I don't see why it
wouldn't work.) This is doable, but quite hairy if you want to preserve any
of the original hierarchy, and so processor intensive that you don't want
to be trying it on large texts. More recently, their efforts have shifted to
an approach they call JITTs ("Just-in-Time Trees"), in which the verse
starts and ends are promoted from atomic milestones into real element
starts and ends. (A pre-XML-parse process then extracts the hierarchy you
want.) While charmingly enunciated, and (I believe) ultimately on the right
track, this approach suffers (IMHO) because it tries to repeal the First
Law of XML Markup: "Thou Shalt be Cleanly Nested", thereby risking
unnecessary Uncertainty and Doubt, if not actually Fear.
(I say ultimately on the right track partly because, once it is re-expressed
in a different syntax clearly distinguished from XML, this approach comes
close to LMNL, which is being designed for this sort of thing ... where
XML/XSLT were not.)
For references to Patrick and Matthew's work, type "JITTs" into your
favorite web search engine. It's ongoing.
Cheers,
Wendell
======================================================================
Wendell Piez mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list