Subject: RE: [xsl] Sorting Upper-Case first. Microsoft bug?
From: "John Marshall" <John.Marshall@xxxxxxxxxxxxxx>
Date: Fri, 8 Aug 2003 08:56:07 +0100
Dr. Johnson and every lexicographer since have used case as the least significant, most rapidly varying element in ordering. The example I have in front of me from the Concise Oxford Dictionary lists

daily - Dalmatian - dalmatic

and I would not expect it to do anything else.

When Dennis Ritchie devised C before 1978, strcmp() would give a sort order that would place Dalmatian first (assuming ASCII), but in those days most of us were still using uppercase-only i/o devices and were not worried about such refinements. If we were, we used strcmpi().

The world has moved on, and the whole thrust of Unicode is to coerce the mechanical representation of text into natural linguistic usage, so Dr. Johnson wins. There will be all sorts of interesting issues that arise in considering the natural ordering of words from different linguistic groups, not borrowings like yacht and pyjama, but with equal cultural weight.

I suspect you are in a minority of one, and the unanimity of the XSLT processors suggests that the interpretation they have adopted is the correct one.

John Marshall
Accurate Software
80 Peach Street, Wokingham, Berkshire, RG40 1XH, UK.
Tel: +44 (0)118 977 3889
Fax: +44 (0)118 977 1260
http://www.accuratesoftware.com

-----Original Message-----
From: David Carlisle [mailto:davidc@xxxxxxxxx]
Sent: 07 August 2003 21:40
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: [xsl] Sorting Upper-Case first. Microsoft bug?

> I've been re-reading the W3C Recommendation and although I still think the
> definition for the case-order sorting given in it is misleading to say the
> least, I have to recognise that if you keep reading, the EXAMPLE given in it
> clearly states that given A,B,a and b the sorting (Upper-case first) would be
> A,a,B,b
> And that's probably why all the implementers followed this rule.
> According to the W3C Recommendation:

But implementers are not following that rule.

The rule given is that the case order affects the ordering of characters and that strings are ordered lexicographically based on that ordering. This is _not_ what the implementations are doing. Mike kindly gave the algorithm used in Saxon 6. Given the length of time it's been used, I fear it's too late to change it, but I can't see any reading of the W3C rec that could justify such an algorithm.

Well, actually there is one reading (as Mike pointed out): you could assume that "sorted lexicographically" was being used as a colloquial turn of phrase rather than specifying lexicographic ordering, but I don't really see any justification for that (and it certainly never occurred to me before this thread that systems would do that).

David

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
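The two readings argued over in this thread can be compared side by side with a small sketch. It is written in Python rather than XSLT purely for illustration, and the function names are hypothetical. It assumes the "upper-first" character order A < a < B < b from the Recommendation's example. The tie-break version follows the dictionary behaviour John describes; it is not a reproduction of the Saxon 6 algorithm, which is not quoted in this message.

```python
def char_key(ch):
    # Assumed upper-first character order: A < a < B < b < ...
    return (ch.lower(), 0 if ch.isupper() else 1)


def strict_lexicographic_key(s):
    # Reading 1: strings are compared character by character
    # using the case-aware character order above.
    return [char_key(ch) for ch in s]


def tie_break_key(s):
    # Reading 2: strings are compared ignoring case first;
    # case only decides between otherwise equal strings.
    return (s.lower(), [0 if ch.isupper() else 1 for ch in s])


words = ["dalmatic", "daily", "Dalmatian"]

strict = sorted(words, key=strict_lexicographic_key)
tie_break = sorted(words, key=tie_break_key)

print(strict)     # ['Dalmatian', 'daily', 'dalmatic']
print(tie_break)  # ['daily', 'Dalmatian', 'dalmatic']
```

Under these assumptions both keys sort the single letters A, B, a, b as A, a, B, b, so the Recommendation's example on its own does not tell the two readings apart; only multi-character strings such as the dictionary entries above do.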