unitDictionary additions made: should we do a supplemental release?

Scott Chapal scott.chapal at jonesctr.org
Wed Mar 26 11:51:01 PST 2003


It's good for everyone to see this exchange.  It stumped me at first,
too.

Not all editors are created equal.  If you work on the unit
dictionary, be sure your editor understands UTF-8 and doesn't
silently convert the file to another character set when it saves.
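
A minimal sketch of what goes wrong, in Python (the helper names here
are made up for illustration, not part of any EML tool).  It shows how
"m²" is stored under Latin-1 versus UTF-8, and why a file re-saved as
Latin-1 no longer decodes as UTF-8:

```python
from typing import List

# "m²": SUPERSCRIPT TWO is code point U+00B2.
unit = "m\u00b2"

# Latin-1 stores '²' as the single byte 0xB2 (high bit set).
latin1_bytes = unit.encode("latin-1")   # b'm\xb2'
# UTF-8 stores the same character as two bytes, 0xC2 0xB2.
utf8_bytes = unit.encode("utf-8")       # b'm\xc2\xb2'


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def non_ascii_chars(text: str) -> List[str]:
    """List the distinct non-ASCII characters in text, sorted by code point."""
    return sorted({ch for ch in text if ord(ch) > 127})


print(is_valid_utf8(utf8_bytes))    # the UTF-8 form is fine
print(is_valid_utf8(latin1_bytes))  # a lone 0xB2 is not valid UTF-8
print(non_ascii_chars("m\u00b2 \u00b0C"))
```

Running a check like is_valid_utf8 on the file's raw bytes before
committing would catch an editor that changed the encoding.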

-Scott

Matt Jones <jones at nceas.ucsb.edu> writes:
> Dan,
> 
> Actually, we made an explicit decision to use UTF-8 encoding for the
> eml-unitDictionary.xml file specifically so that we could use
> non-ASCII characters like superscripts, degree symbols, and others
> that are common symbols for units.  Any compliant XML parser should be
> able to deal fine with UTF-8, UTF-16, and other Unicode encodings,
> and I think Morpho needs to be adjusted to accept any character
> that is legal in a well-formed XML document.
> 
> 
> Matt
> 
> Dan Higgins wrote:
> > Scott,
> >    I was looking over your message and noted your use of "m²". It
> > is of interest that the superscript '²' is not an ASCII
> > character (i.e. the high bit of its 8-bit representation is set,
> > while ASCII uses only the lower 7 bits). This may not
> > be a problem in most cases, but we ran into a similar issue with the
> > special character for 'degrees' in Morpho, with some Unicode/Java
> > code not recognizing such high-order-bit characters. (I think the
> > main problem was the Xalan XSLT processor failing when such
> > special characters were in the document being transformed.)
> >
> >    It took us a good bit of effort to diagnose the problem, and we
> > are recommending that any special 8-bit characters be
> > avoided. So just a word of warning and a suggestion that maybe we
> > should avoid such characters in EML docs.
> >
> > Dan Higgins
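
To illustrate Matt's point that a compliant parser copes with UTF-8,
here is a small sketch using Python's standard xml.etree.ElementTree.
The XML fragment is hypothetical: the element and attribute names are
made up for the example and are not copied from
eml-unitDictionary.xml.  The second unit writes the same character as
the numeric reference &#178;, which keeps the file itself pure ASCII
for tools that cannot handle high-bit bytes.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; names are illustrative only.
DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<unitList>\n'
    '  <unit id="squareMeter" abbreviation="m\u00b2"/>\n'
    '  <unit id="squareMeterRef" abbreviation="m&#178;"/>\n'
    '</unitList>\n'
)


def abbreviations(xml_bytes):
    """Parse the document and return each unit's abbreviation attribute."""
    root = ET.fromstring(xml_bytes)
    return [u.get("abbreviation") for u in root.iter("unit")]


# Encode as UTF-8, matching the encoding named in the XML declaration.
utf8_bytes = DOC.encode("utf-8")

# Both spellings come back from the parser as the same string, "m²".
print(abbreviations(utf8_bytes))
```

After parsing, the literal character and the numeric reference are
indistinguishable to the application.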

-- 
\SEC



More information about the Eml-dev mailing list