[tcs-lc] Modularisation of standards - identification of names
Richard Pyle
deepreef at bishopmuseum.org
Tue Mar 8 03:24:48 PST 2005
> One use case that springs to mind is the separation of homonyms,
> particularly where it comes to homonym genera.
That *should* be discernible as long as authorships are included, but I
wonder how often authorships for genera will be provided? Even species-level
authorships are not consistently provided. I think that relying on a simple
string comparison is too weak.
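
To illustrate the point, here is a minimal sketch of a comparison that refuses to guess when authorship is missing. The NameRecord type, the same_name function and the "Aus" / "White, 1970" values are placeholders invented for the example, not part of TCS or the Linnean Core, and a real comparison would also have to cope with abbreviated or variant authorship strings.

```python
from typing import NamedTuple, Optional


class NameRecord(NamedTuple):
    """Hypothetical record: a name label plus an optional authorship string."""
    label: str
    authorship: Optional[str] = None


def same_name(a: NameRecord, b: NameRecord) -> Optional[bool]:
    """Return True or False when the records decide it, None when they cannot.

    Assumption: authorship strings are compared exactly, which is itself
    too strict for real data (abbreviations, punctuation, missing years).
    """
    if a.label != b.label:
        return False
    if a.authorship is None or b.authorship is None:
        # Same string, but no authorship on one side: possible homonym.
        return None
    return a.authorship == b.authorship


# Placeholder genus names, in the spirit of the "Aus bus" example.
with_author = NameRecord("Aus", "Black, 1965")
other_author = NameRecord("Aus", "White, 1970")
no_author = NameRecord("Aus")

print(same_name(with_author, other_author))  # False: homonyms
print(same_name(with_author, no_author))     # None: cannot tell
```

A bare `a.label == b.label` test would have reported both pairs as the same name.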
> In the canonical names part of the Linnean Core we included (I'm
> not sure if it's disappeared in the latest version, but I don't think so)
> scope for a reference attribute in the separate atoms of the names.
I believe these are important, and would be included among the "[...and all
the other LC bits]" in the email I just sent.
> There's a third way (sorry to introduce a note of domestic UK
> politics, but Rich and Nico started it) which is to take the LC
> approach and embed both identifiers and data:
>
> <TaxonConcepts>
> <TaxonConcept id="tc1">
> <Name id="123-1">
> <Label>Aus bus</Label>
> <CanonicalAuthorship>Black, 1965</CanonicalAuthorship>
> </Name>
> <AccordingTo>Smith</AccordingTo>
> </TaxonConcept>
> <TaxonConcept id="tc2">
> <Name id="123-1">
> <Label>Aus bus</Label>
> <CanonicalAuthorship>Black, 1965</CanonicalAuthorship>
> </Name>
> <AccordingTo>Jones</AccordingTo>
> </TaxonConcept>
> </TaxonConcepts>
Where would the "123-1" values point to? Somewhere else internal to the
DataSet package, or an external GUID reference?
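
Whichever it turns out to be, a consumer can at least use the shared id to see that both concepts cite the same name. A rough sketch using Python's standard xml.etree.ElementTree, assuming the fragment quoted above is the whole document; the function name concepts_by_name_id is made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The fragment quoted above, unchanged.
DOC = """
<TaxonConcepts>
  <TaxonConcept id="tc1">
    <Name id="123-1">
      <Label>Aus bus</Label>
      <CanonicalAuthorship>Black, 1965</CanonicalAuthorship>
    </Name>
    <AccordingTo>Smith</AccordingTo>
  </TaxonConcept>
  <TaxonConcept id="tc2">
    <Name id="123-1">
      <Label>Aus bus</Label>
      <CanonicalAuthorship>Black, 1965</CanonicalAuthorship>
    </Name>
    <AccordingTo>Jones</AccordingTo>
  </TaxonConcept>
</TaxonConcepts>
"""


def concepts_by_name_id(xml_text):
    """Map each Name id to the (concept id, AccordingTo) pairs that use it."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for concept in root.findall("TaxonConcept"):
        name = concept.find("Name")
        if name is None:
            continue  # concept without an embedded Name element
        groups[name.get("id")].append(
            (concept.get("id"), concept.findtext("AccordingTo"))
        )
    return dict(groups)


grouped = concepts_by_name_id(DOC)
print(grouped)
```

Nothing here says whether "123-1" is local to the document or a GUID; the grouping works either way, which is part of why the question matters.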
> Of course, to a computer the inclusion of the second set of
> information within the name is redundant but we shouldn't
> underestimate the amount of human eyeballing of XML data goes
> on.
More significantly, there needs to be a place for human-readable versions of
the information when the id link is not provided (part of the function of my
NameVerbatim element).
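
A sketch of that fallback, under stated assumptions: the registry dictionary stands in for whatever the id eventually resolves against, and the embedded Label plays the role of the human-readable text. The function display_label and the registry contents are hypothetical, and this is not the actual NameVerbatim structure.

```python
import xml.etree.ElementTree as ET


def display_label(name_elem, registry):
    """Prefer the resolved id; fall back to the embedded human-readable text."""
    name_id = name_elem.get("id")
    if name_id is not None and name_id in registry:
        return registry[name_id]
    # No id, or an id we cannot resolve: use what the provider wrote.
    verbatim = name_elem.findtext("Label")
    return verbatim.strip() if verbatim else None


# Hypothetical registry keyed by name id.
registry = {"123-1": "Aus bus Black, 1965"}

linked = ET.fromstring('<Name id="123-1"><Label>Aus bus</Label></Name>')
unlinked = ET.fromstring("<Name><Label>Aus bus</Label></Name>")

print(display_label(linked, registry))    # resolved through the id
print(display_label(unlinked, registry))  # embedded text only
```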
> Also if the name id has an existence outside the transient life
> of the xml document instance (for instance, if it were an id from a
> nomenclator) then the processing power involved in producing that
> sort of document on demand as part of a web service would be
> reduced.
Agreed -- that's why I favor inclusion of the human-readable data within the
package, but normalized somewhat by an internal reference (with the option of
expanding to an external reference identifier, when Name Registration
eventually comes online).
> Possibly we've been missing a trick in how we implement these
> things, but trying to create a document on the fly using templates
> where we were keeping track of a list of publications, a list of
> vouchers and now a list of names, and put ad hoc ids (1, 2, 3 ...)
> into the main schema referring to the separate lists at the bottom
> of the document was the one thing that made implementing TCS a
> bit of a challenge for IPNI
Where on the "difficulty scale" would the implementation of the approach I
just sent fall?
> One thing about XML that I've found, if you try and approach it with
> an OO programmer hat on and make it enforce business rules,
> then you very quickly get frustrated, or end up with very
> complicated schemas.
So I am learning!!! (Special thanks to Bob Morris for opening my eyes on
this!)
Aloha,
Rich