[tcs-lc] Mispelling db response

Richard Pyle deepreef at bishopmuseum.org
Mon May 2 13:07:03 PDT 2005


> If you were querying a db and requested a wrongly spelled name
> (you don't know it is wrongly spelled you are just passing a
> word) would you expect to get back:
>
> 1) An object for the correctly spelled name - perhaps with the misspelling
> as a note.
> 2) An object for the incorrectly spelled name with a pointer to where to
> get the correctly spelled object.
> 3) Objects for both the correctly and incorrectly spelled names and a link
> between them.

Here's what I would like:

- The "Code-correct" name in bold, at the top.
- A list of orthographic variants, non-bold and indented, with the
particular one I searched for highlighted in some way
- Little "+" symbols next to each orthographic variant that would lead me to
a list of publications that spelled the name in each particular way.
- A separate list of competing subjective "statuses" of the bold name on the
top line (e.g., ITIS treats it as valid, Bishop Museum treats it as a
subjective synonym of a different name, etc.)
- Another list of "unlinked" concepts that nevertheless match the search
criteria.

The point is, I think alternate spellings should fundamentally "hub" around
an unambiguous "NameObject", which then is linked to by a series of concepts
that used any one of a number of orthographic variants of that name object.
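
As a rough illustration of the "hub" idea above, here is a minimal Python sketch. It is not part of TCS or LC: the class and field names (NameObject, code_correct, variants), the helper functions, and the placeholder names "Aus bus" / "Aus buss" are all assumptions made up for this example. It only shows how a query for any spelling could resolve to a single name object while still reporting which variant was matched.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NameObject:
    """Hypothetical hub record for one name (illustrative field names)."""
    code_correct: str  # the "Code-correct" spelling shown at the top
    # orthographic variant -> publications that used that spelling
    variants: Dict[str, List[str]] = field(default_factory=dict)


def build_index(names: List[NameObject]) -> Dict[str, NameObject]:
    """Map every known spelling (case-insensitive) to its hub object."""
    index: Dict[str, NameObject] = {}
    for name in names:
        index[name.code_correct.lower()] = name
        for spelling in name.variants:
            index[spelling.lower()] = name
    return index


def lookup(index: Dict[str, NameObject], query: str) -> Optional[dict]:
    """Return the hub name plus a note on which spelling was searched for."""
    name = index.get(query.lower())
    if name is None:
        return None  # would fall into the "unlinked" list in a real UI
    return {
        "name": name.code_correct,
        "variants": sorted(name.variants),
        "matched": query,
        "is_variant": query.lower() != name.code_correct.lower(),
    }


# Placeholder data only; these are not real names or publications.
hub = NameObject("Aus bus", {"Aus buss": ["Publication A"]})
index = build_index([hub])
result = lookup(index, "Aus buss")
```

Whether a given database answers with the hub, the variant, or both remains its own decision; the sketch only shows that all three responses can be derived from the same structure.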

> I think this is a matter for the db you are querying to decide not the
> schema.

Agreed!

> Some dbs will be capable of doing some kinds of response others capable of
> others.
> The schema is just the transport medium for this kind of thing.

Doubly agreed!

> Does this help in clarifying the debate on misspelling of names at all?
> i.e. which ones of them will be created as objects and which will be notes
> etc.

I'm not sure, because the debate is about the transfer schema -- not the way
that user interfaces will present the information.

In my experience, the best solution to a data model is the one that most
closely reflects the "reality" of information.  That statement, of course,
is of limited value, because one person's "reality" is another person's
rubbish.

But here's my broader perspective:

1) The Concept part of the TCS schema should be designed to maximally
accommodate the people who have to deal with datasets that involve
cross-mapping via defined concept circumscriptions. I think it currently
does this well.

2) The detailed "Name" part of the TCS schema should be designed to
maximally accommodate the needs of the people whose business it is to sort
out taxonomic names (Nomenclators, Code-Warriors, etc.).  This should be the
job of LC.

3) I am of growing confidence that these issues we've been debating will NOT
be sufficiently resolved in time for the next TDWG meeting in St.
Petersburg.

4) Because I think it would be a grave mistake to handicap the
implementation of v1.0 of TCS due to unresolved name issues (the solutions
to which we don't seem much closer now than we were some time ago), I
would propose we consider the following as a bailout plan:

a) Revert to TCS v0.95.0;

b) Leave the "NameDetailed" part of that version more or less as it is
(there are still some minor points of discussion and need for clarification,
but not of the magnitude of the "NameObjects" debate), and think of it as a
"placeholder" for a future, more robust "names as objects" schema (LC), to
be developed separately. In this paradigm, there would be no attempt to
create GUIDs for name-objects -- only concept-objects.  The existing
NameDetailed element structure would be viewed simply as a complex parsed
set of properties of concepts (i.e., the applied name properties) -- and a
place to put basic nomenclatural information attached to concept
definitions.

c) Allow the nomenclaturists to develop a stand-alone LC schema, optimized
for dealing with "Name Objects" as nomenclaturists define them.

d) Eventually, when LC is sorted out and runs through the TDWG standards
adoption process, v2.0 of TCS could add a "ref" attribute to the "Name"
element, allowing linking to separate Name Objects via appropriate UIDs --
and perhaps even dump the "NameDetailed" bit altogether.
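
To make point (d) concrete, the following Python sketch builds a "Name" element with and without a "ref" attribute. Only the element name "Name" and the attribute name "ref" come from the proposal above; the helper function, the placeholder name "Aus bus", and the identifier value are hypothetical, and nothing here reflects the actual TCS schema.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def name_element(applied_name: str, name_object_uid: Optional[str] = None) -> ET.Element:
    """Build a Name element; add ref only when a Name Object UID is known."""
    elem = ET.Element("Name")
    elem.text = applied_name
    if name_object_uid is not None:
        # Hypothetical link to a separately maintained Name Object.
        elem.set("ref", name_object_uid)
    return elem


# Without a link: the name is just a property of the concept.
plain = ET.tostring(name_element("Aus bus"), encoding="unicode")

# With a link: "hypothetical-uid-0001" is a made-up identifier.
linked = ET.tostring(
    name_element("Aus bus", "hypothetical-uid-0001"), encoding="unicode"
)
```

Because the attribute is optional in this sketch, documents written without name-object links would stay valid once linking becomes available.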

I'm not giving up all hope of a "merged" TCS/LC in time for the St.
Petersburg meeting -- but I do want to see established a "safety net" so as
not to impede the implementation of a "mostly harmless" version of TCS v1.0.

Aloha,
Rich

Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
http://hbs.bishopmuseum.org/pylerichard.html







