[Fwd: Re: [seek-kr-sms] ontology/folder UI design in Kepler]
Deana D. Pennington
dpennington at lternet.edu
Tue Mar 1 14:14:54 PST 2005
Gee, I thought Mark and you (as KR and SMS leads) were responsible for this :-)
We'll see what the project manager says... My sense of it is that no one
really has a good feel for what, explicitly, is needed. I am hoping that as we
look at actual datasets in Davis, we can start making a list of the things that
are needed, and maybe sketch out some frameworks. Like I said, I can recruit
people to help fill in the details once I know what is needed.
I don't think there is anything necessarily to "be done about" the concepts I've
worked on. I think Mark and Shawn have seen most things that I've done. They
were all done in cmap; they are too general to be appropriate for linking data
to, except perhaps at a high discovery level. I'll bring everything to Davis,
so we'll have it if it becomes useful in some way.
Deana
Quoting Bertram Ludaescher <ludaesch at ucdavis.edu>:
> Deana:
>
> This is a good idea, i.e., put some time aside at the Davis meeting
> to do some (initial) ontology creation.
>
> However, one thing confuses me: who is (on the domain side!)
> providing the leadership on what ontologies are being developed!?
>
> Let me ask the project manager ;-)
>
> Matt:
> Who is in charge of determining what community ontologies are needed
> for SEEK? A measurement ontology is a nice case study, but what do the
> scientists need to get more science done?
>
> Deana: I haven't seen what you sent earlier. Can you (re-?)send to
> SEEK-kr-sms and we can see what can be done about it?
>
> cheers
>
> Bertram
>
> Deana D. Pennington writes:
> >
> > This sounds simple, but only 2 weeks ago I put together some biomass
> > terms in a hierarchical way that I thought made sense, and Shawn
> > completely changed the way it was organized (you'll have to get the
> > details from him... it was all I could do to follow his logic :-)
> > So, the problem is that the simple examples never seem to fit when
> > you sit down to do these things. Perhaps once we get a fairly good
> > set of basic concepts into the ontology, it'll be this simple. I'm
> > hoping that we will work through some examples in Davis, because no,
> > BEAM (that would be me) is not constructing ontologies on their (my)
> > own. I looked at Rich's ontologies this summer, and played with some
> > concept maps for the biodiversity case, but I think we need to be
> > much more explicit about what is needed. I could (and have) spent a
> > good deal of time coming up with things that aren't particularly
> > useful. The only meetings that BEAM is now having are in conjunction
> > with other teams (like the KR/SMS/BEAM meeting in Davis), so if you
> > want ecologists to develop these, you need to work it into the
> > agenda. If we get to the point where I have a good sense of exactly
> > what ontologies are needed, then I would be happy to recruit some
> > ecologists for a working meeting focused on ontology generation, as
> > long as there is someone there who can ensure that what we come up
> > with fits the formal requirements.
> >
> > Deana
> >
> >
> >
> > Quoting Bertram Ludaescher <ludaesch at ucdavis.edu>:
> >
> > >
> > > Hi Deana et al:
> > >
> > > I guess it's a good time to chime in now, including to throw in
> > > some thoughts from a KE and SEEK SMS (and KR) point of view =B-)
> > >
> > > First, the intuitive ontology tool that Deana asked about is
> > > already there (although Shawn and I haven't done a particularly
> > > good job of making it available yet). It's the goose feather,
> > > oops, I meant the Sparrow language ;-)
> > >
> > > Here is what we had in mind: in our context, creating
> > > domain-specific ontologies is primarily about defining controlled
> > > vocabularies. We might use fancy GUIs for it, or just the keyboard
> > > to type in the controlled vocabulary terms:
> > >
> > > measurement.
> > > species_abundance.
> > > biodiversity_index.
> > > biomass_measurement.
> > >
> > > OK, you can see how a keyboard is useful for entering controlled
> > > vocabulary terms (unlike, e.g., a mouse). From this to a "formal"
> > > ontology (at least in the sense used earlier in this thread) it is
> > > a small step. E.g., you might want to say that
> > >
> > > "biomass measurements are measurements"
> > >
> > > Then we should be able to say just that. In fact, that's almost
> > > valid Sparrow syntax; to be exact, we would write (note the small
> > > differences):
> > >
> > > biomass_measurement ISA measurement.
> > >
> > > This is human readable, and easy to enter, edit, exchange, and
> > > parse as well. Oh, and it can be translated easily into OWL so
> > > that other tools can work with it.
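> > > (Aside: the "easy to parse" bit is no exaggeration -- a toy parser
> > > for just the ISA fragment above fits in a few lines of Python.
> > > This is only a sketch of an assumed grammar, not the real Sparrow
> > > implementation:)

```python
import re

# Toy parser for the tiny Sparrow fragment used above (an assumed
# grammar, not the real Sparrow one):
#   "<term>."                declares a controlled-vocabulary term
#   "<child> ISA <parent>."  declares a subclass relationship
TERM = re.compile(r"^(\w+)\.$")
ISA = re.compile(r"^(\w+)\s+ISA\s+(\w+)\.$")

def parse_sparrow(text):
    """Return (terms, isa_pairs) found in a block of statements."""
    terms, isa = set(), []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = ISA.match(line)
        if m:
            child, parent = m.groups()
            isa.append((child, parent))
            terms.update((child, parent))
        else:
            m = TERM.match(line)
            if m:
                terms.add(m.group(1))
    return terms, isa

terms, isa = parse_sparrow("measurement.\nbiomass_measurement ISA measurement.")
# isa -> [("biomass_measurement", "measurement")]
```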
> > >
> > > A slightly more complex example is:
> > >
> > > "biomass measurements are measurements that only measure biomass"
> > >
> > > In Sparrow (goose feather ;-) syntax we would have to slightly
> > > tweak this into the following form:
> > >
> > > biomass_measurement ISA measurement AND item_measured ONLY biomass.
> > >
> > > So what Shawn meant earlier about "formal ontologies" is this
> > > level of refinement/sophistication that we are after. The good
> > > thing about the notation above (as opposed to just node-and-arrow
> > > diagrams) is that these statements can be translated into OWL (or
> > > directly into first-order logic) and provide some constraints
> > > about how a biomass measurement relates to measurements in
> > > general, what is being measured, etc.
> > >
> > > We also need to keep in mind, as was mentioned earlier, that
> > > ontology creation is not what the typical end user of Kepler
> > > should be doing. Ontologies are meant to be created by the
> > > *scientific community* (in this case ecologists) using specialized
> > > tools, and this process is often done in concert with the
> > > (in)famous KE/KR/CS types, who can provide additional tools to
> > > check the consistency of the ontologies so defined, visualize the
> > > inferred class hierarchy, etc.
> > >
> > > But the (passed around) buck does stop there: at the willingness
> > > of the community to actually come up with controlled vocabularies
> > > and simple, but somewhat formal (in the above sense), ontologies.
> > > So I think we do need to create more of those ontologies. Isn't
> > > BEAM doing that? Or who else? Someone should... we can certainly
> > > help.
> > >
> > > Let's also see how Kepler and ontologies and all that good stuff
> > > relate (some of this was suggested before, e.g. by Laura):
> > >
> > > - Kepler is NOT the primary tool to develop, change, or update
> > > ontologies. For that there is GrOWL, Protege, and yes, Sparrow
> > > (it's built into any OS... we just need to send you the simple
> > > grammar again)
> > >
> > > - Kepler should have mechanisms to *annotate* actors (as a whole),
> > > their ports, and datasets. This should normally NOT require
> > > changes to the ontology. Rather, one would simply *select* concept
> > > names from one of possibly several pre-defined community
> > > ontologies (importing different ontologies into Kepler should be
> > > easy)
> > >
> > > - For expert users only: if you feel that you need to define a new
> > > concept on top of an existing ontology, you should be able to do
> > > that from Kepler as well. Clearly, you are not updating the
> > > community ontology (there is a "protocol" for the latter; one
> > > needs endorsement from a consortium such as "Eco-GO" ;-) but
> > > rather you are adding to your "private ontology extensions", much
> > > like you would add new words to your private dictionary in Word
> > > ("Add foobar to your private ontology (y/n)?"). My earlier
> > > suggestion would be to be able to define a new concept in this
> > > way, e.g.
> > >
> > > my_measurement ISA biomass_measurement AND hasLocation davis.
> > >
> > > Then my_measurement and davis might be new concepts which are
> > > added to my personal ontology (along with the Sparrow
> > > statement/axiom above), whereas the other terms come from the
> > > existing ontology.
> > >
> > >
> > > OK, so much for now.
> > >
> > > cheers and laters
> > >
> > > Bertram
> > >
> > > Deana D. Pennington writes:
> > > >
> > > > Shawn,
> > > >
> > > > I think we can use our own experiences to clarify what an
> > > > ecologist might or might not be able to do, at least in the near
> > > > term. In the few times that I've tried to organize a set of
> > > > terms into an "ontology" for you, and in the times I've watched
> > > > the postdocs/faculty try to do it, none of us has ever given you
> > > > anything remotely close to what you needed. That's definitely a
> > > > concern, if we're going to have the ecologists do it themselves.
> > > > I honestly don't think you're going to get what you want from
> > > > most ecologists without substantial training, and I think the
> > > > likelihood of them going after such training is small. It's
> > > > telling that many of the people at this last workshop took
> > > > offense when I suggested that it would be a good idea for them
> > > > to learn how to program... and the usefulness of that should
> > > > have been much more obvious to them than the usefulness of
> > > > creating ontologies. I think we are a long way from having a
> > > > community of ecologists who have the skills or desire to invest
> > > > considerable effort in learning how to do this. Perhaps
> > > > eventually we will develop a community of ecoinformatics people
> > > > who are more on the domain side than the IT side, who can learn
> > > > how to do this and work at the interface. For the short term, I
> > > > don't see any way around having a "knowledge engineer" work with
> > > > the ecologists. But I reserve the right to change my mind when
> > > > you demonstrate an ontology tool that is, in fact, easy to use
> > > > for a domain person :-)
> > > >
> > > > Deana
> > > >
> > > >
> > > > Quoting Shawn Bowers <sbowers at ucdavis.edu>:
> > > >
> > > > >
> > > > > Some comments:
> > > > >
> > > > > Laura L. Downey wrote:
> > > > > >> Shawn writes:
> > > > > >> I think that this is (or at least was) exactly one of the
> > > > > >> "missions" in SEEK: to get scientists involved in creating
> > > > > >> and using *formal* ontologies.
> > > > > >
> > > > > > Using formal ontologies, yes. I have definitely seen some
> > > > > > excitement when semantic mediation has been talked about in
> > > > > > a way that will make their jobs easier -- of finding other
> > > > > > data sets they would not otherwise have found, of
> > > > > > identifying actors that would be useful to them that they
> > > > > > otherwise might not have identified, etc. And yes, creating
> > > > > > the ontologies themselves too, because they know their
> > > > > > domains better than we do, but formally specifying them so
> > > > > > that machines can make use of them? I'm not so sure about
> > > > > > that from what I've seen. But again, remember I'm new to the
> > > > > > project, so I'm bringing an outsider perspective and maybe
> > > > > > one that needs to be more informed.
> > > > >
> > > > > I think "formally specifying ontologies" is a loaded phrase...
> > > > > it is being used to refer to the languages (such as OWL) and
> > > > > tools (such as Protege) that have known deficiencies not only
> > > > > for "domain scientists" but also in general for capturing
> > > > > knowledge. OWL is a W3C specification that is based on XML and
> > > > > is overly verbose (being expressed in XML) and often misused.
> > > > > It is really just an interchange format, and not really a
> > > > > language unto itself (it's meant to encompass many languages
> > > > > so as to be a good middle ground for tools that use disparate
> > > > > languages). Protege is a tool that is still young and is just
> > > > > starting to be more widely used. It is, however, in many ways
> > > > > still designed for a very small, highly technical user group.
> > > > >
> > > > > Ontology tools should present a sound and intuitive user
> > > > > model (i.e., the conceptual constructs used to create
> > > > > ontologies), shielding the user from the underlying
> > > > > interchange format. Most tools that are out there essentially
> > > > > present a low-level graphical version of the language, not of
> > > > > these higher-level conceptual constructs. A counterexample is
> > > > > CMAP; however, its model in my opinion is too unconstrained,
> > > > > and offers little support in terms of helping users to create
> > > > > well-designed and "consistent" ontologies.
> > > > >
> > > > > I also think this notion that a domain scientist will
> > > > > "informally" construct an ontology and then pass it off to a
> > > > > "knowledge engineer" to "make it formal" is (a) not a scalable
> > > > > solution, (b) "passes the buck" to an unknown entity (i.e.,
> > > > > the non-existent "knowledge engineers"), and (c) in general,
> > > > > not always a sound approach. (I'm not picking on you here,
> > > > > Laura -- these are just some observations, and I'm trying to
> > > > > stimulate some discussion as to what the approach should be
> > > > > for SEEK.)
> > > > >
> > > > > I think in SEEK, this notion of a knowledge engineer has been
> > > > > used in place of providing useful tools to our users. If
> > > > > anything, the "knowledge engineer" should be built into the
> > > > > tool -- which is starting to emerge in some other tools,
> > > > > including Protege.
> > > > >
> > > > > I think that the challenge in defining a "formal ontology"
> > > > > for a particular domain is that as a user: (1) you need to
> > > > > have a clear understanding of the domain, the concepts, and
> > > > > their definitions (very challenging in general), and (2) you
> > > > > need to understand how to represent this information in the
> > > > > knowledge representation language/tool. If a domain scientist
> > > > > gives the knowledge engineer the first item (1), then the
> > > > > scientist could just as well have input the information into a
> > > > > well-designed ontology tool. If the knowledge engineer is
> > > > > given only a vague and imprecise description of (1), then the
> > > > > knowledge engineer has no chance of doing (2). My argument is
> > > > > that to "create ways for regular users to provide the
> > > > > appropriate input to the knowledge engineers so that items are
> > > > > formally specified" essentially means that the "regular users"
> > > > > have already specified the ontology -- and they don't need the
> > > > > KE (of course, this could be an iterative process, where the
> > > > > KE "holds the hand" of the scientist through the process --
> > > > > which again is not going to scale and is probably not that
> > > > > practical).
> > > > >
> > > > > Of course, not only do we want to make (2) easy, we also want
> > > > > tools to help scientists/users get to (1). I think there are
> > > > > lots of ways to help users get to (1), e.g., by:
> > > > >
> > > > > - describing a process/methodology, as in object-oriented
> > > > > analysis and design, that can help one go from a fuzzy
> > > > > conceptualization to a clearer model (we want to target
> > > > > scientists, however, instead of software designers/developers)
> > > > >
> > > > > - providing tools to help people "sketch" out their ideas
> > > > > before committing to an ontology language (but making it
> > > > > explicit that they are doing the "sketch" as part of a
> > > > > process)... e.g., by allowing some free-text definitions mixed
> > > > > with class and property defs, etc. Essentially, provide a tool
> > > > > that can help someone go from informal/unclear to formal/clear.
> > > > >
> > > > > - adopting some known approaches for "cleaning up" an ontology
> > > > > (similar to OntoClean, e.g.)
> > > > >
> > > > > - providing tools that can identify inconsistencies and
> > > > > possible "pitfalls" in the ontology (useful for getting to a
> > > > > clearer, more formal model)
> > > > >
> > > > > - providing lots of examples of "well-defined" ontologies
> > > > >
> > > > > - letting people edit and reuse existing well-formed
> > > > > ontologies (in fact, I think that once we have a basic
> > > > > framework, this will be the typical model of interaction for
> > > > > many scientists...)
> > > > >
> > > > >
> > > > > In terms of "machine understandable ontologies", this really
> > > > > just means that the ontology is captured in one of these
> > > > > ontology languages, like OWL. It doesn't mean that a scientist
> > > > > should have to literally put their ontology into this language
> > > > > -- that is the job of the tool. Our goal should be to help
> > > > > users specify ontologies using "structured" approaches, that
> > > > > is, essentially in restricted languages that are not as
> > > > > ambiguous and not as unconstrained as natural language --
> > > > > which is typically done using graphical tools (box-and-line
> > > > > diagrams). Also, the user should be completely unaware that
> > > > > their definitions are being stored in these low-level
> > > > > languages, which is why the existing tools fail for domain
> > > > > scientists / non-computer-science folks.
> > > > >
> > > > >
> > > > > > Is the goal here to figure out a way to allow scientists
> > > > > > with no formal ontology experience to easily specify formal
> > > > > > ontologies in a way that machines can make use of them? That
> > > > > > seems like a daunting task to me -- and one that would
> > > > > > require considerable time and resources. Didn't I just read
> > > > > > from Mark (in the IRC convo) that the knowledge engineers
> > > > > > themselves have trouble with their own tools like Protégé?
> > > > > > Creating and specifying formal ontologies is a complex and
> > > > > > challenging job even for those trained in it.
> > > > > >
> > > > > > I agree that scientists understand their domains better than
> > > > > > others, but that doesn't mean they understand how to
> > > > > > formally represent that domain in a way that can be utilized
> > > > > > by a machine. They use their own experience, intuition, and
> > > > > > knowledge to create ontologies. They make decisions and
> > > > > > understand possible exceptions. But that is a different task
> > > > > > than formally specifying that ontology to a rigid set of
> > > > > > rules that can be utilized via machine processing. I'm
> > > > > > thinking that is still a task to be done by a trained
> > > > > > knowledge engineer.
> > > > > >
> > > > > > And if we create ways for regular users to provide the
> > > > > > appropriate input to the knowledge engineers so that items
> > > > > > are formally specified in such a way that the system can
> > > > > > make use of them to the benefit of the regular users, I
> > > > > > would see that as a definite win and a demonstration of the
> > > > > > power of semantic mediation to make scientists' jobs easier.
> > > > > >
> > > > > > Laura L. Downey
> > > > > > Senior Usability Engineer
> > > > > > LTER Network Office
> > > > > > Department of Biology, MSC03 2020
> > > > > > 1 University of New Mexico
> > > > > > Albuquerque, NM 87131-0001
> > > > > > 505.277.3157 phone
> > > > > > 505.277.2541 fax
> > > > > > ldowney at lternet.edu
> > > > > >
> > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > seek-kr-sms mailing list
> > > > > > seek-kr-sms at ecoinformatics.org
> > > > > > http://www.ecoinformatics.org/mailman/listinfo/seek-kr-sms
> > > > >
> > > > >
> > > >
> > >
> > >
> >
>
>
**************************
Dr. Deana D. Pennington
Long-term Ecological Research Network Office
UNM Biology Department
MSC03 2020
1 University of New Mexico
Albuquerque, NM 87131-0001
505-277-2595 (office)
505 272-7080 (fax)